llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.61k stars 11.82k forks source link

Hardening checks should detect dereferencing std::list<T>::end() #80234

Open davidben opened 8 months ago

davidben commented 8 months ago

Like std::map in https://github.com/llvm/llvm-project/issues/80212#issuecomment-1920171595, dereferencing the end iterator of std::list seems to just read off the end of the std::list because the end node is embedded but doesn't have a value in it. See https://godbolt.org/z/11Mx8Yhh4

(CC @danakj)

davidben commented 8 months ago

Fixing this one may be tricky. Some ideas:

davidben commented 8 months ago

(std::forward_list also lacks checks, but that one's easy because end is just null.)

Edit: Scratch that. end() in std::forward_list is easy, but before_begin() has the same problem as here.

davidben commented 8 months ago

Thinking some more, probably the way to do it is like bounded iterators and make the iterator larger instead of the nodes. I.e. we make the iterator contain both a pointer and also a pointer to the end, and then refuse to operator++ and operator* when you're at the end.

That's probably the way to go, because the compiler can optimize that check out, assuming the iterator's whole lifetime can be inlined. (But bounded iterators assume the same thing.) Whereas anything around making the end node detectable is harder to optimize out because the compiler doesn't know the structure's invariants.

var-const commented 8 months ago

Thanks a lot for researching this! This was very informative to read. One curious thing I noticed is that it seems we can't check this without an ABI-breaking change. I'll have to think more about it, but just from reading, storing the extra information inside the iterator type seems like the preferable option.

Is this check of particular importance to Chromium or Google (i.e., is there a reason to prioritize it)?

davidben commented 8 months ago

Annoyingly, I no longer think sticking information in the iterator is feasible. See https://github.com/llvm/llvm-project/issues/80212#issuecomment-1930063010

As for importance, I was mostly filing bugs by looking at containers, so take this bug more as what happened to catch my eye than actual importance. :-)

I have no hard data, but my intuition would be that std::list is itself not very important but #80212 would actually be valuable. Calling find on a map is very very common, but libc++'s implementation strategy for maps means that forgetting to check against end() before dereferencing is an out-of-bounds read.

From there, I filed this bug about std::list just because it has the same structure. I.e. once we solve map, it's worth pondering how that applies to list, less from importance and more from similarity.