StrataSource / vtex2

A VTF converter and editor
MIT License
42 stars 8 forks source link

fix: don't rely on lifetime UB in Image::load #38

Closed CounterPillow closed 10 months ago

CounterPillow commented 10 months ago

Calling load with path.string().c_str() assumes that the result of path.string() has not been freed by the time execution reaches the location where the result of c_str() is needed, as c_str()'s result's lifetime is tied to the lifetime of the string. The allocator has no reason to not free the string by then though, which results in undefined behaviour.

Fix this by placing the result of path.string() into a local variable, which is kept in memory until the stack frame is left.

craftablescience commented 10 months ago

I didnt think this was necessary and then I read the standard

JJL772 commented 10 months ago

@craftablescience Could you explain please? The standard says this about temporary object lifetimes:

All temporary objects are destroyed as the last step in evaluating the full-expression that (lexically) contains the point where they were created, and if multiple temporary objects were created, they are destroyed in the order opposite to the order of creation. This is true even if that evaluation ends in throwing an exception.

In this case, the full expression would be load(path.string().c_str()), no? It seems to behave as expected in godbolt: https://godbolt.org/z/zGfWYbfsd

craftablescience commented 10 months ago

nvm then thats what i thought looking at it at first i was reading the part where it said you have to bind a reference to a temporary to get it to live longer

CounterPillow commented 10 months ago

It's good that it works for you in godbolt but this fixes path becoming garbled for me on gcc 13.2.1. Looking for UB in random code isn't one of my hobbies, this is a real bug.

JJL772 commented 10 months ago

What command are you running to get garbled paths?

CounterPillow commented 10 months ago

./vtex2 convert image.png. Says it can't load the image.

On an up-to-date Arch Linux x86_64 machine.

This made me single-step through the code with gdb because I suspected stbi was being bad again but as it turns out it was innocent this time and gdb told me that path points to a non-readable memory region 0x20 in load, which this PR fixes.

JJL772 commented 10 months ago

I'm not able to reproduce this on Debian 13 with GCC 13.2.0 on x86_64, with either debug or release builds. What is the format of the PNG (i.e. 8-bit or 16-bit depth, dimensions, RGB or RGBA, etc.)?

CounterPillow commented 10 months ago

8-bit RGBA, non-interlaced. But it's not the PNG that's the problem since I already debugged this and it fails because the path C string is pointing to invalid memory.

JJL772 commented 10 months ago

Have you tried running this in debug, with either -O0 or -Og for optimization? This sounds like a compiler bug. I will try this with GCC 13.2.1 at some point soon.

CounterPillow commented 10 months ago

Okay I've tried it again from a fresh build with just -DCMAKE_BUILD_TYPE=RelWithDebInfo and I can no longer reproduce it. Some other people I've consulted also told me that sanitizers aren't being triggered by this, so this may have been a skill issue on my part. I do not understand how I've managed to make the path point to invalid memory, but I can no longer get it to do that.

JJL772 commented 10 months ago

Compilers are dark magic, so who knows TBH. If you were debugging a release build w/o debug info before, GDB could've been lying about where the address was located. That method gets inlined anyway, could make things more confusing.

I appreciate the PR nonetheless! I'll probably get clang-tidy set up in CI to catch cases of UB or other bad code when they do happen.

q66 commented 10 months ago

funnily, i always thought that C++ did not guarantee this, so i always wrote stuff in a way that any std::string as an explicit lvalue for as long as any .c_str() is being used (i'm probably going to continue doing that, as generally i like to make the intent obvious)

but after researching it a little, it seems it's indeed the case that no destructors for temporaries run until the whole outer expression has run...