Immediate-Mode-UI / Nuklear

A single-header ANSI C immediate mode cross-platform GUI library
https://immediate-mode-ui.github.io/Nuklear/doc/index.html
Other
9.08k stars 544 forks source link

Macros NK_OFFSETOF, NK_CONTAINER_OF and NK_ALIGNOF rely on undefined behavior #94

Open rivten opened 4 years ago

rivten commented 4 years ago

When compiling with clang-9 with the option -fsanitize=undefined the basic nuklear example, I get the runtime errors :

../code/nuklear.h:19196:30: runtime error: member access within null pointer of type 'union nk_page_data' ../code/nuklear.h:19197:34: runtime error: member access within null pointer of type 'struct nk_page_element'

Indeed, the macros NK_OFFSETOF, NK_CONTAINER_OF, NK_ALIGNOF all dereference a null pointer. One possibility might be to have a NK_NO_UB flag or something so that people who want to catch UB errors don't get them when using nuklear.

exchg commented 4 years ago

Indeed, the macros NK_OFFSETOF, NK_CONTAINER_OF, NK_ALIGNOF all dereference a null pointer.

Nope. Dereferencnig assumes that you will read or write some data under the pointer. Currently here is pointer arithmetic only.

rivten commented 4 years ago

Nope. Dereferencnig assumes that you will read or write some data under the pointer. Currently here is pointer arithmetic only.

Fair enough. Still, there is an issue somewhere. I tried to modify the macros locally to use offsetof in stddef.h and change some arithmetic, and then clang does not detect any issue anymore.

Here are my new macros :

#define NK_OFFSETOF(st,m) ((nk_ptr)offsetof(st,m)) #define NK_CONTAINER_OF(ptr,type,member) (type*)((void*)((char*)(1 ? (ptr) : (ptr) - NK_OFFSETOF(type,member) (not completely sure about this one) #define NK_ALIGNOF(t) NK_OFFSETOF(struct {char c; t _h;}, _h)

With those, I don't get any runtime error. So either clang-9 is reporting false positive (but they report that false positive are unlikely), or either something is fishy with the code. In any case, other users might hit the same issues as I did, so it's probably good to be aware of that.

exchg commented 4 years ago

So either clang-9 is reporting false positive or either something is fishy with the code.

Both are possible

Common offsetof implementation looks like:

#define offsetof(st, m) \
    ((size_t)&(((st *)0)->m))

#define offsetof(st, m) \
    ((size_t)((char *)&((st *)0)->m - (char *)0))

#define offsetof(st, m) \
    __builtin_offsetof(st, m)

So i can assume that clang use __builtin_offsetof() and other "places" he treats as potential UB.

so it's probably good to be aware of that.

Definitely

dumblob commented 4 years ago

If it's how @exchg sketched out, then we can't do much about this. Shall we contact clang team to reconsider how they perform the sanitization?

exchg commented 4 years ago

@rivten This is canonical container_of implementation, can you check it with your clang compiler?

#define OFFSETOF(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)

#define CONTAINER_OF(ptr, type, member) ({            \       
 const typeof( ((type *)0)->member ) *__mptr = (ptr);    \    
 (type *)( (char *)__mptr - OFFSETOF(type,member) );})        

Current implementation is a little shady for me. I can't catch why ternary operator is here.

#define NK_CONTAINER_OF(ptr,type,member)\                                                  
    (type*)((void*)((char*)(1 ? (ptr) : &((type*)0)->member) - NK_OFFSETOF(type, member))) 
rivten commented 4 years ago

Hmmm for some reason, the CONTAINER_OF macro does not compile out-of-the-box for me on clang... It complained about a missing ')' at the location of the 0. I tried to fix it without success, so I can't test it right now. I'm not sure at all what's going on. Maybe if I've got time I'll try again today or this week.

Also I could try to setup a very minimal example public repo with everything prepared so you just have to build and run it to repro the error detection. Would that be valuable for you ?

exchg commented 4 years ago

Also I could try to setup a very minimal example public repo with everything prepared so you just have to build and run it to repro the error detection.

Yep, we can try this.

rivten commented 4 years ago

Sorry for answering late. I finally setup a quick repo here. Just a matter of git pull and sh build.sh on Linux should do it. With clang 6.0.0 I get the repro when running a.out.

Also, I discovered that the CONTAINEROF macro that @exchg does not seem to compile in pure C99, I had to changed the compilation flags to -std=gnu11 in order to have it compile (maybe because typeof is not in pure standard C ?), and even with the CONTAINEROF macro, I still have the repro :(

Sooo. At this point I'm really not sure. Since, on my side, I need to compile with sanitize flags and trap on, I just went with my own macro version there (that I posted in this thread) which seem to be working.

EDIT : also after a quick and dirty google search I found out something that might be related : https://stackoverflow.com/questions/10269685/kernels-container-of-any-way-to-make-it-iso-conforming

exchg commented 4 years ago

@rivten checked your repo and met no problem, clang-9.0.1. Maybe you can provide some additional info?

macro that @exchg does not seem to compile in pure C99,

Yep. You can try something like this one:

#define container_of(ptr, type, member) \
  ((type *)((char *)(ptr)-(char *)(&((type *)0)->member)))

This is simple version without type checking.

rivten commented 4 years ago

Hmm that's weird. Maybe I wasn't exactly clear : the issue happens when running the build, not when building it. Here's what I have.

hugo@hugo-S550CB:~/dev$ git clone https://github.com/rivten/nuklear_ub
Clonage dans 'nuklear_ub'...
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 5 (delta 0), reused 5 (delta 0), pack-reused 0
Dépaquetage des objets: 100% (5/5), fait.
hugo@hugo-S550CB:~/dev$ cd nuklear_ub/
hugo@hugo-S550CB:~/dev/nuklear_ub$ sh build.sh
hugo@hugo-S550CB:~/dev/nuklear_ub$ ./a.out 
nuklear.h:19183:30: runtime error: member access within null pointer of type 'union nk_page_data'
nuklear.h:19184:34: runtime error: member access within null pointer of type 'struct nk_page_element'
hugo@hugo-S550CB:~/dev/nuklear_ub$ clang --version
clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

In particular, the two runtime errors when running a.out.

I still get the runtime errors with your last container_of macro. This is starting to make me very curious also. I might try to have a look at the generated assembly when I've got some time.

rivten commented 4 years ago

Ok so I tried to boil this do to even a more minimal example. So you can find this minimal example in this gist.

Compiling and running gives no UB when using the __builtin_offsetof macro (but maybe this is specific for the clang compiler so I guess there should be a test on the compiler type) and a UB when using the "classic" offsetof macro.

After deeper investigation, it seems even people at Intel think the classic definition of offsetof macro causes undefined behavior

Even Facebook changed their code to stop using their own offsetof implementation and switched to __builtin_offsetof in their clang code.

I'm not sure what to make of it. For clang and GCC, __builtin_offset seems to be available. I have no idea about msvc.

exchg commented 4 years ago

Compiling and running gives no UB when using the __builtin_offsetof macro

Yep, i can confirm it.

but maybe this is specific for the clang compiler so I guess there should be a test on the compiler type

Maybe. Because gcc gives no warnings with NK_CONTAINER_OF (i hope it is correct usage of ub sanitizer):

gcc nuklear_ub.c -o nuklear_ub -fsanitize=undefined -lm
rivten commented 4 years ago

Yep. I just tested with gcc and did not repro either. At this point, I feel like the only thing to do is to mail the clang team with my gist and see what their comment is on this.

Hejsil commented 4 years ago

I'm gonna chip in here. After reading @rivten link from the intel folk it really seems like the C standard would define this as undefined behavior. It is therefor correct for the sanitizer to trigger here.

I looked through the Nuklear codebase and it already acknowledges this in the NK_ALIGNOF macro. I think this is something that needs to be solved in Nuklear, either using __builtin_offsetof if available or comming up with a portable alternative that does not rely on the &((char*)0)-> trick.

I might look into this myself at some point when I get the time if no one else want to.

rootkea commented 2 years ago

Has this issue been fixed by #309?

If yes then maybe we should close this issue.

RobLoach commented 5 months ago

When compiling with emscripten, I'm getting the following...

nuklear.h:8974:38: error: defining a type within '__builtin_offsetof' is a Clang extension [-Werror,-Wgnu-offsetof-extensions]
 8974 |     NK_STORAGE const nk_size align = NK_ALIGNOF(struct nk_command);
      |                                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~nuklear.h:5821:35: note: expanded from macro 'NK_ALIGNOF'
 5821 | #define NK_ALIGNOF(t) NK_OFFSETOF(struct {char c; t _h;}, _h)
      |                                   ^~~~~~nuklear.h:5804:47: note: expanded from macro 'NK_OFFSETOF'
 5804 | #define NK_OFFSETOF(st,m) (__builtin_offsetof(st,m))
      |                                               ^~
nuklear.h:19465:42: error: defining a type within '__builtin_offsetof' is a Clang extension [-Werror,-Wgnu-offsetof-extensions]
 19465 |         NK_STORAGE const nk_size align = NK_ALIGNOF(struct nk_page_element);
       |                                          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~nuklear.h:5821:35: note: expanded from macro 'NK_ALIGNOF'
 5821 | #define NK_ALIGNOF(t) NK_OFFSETOF(struct {char c; t _h;}, _h)
      |                                   ^~~~~~nuklear.h:5804:47: note: expanded from macro 'NK_OFFSETOF'
 5804 | #define NK_OFFSETOF(st,m) (__builtin_offsetof(st,m))
      |                                               ^~