aws / aws-ofi-nccl

This is a plugin which lets EC2 developers use libfabric as network provider while running NCCL applications.
Apache License 2.0
140 stars 54 forks source link

configure: remove builtin check #453

Closed aws-nslick closed 3 months ago

aws-nslick commented 3 months ago

this was previously broken under eg: clang-19 with the following error:

> 1 warning generated.
> configure:10893: checking if __builtin_expect is available
> configure:10909: /usr/bin/clang-19 -o conftest conftest.c -lm -lpthread -ldl  >&5
> conftest.c:36:7: error: call to undeclared library function 'exit'
> with type 'void (int) __attribute__((noreturn))'; ISO C99 and later do
> not support implicit function declarations
> [-Wimplicit-function-declaration]
>    36 |       exit(1)
>       |       ^
> conftest.c:36:7: note: include the header <stdlib.h> or explicitly
provide a declaration for 'exit'

And
> conftest.c:35:5: warning: ignoring return value of function declared
> with const attribute [-Wunused-value]
>    35 |     __builtin_expect(0, 0),
>       |     ^~~~~~~~~~~~~~~~ ~~~~

stop checking it in configure, and let the compiler fail if it's going to. There is no realistic compilers on any modern system where our usage of a compiler builtin actually fails the build, but there are clearly realistic compilers where our efforts to catch this in autotools prevents the build from being successful.

also re-enable clang-19 builds in gh actions.

bwbarrett commented 3 months ago

Thinking about this overnight, I think I'd just remove the builtin checks entirely. While I disagree with the statements that we should just let the build fail, I agree that there's no compiler that is going to be used to compile this that doesn't have these builtins. So just removing the checks seems reasonable to me.

aws-nslick commented 3 months ago

Thinking about this overnight, I think I'd just remove the builtin checks entirely. While I disagree with the statements that we should just let the build fail, I agree that there's no compiler that is going to be used to compile this that doesn't have these builtins. So just removing the checks seems reasonable to me.

This was my preference initially, but I spent the time debugging it to fix it, so let's just go forward with fix.

edit: I re-reviewed the case we discussed in person CHECK_GCC_BUILTIN([__builtin_nonexistent]) at the request of eraut@ and saw that the previous revision didn't actually address that like I thought it did, so we're back at square one, and I've reset this branch to the initial removal commit.

Thanks.