Closed inducer closed 6 months ago
Ah OK. I've bumped down the version bound.
This had some truly mysterious failures, including (on i686):
lto1: out of memory allocating 65536 bytes after a total of 3825303552 bytes
make: *** [/tmp/_isl.cpython-38-i386-linux-gnu.so.vE1Riq.ltrans47.ltrans.o] Error 1
lto-wrapper: fatal error: make returned 2 exit status
compilation terminated.
/opt/rh/devtoolset-10/root/usr/libexec/gcc/i686-redhat-linux/10/ld: error: lto-wrapper failed
and (on macos-11
):
/Users/runner/work/islpy/islpy/src/wrapper/gen-wrap-part1.inc:55916:29: error: expected ')'
\7<A1>7<AC>䧊<C7><U+001F><8F>g|<U+0004><U+0007><F4>B{<F6>tlc<A9><F0><97>K<9D><AB><F3>m<F1><C0><BF><F5><8B><91><U+001B>/<F5><94>S<U+0008><8F>~<U+000C><D7>-<9C>[<BD><98>]>O<U+000F><98><C9>u셇
For that second one: I don't even. :thinking:
My wanting to eliminate LTO to help with the first failure led to a splitting of the ISL and wrapper compiles, as described in the nanobind cmake docs. @matthiasdiener I recall that this all was fairly finnicky to get working in the first place, and there were potentially significant performance impacts. Do you see any reason not to proceed with this?
LTO
and NOMINSIZE
did have a significant impact on the performance (see e.g. https://github.com/inducer/islpy/pull/116). Could we just disable LTO
for i686, and keep NOMINSIZE
enabled for all architectures?
@matthiasdiener This introduces more variables:
Previously, whatever we picked affected both. Now we can choose the desired build options for both more or less independently, so I am not sure the conclusions from #116 carry over neatly.
Testing it now
I see similar performance results as in #116 (M1, same microbenchmark as in #116):
LTO
and NOMINSIZE
): 57.1sLTO
and NOMINSIZE
): 45.6sLTO
and NOMINSIZE
gets us about halfway there: 50.5sLTO
, but with NOMINSIZE
): 49.4sA somewhat controversial suggestion: do we really need to build an i686 version on cibw?
Thanks for checking! It looks like there's no easy win here. I've disable i686 wheel builds, not a big loss.
I sure hope that mac failure doesn't come back.
Thank you for this fix ! Is a release to be expected soon ?
https://github.com/inducer/islpy/releases/tag/v2024.1. That should wind its way to the package index and conda forge before too long.
https://github.com/inducer/islpy/releases/tag/v2024.1. That should wind its way to the package index and conda forge before too long.
Amazing, thanks :)
That
is_arithmetic
struct seems to be working properly for enums since nanobind 1.3.0 https://github.com/wjakob/nanobind/commit/4bfba7472437e1077497b215f167e3a986b0fbc5 so the bound onnanobind>=2
doesn't seem necessary. Does it not work on older versions? (context: I'm mostly complaining because Arch is still at 1.9.2, but it should update in a few days, so no big worries)