AnyDSL / thorin

The Higher-Order Intermediate Representation
https://anydsl.github.io
GNU Lesser General Public License v3.0

JIT: Multiple `compile` of different code crashes thorin #134

Status: Closed (closed by PearCoding 1 year ago)

PearCoding commented 1 year ago

Compiling two or more separate artic sources in JIT sometimes triggers the following assert:

src/thorin/util/hash.h:421: void thorin::detail::HashTable<Key, T, H, StackCapacity>::rehash(size_t) [with Key = const thorin::Def*; T = void; H = thorin::World::SeaHash; long unsigned int StackCapacity = 4; size_t = long unsigned int]: Assertion `is_power_of_2(new_capacity)' failed.

The rehash function is called with new_capacity=0. The full stack trace is attached as stacktrace.txt.
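To make clear why a value of 0 trips the assertion, here is a minimal sketch of a power-of-two check. The names `is_power_of_2_sketch`, `grown_capacity_sketch` and `check_rehash_precondition_sketch` are hypothetical and are not thorin's actual code; the real helper in `hash.h` may be written differently. The doubling policy is also only an assumption used for illustration, and it says nothing about where the zero actually came from in this issue.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a power-of-two test (not thorin's real helper).
// A power of two has exactly one bit set; zero has no bits set, so it fails.
constexpr bool is_power_of_2_sketch(std::size_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// Assumed growth policy for illustration only: double the current capacity.
// If the current capacity is already 0 (e.g. the table is in an invalid
// state), doubling yields 0 again.
constexpr std::size_t grown_capacity_sketch(std::size_t capacity) {
    return capacity * 2;
}

// Mirrors the shape of the failing precondition: aborts if the requested
// capacity is not a power of two.
inline void check_rehash_precondition_sketch(std::size_t new_capacity) {
    assert(is_power_of_2_sketch(new_capacity));
}

static_assert(is_power_of_2_sketch(4), "4 is a power of two");
static_assert(!is_power_of_2_sketch(0), "0 is not a power of two");
static_assert(grown_capacity_sketch(0) == 0, "doubling 0 stays 0");
```

Under these assumptions, calling `check_rehash_precondition_sketch(0)` would abort in a debug build in the same way as the reported assert, while `check_rehash_precondition_sketch(8)` passes.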

If the whole process is run a second time, the error no longer appears. This is because the cache ensures that the first source is NOT compiled again but only loaded. The error is not triggered for every combination of sources, but with the cache disabled it can be reproduced reliably with the same combination.

I assume a static variable is interfering with something rather than this being an out-of-memory issue: memory usage is below 150 MB when the assert fires, far below the 64 GB available on my machine.

Note 1: Both sources make use of CUDA kernels.
Note 2: The first source is significantly smaller than the second.
Note 3: Both sources make use of C functions outside the CUDA kernels.
Note 4: The JIT process is NOT run in multiple threads, as the whole compile process is not thread safe.
Note 5: The issue might be related to artic or the runtime, but the assert was triggered in thorin, which is why it is posted here.

I will try to isolate the issue and construct a small standalone example, as it is currently triggered inside the Ignis framework.

PearCoding commented 1 year ago

This is indeed an Artic bug and will be fixed with the above PR.

PearCoding commented 1 year ago

Closed with AnyDSL/artic#16