JuliaGPU / Adapt.jl


Revert "Use recursion to fix inference failure" #80

Closed · maleadt closed this 3 months ago

maleadt commented 3 months ago

Reverts JuliaGPU/Adapt.jl#78

Looks like this introduces inference crashes in CUDA.jl CI; from https://buildkite.com/julialang/gpuarrays-dot-jl/builds/814#018e13b6-4936-4fb9-8d88-4402694019e6:

      From worker 4:    Internal error: stack overflow in type inference of _adapt_tuple_structure(CUDA.KernelAdaptor, NTuple{6373, UInt64}).
      From worker 4:    This might be caused by recursion over very long tuples or argument lists.
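
For reference, the recursion in question is presumably the usual head/tail tuple pattern (a sketch reconstructed from the error message, not the exact code from #78):

```julia
using Adapt

# Head/tail recursion over a tuple: each element adds one level of
# recursion during type inference, which is what overflows on something
# like NTuple{6373, UInt64}.
_adapt_tuple_structure(to, xs::Tuple{}) = ()
_adapt_tuple_structure(to, xs::Tuple) =
    (adapt(to, first(xs)), _adapt_tuple_structure(to, Base.tail(xs))...)
```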

cc @charleskawczynski

codecov[bot] commented 3 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 93.65%. Comparing base (3d7097a) to head (79a98f2).

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master      #80      +/-   ##
==========================================
- Coverage   94.02%   93.65%   -0.38%
==========================================
  Files           6        6
  Lines          67       63       -4
==========================================
- Hits           63       59       -4
  Misses          4        4
```


charleskawczynski commented 3 months ago

It looks like the failure was due to an aqua ambiguity, not an inference failure; can't we fix that? cc @maleadt

maleadt commented 3 months ago

The aqua failure is unrelated. It's the inference failures that are problematic.

charleskawczynski commented 3 months ago

Ah, I didn't see that. Sheesh: `Internal error: stack overflow in type inference of _adapt_tuple_structure(CUDA.KernelAdaptor, NTuple{7708, UInt64})`. That seems awfully large; is that correct?

If so, is there a middle ground that we could settle on? Maybe we can specialize on small tuples?
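
For example (a hedged sketch; the `_adapt_recurse` helper and the cutoff of 32 are made up for illustration, not proposed values):

```julia
using Adapt

# Hypothetical middle ground: recurse only over short tuples, where the
# recursion helps inference, and fall back to a plain map for long ones,
# where it would overflow the inference stack.
function _adapt_tuple_structure(to, xs::Tuple)
    if length(xs) <= 32          # tuple length is part of the type, so
        _adapt_recurse(to, xs)   # this branch constant-folds in inference
    else
        map(x -> adapt(to, x), xs)
    end
end
_adapt_recurse(to, xs::Tuple{}) = ()
_adapt_recurse(to, xs::Tuple) =
    (adapt(to, first(xs)), _adapt_recurse(to, Base.tail(xs))...)
```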

maleadt commented 3 months ago

Yeah, those large tuples are used to test for parameter space exhaustion: https://github.com/JuliaGPU/CUDA.jl/blob/cb14a637e0b7b7be9ae01005ea9bdcf79b320189/test/core/execution.jl#L622-L625

In any case, it would be good to add a limit based on the length of the tuple: anything significantly large should probably fall back to the current implementation. Or maybe use ntuple (why doesn't that suffice to avoid the inference problem in the first place?).
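
Something along these lines, perhaps (an untested sketch of how the method inside Adapt.jl might look, assuming ntuple with a Val length unrolls without deep recursion):

```julia
using Adapt

# ntuple-based alternative: the length is known from the tuple type, so
# Val(length(xs)) is a compile-time constant and ntuple can unroll the
# element-wise adapt calls without head/tail recursion.
Adapt.adapt_structure(to, xs::Tuple) =
    ntuple(i -> adapt(to, xs[i]), Val(length(xs)))
```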