Open delverOne25 opened 1 month ago
Hi @delverOne25, thanks for your bug report. I'll check whether the problem still exists in 2.0beta1.
Regardless of the analysis results, I strongly advise against allocating local arrays inside kernels, because they have a significant negative impact on runtime performance. Instead, consider a shared-memory approach in which multiple threads swap values using synchronization primitives such as locks or atomic operations. Another alternative is LocalMemory.Allocate, which enforces proper thread-local memory management and may help work around this compilation issue. Note, however, that this might not completely eliminate the performance penalty.
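For illustration, here is a minimal sketch of the LocalMemory.Allocate variant. It is not taken from the reported code: the kernel name, the scratch size, and the swap are all hypothetical, and it assumes the ILGPU 1.x API surface (Context.CreateDefault, LoadAutoGroupedStreamKernel, Allocate1D) and that the extent passed to LocalMemory.Allocate is a compile-time constant.

```csharp
using ILGPU;
using ILGPU.Runtime;

static class LocalMemorySketch
{
    // Hypothetical scratch size; assumed to be a compile-time constant.
    const int ScratchSize = 4;

    // Hypothetical kernel: fills a per-thread scratch buffer, swaps two
    // entries, and writes one value back to global memory.
    static void SwapKernel(Index1D index, ArrayView<int> output)
    {
        // Thread-local buffer requested through LocalMemory.Allocate
        // instead of `new int[ScratchSize]`.
        var scratch = LocalMemory.Allocate<int>(ScratchSize);

        for (int i = 0; i < ScratchSize; i++)
            scratch[i] = index.X + i;

        // Swap the first and last element within this thread only.
        int tmp = scratch[0];
        scratch[0] = scratch[ScratchSize - 1];
        scratch[ScratchSize - 1] = tmp;

        output[index] = scratch[0];
    }

    static void Main()
    {
        using var context = Context.CreateDefault();
        using var accelerator = context
            .GetPreferredDevice(preferCPU: false)
            .CreateAccelerator(context);

        var kernel = accelerator
            .LoadAutoGroupedStreamKernel<Index1D, ArrayView<int>>(SwapKernel);

        using var buffer = accelerator.Allocate1D<int>(64);
        kernel((int)buffer.Length, buffer.View);
        accelerator.Synchronize();

        // Copy the results back to the host for inspection.
        int[] result = buffer.GetAsArray1D();
    }
}
```

Whether this sidesteps the internal compiler error depends on the kernel that triggered it, so treat it as something to try rather than a confirmed workaround.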
Describe the bug
NVIDIA GeForce RTX 4060 [Type: Cuda, WarpSize: 32, MaxNumThreadsPerGroup: 1024, MemorySize: 8585216000]
Unhandled exception. ILGPU.InternalCompilerException: An internal compiler error has been detected
 ---> System.Collections.Generic.KeyNotFoundException: The given key 'arith.bin.Shl_2540: index_168026, const_2539 [None]' was not present in the dictionary.
   at System.Collections.Generic.Dictionary`2.get_Item(TKey key)
   at ILGPU.IR.Analyses.AnalysisValueMapping`1.get_Item(Value key)
   at ILGPU.IR.Analyses.ValueFixPointAnalysis`2.ValueAnalysisContext.get_Item(Value valueNode)
   at ILGPU.IR.Analyses.ValueFixPointAnalysis`2.GenericValue[TContext](AnalysisValue`1 source, Value value, TContext context)
   at ILGPU.IR.Analyses.ValueFixPointAnalysis`2.Merge[TContext](AnalysisValue`1& source, Value value, TContext context)
   at ILGPU.IR.Analyses.ValueFixPointAnalysis`2.Update[TContext](Value node, TContext context)
   at ILGPU.IR.Analyses.ValueFixPointAnalysis`2.<Analyze>g__ProcessBlock|9_0[TOrder,TBlockDirection](BasicBlock block, <>c__DisplayClass9_0`2&)
   at ILGPU.IR.Analyses.ValueFixPointAnalysis`2.Analyze[TOrder,TBlockDirection](BasicBlockCollection`2& blocks, AnalysisValueMapping`1 valueMapping, AnalysisReturnValueMapping`1 returnMapping)
Environment