Closed jacobhinkle closed 2 months ago
The failure in PipelineTestStagedReduction.StagedReduction
is a duplicate of #2257. It only happens with -np 1,
not with multiple GPUs, so CI hasn't complained about it.
What's the problem with PipelineTestStagedReduction.StagedReduction/Automatic?
duplicate of #2257
Ah, thanks! I somehow missed that before filing this issue. Yes, the Manual
failure is definitely the same as #2257. The Automatic
failure is a failure to segment: a segment containing a single expression, a SegmenterSet,
is proposed, and none of the schedulers accepts it.
**Segmenter** Considering fusion:
T3_g[ bS10{1}, iS11{8} ] (DeviceMesh{0})
= SegmenterSet( T1_g[ bdeviceIdx.x3{1}, iS4{8}, rS5{64} ] (DeviceMesh{0}) )
Scheduler _expr_eval_ ***rejected*** because : Fusion is resharding.
Scheduler _no_op_ ***rejected*** because : output has a concrete dimension
Scheduler _matmul_ ***rejected*** because : No matmul patterns were found
Scheduler _reduction_ ***rejected*** because : Fusion is resharding.
Scheduler _transpose_ ***rejected*** because : Fusion is resharding.
Scheduler _pointwise_ ***rejected*** because : Fusion is resharding.
Scheduler _inner_persistent_ ***rejected*** because : Fusion is resharding.
Scheduler _outer_persistent_ ***rejected*** because : Fusion is resharding.
Scheduler _inner_outer_persistent_ ***rejected*** because : Fusion is resharding.
unknown file: Failure
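The rejection log above can be read as a loop over candidate schedulers in which every candidate reports a reason for declining the segment. The following is a minimal sketch of that pattern in plain C++, not nvFuser's actual implementation: `SegmentInfo`, `Scheduler`, `makeSchedulers`, and `firstAccepting` are hypothetical names, and the acceptance rules model only the three reasons printed in the log.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Toy description of a proposed segment (hypothetical; the fields mirror
// only the rejection reasons that appear in the log).
struct SegmentInfo {
  bool is_resharding;
  bool has_concrete_output_dim;
  bool has_matmul_pattern;
};

// A scheduler returns a rejection reason, or std::nullopt if it accepts.
struct Scheduler {
  std::string name;
  std::function<std::optional<std::string>(const SegmentInfo&)> check;
};

std::optional<std::string> rejectIfResharding(const SegmentInfo& s) {
  if (s.is_resharding) return std::string("Fusion is resharding.");
  return std::nullopt;
}

// Builds the candidate list in the same order as the log output.
std::vector<Scheduler> makeSchedulers() {
  std::vector<Scheduler> v;
  v.push_back({"expr_eval", rejectIfResharding});
  v.push_back({"no_op", [](const SegmentInfo& s) -> std::optional<std::string> {
                 if (s.has_concrete_output_dim)
                   return std::string("output has a concrete dimension");
                 return std::nullopt;
               }});
  v.push_back({"matmul", [](const SegmentInfo& s) -> std::optional<std::string> {
                 if (!s.has_matmul_pattern)
                   return std::string("No matmul patterns were found");
                 return std::nullopt;
               }});
  for (const char* name : {"reduction", "transpose", "pointwise",
                           "inner_persistent", "outer_persistent",
                           "inner_outer_persistent"}) {
    v.push_back({name, rejectIfResharding});
  }
  return v;
}

// Returns the name of the first scheduler that accepts the segment and
// records one line per rejection; std::nullopt means nobody accepted.
std::optional<std::string> firstAccepting(const std::vector<Scheduler>& schedulers,
                                          const SegmentInfo& seg,
                                          std::vector<std::string>& rejections) {
  for (const auto& s : schedulers) {
    if (auto reason = s.check(seg)) {
      rejections.push_back(s.name + " rejected because: " + *reason);
    } else {
      return s.name;
    }
  }
  return std::nullopt;
}

// Replays the logged segment: resharding, concrete output dim, no matmul.
bool loggedSegmentIsSchedulable() {
  std::vector<std::string> rejections;
  SegmentInfo seg{true, true, false};
  bool ok = firstAccepting(makeSchedulers(), seg, rejections).has_value();
  // If nothing accepted, every candidate must have logged a rejection.
  assert(ok || rejections.size() == makeSchedulers().size());
  return ok;
}
```

In this toy model, a resharding segment whose output has a concrete dimension is declined by all nine candidates, which corresponds to the "failure to segment" described above.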
Currently the test
PipelineTestStagedReduction.StagedReduction
is failing. One case fails to compile the generated kernel:
This fails with
I think this is the signature we're targeting in this case, but we have a
volatile float
for the first argument: https://github.com/NVIDIA/Fuser/blob/6dba9a837deb14b82bb87db8c8e2a07fb02cad60/runtime/block_reduction.cu#L163-L177
The other case,
PipelineTestStagedReduction.StagedReduction/Automatic
fails to segment this fusion:
That seems like a separate issue but might indicate a common issue with this test.
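As an illustration of how a volatile float first argument can fail to match a templated signature, here is a small host-side C++ sketch. It is not the code from `block_reduction.cu`: `reduceInto`, `CanReduce`, and `Add` are hypothetical names, and the real compiler error is not shown in this thread. The sketch assumes a signature in which one template parameter `T` is deduced from both the output reference and the input value.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a reduction helper: T is deduced from both
// the output reference and the input value, so the two must agree.
template <typename T, typename Func>
void reduceInto(T& out, const T& in, Func op) {
  out = op(out, in);
}

// Detection idiom: value is true when reduceInto(out, in, op) would compile.
template <typename Out, typename In, typename Func, typename = void>
struct CanReduce : std::false_type {};

template <typename Out, typename In, typename Func>
struct CanReduce<Out, In, Func,
                 std::void_t<decltype(reduceInto(std::declval<Out&>(),
                                                 std::declval<const In&>(),
                                                 std::declval<Func>()))>>
    : std::true_type {};

struct Add {
  float operator()(float a, float b) const { return a + b; }
};

// Matching types: T is deduced as float from both arguments.
static_assert(CanReduce<float, float, Add>::value,
              "float output with float input should match");

// Mismatch: T would be 'volatile float' from the first argument and
// 'float' from the second, so template deduction fails.
static_assert(!CanReduce<volatile float, float, Add>::value,
              "volatile float output should not match a float input");

// Runtime use of the matching case.
inline float reduceTwo(float a, float b) {
  float acc = a;
  reduceInto(acc, b, Add{});
  return acc;
}

inline void selfCheck() { assert(reduceTwo(1.0f, 2.0f) == 3.0f); }
```

Whether the generated kernel fails for this reason or for a related one (for example, a non-volatile reference parameter refusing to bind to a volatile object) would need to be confirmed against the actual error in #2257.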