Closed liqiangxl closed 4 weeks ago
Can be fixed by adding a function to check whether the val is a pointer.
auto getTypeOrIndexType = [](Val* value) {
  if (auto ti = dynamic_cast<kir::TensorIndex*>(value)) {
    if (isPointerType(ti->index()->dtype())) {
      return ti->index()->dtype();
    }
  }
  return value->dtype();
};
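To make the lookup easier to follow in isolation, here is a minimal, self-contained sketch of the same logic. The `PrimType`, `DataType`, `Val`, and `TensorIndex` types below are simplified hypothetical stand-ins, not the real IR classes, and `isPointerType` is reduced to a flag check. The sketch only shows the intended behavior: a tensor index whose address expression is a pointer reports the pointer type instead of the element type.

```cpp
#include <cassert>

// Hypothetical stand-ins for the IR types; the real classes are richer.
enum class PrimType { Bool, Float, Int };

struct DataType {
  PrimType base;
  bool is_pointer = false;  // assumption: pointer-ness modeled as a flag
};

// Simplified version of the pointer check used in the snippet above.
bool isPointerType(const DataType& dt) {
  return dt.is_pointer;
}

struct Val {
  explicit Val(DataType dt) : dtype_(dt) {}
  virtual ~Val() = default;  // polymorphic so dynamic_cast works
  DataType dtype() const { return dtype_; }

 private:
  DataType dtype_;
};

// Models a tensor index: carries the element dtype plus an index/address Val.
struct TensorIndex : Val {
  TensorIndex(DataType elem, Val* index) : Val(elem), index_(index) {}
  Val* index() const { return index_; }

 private:
  Val* index_;
};

// If the value is a tensor index whose index expression is a pointer,
// return the pointer type; otherwise fall back to the value's own dtype.
DataType getTypeOrIndexType(Val* value) {
  assert(value != nullptr);
  if (auto* ti = dynamic_cast<TensorIndex*>(value)) {
    if (isPointerType(ti->index()->dtype())) {
      return ti->index()->dtype();
    }
  }
  return value->dtype();
}
```

With these stand-ins, a bool tensor indexed through a pointer-typed address yields a pointer type, while a plain bool value still yields bool.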
After change:
asm volatile(
"{\n"
" .reg .pred p0; \n"
" setp.ne.b32 p0, %3, 0;\n"
" cp.async.ca.shared.global [%0], [%1], %2, p0;\n"
"}\n"
:
:"r"((uint32_t)((toSmem(T1) + i0))),
"l"(((T0.data + i0) + i1)),
"n"(4LL),
"r"((uint32_t)((!b3)))
);
Test DistributedTransformerTest.Backward/__bfloat has a bool-type tensor. When shared-memory persistence is used with async copy, it triggers a bug. A minimal reproducer is as follows:
The generated code is:
During lowering, the pointer to bool is processed as bool and converted to uint32_t.
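As a rough illustration of why this matters, the sketch below models what a 32-bit cast does to a 64-bit address. It assumes the mistyped operand is a 64-bit global address; the helper names `castAsBoolOperand` and `losesAddressBits` are hypothetical and exist only for this example.

```cpp
#include <cstdint>

// Hypothetical model of the cast applied to an operand that lowering
// believes is a bool: the value is narrowed to 32 bits.
constexpr uint32_t castAsBoolOperand(uint64_t operand_bits) {
  return static_cast<uint32_t>(operand_bits);
}

// Returns true when narrowing to 32 bits dropped part of the address.
constexpr bool losesAddressBits(uint64_t address) {
  return static_cast<uint64_t>(castAsBoolOperand(address)) != address;
}

// A typical 64-bit address does not survive the narrowing cast.
static_assert(losesAddressBits(0x00007f1234567890ULL),
              "upper address bits are dropped");
// A small value that fits in 32 bits is unaffected.
static_assert(!losesAddressBits(0x1000ULL), "small values round-trip");
```

Under this assumption, keeping the pointer type for the address operand avoids the narrowing cast, which matches the uncast `"l"` operand in the generated code after the change.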