Closed: renderist closed this issue 1 year ago
Yes, you are right. We will fix this in the next release. Thank you.
Just curious, is that function being called at all in your tests?
On Nov 7, 2022, at 8:51 PM, Haicheng Wu wrote:
Yes, you are right. We will fix this in the next release. Thank you.
No, this one is not needed in real kernels, so it is not tested.
Fixed in 2.11
Describe the bug: On line 454 of include/cutlass/arch/memory.h, the asm instruction ld.shared.v4.u32 looks like it should be st.shared.v4.u32. It appears to be a copy-and-paste error.
Steps/Code to reproduce bug: Visual inspection. The function is shared_store<16>(), so I would expect it to use a store instruction instead of a load instruction.
Additional context: Appears in the current public-facing GitHub repository.
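For illustration, here is a minimal sketch of what a 16-byte shared-memory store would be expected to look like when written with inline PTX. This is not the CUTLASS source: the helper name `shared_store_16`, its signature, and the `demo` kernel are hypothetical and only show the `st.shared.v4.u32` form that the report argues for.

```cuda
#include <cstdint>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not the CUTLASS implementation): writes 16 bytes to
// shared memory. A store uses st.shared and takes the data as inputs;
// ld.shared would instead read from the address into output registers.
__device__ inline void shared_store_16(uint32_t smem_addr, uint4 v) {
  asm volatile("st.shared.v4.u32 [%0], {%1, %2, %3, %4};\n"
               :
               : "r"(smem_addr), "r"(v.x), "r"(v.y), "r"(v.z), "r"(v.w)
               : "memory");
}

// Hypothetical test kernel: stores one uint4 through the helper, then reads
// it back through an ordinary shared-memory access.
__global__ void demo(uint4* out) {
  __shared__ uint4 buf[1];
  // Convert the generic pointer to a shared-state-space address.
  uint32_t addr = static_cast<uint32_t>(__cvta_generic_to_shared(buf));
  shared_store_16(addr, make_uint4(1u, 2u, 3u, 4u));
  __syncthreads();
  out[0] = buf[0];
}

int main() {
  uint4* d_out = nullptr;
  if (cudaMalloc(&d_out, sizeof(uint4)) != cudaSuccess) return 1;
  demo<<<1, 1>>>(d_out);
  uint4 h_out = make_uint4(0u, 0u, 0u, 0u);
  if (cudaMemcpy(&h_out, d_out, sizeof(uint4), cudaMemcpyDeviceToHost) != cudaSuccess) return 1;
  // Print the values read back so the store can be checked by eye.
  printf("%u %u %u %u\n", h_out.x, h_out.y, h_out.z, h_out.w);
  cudaFree(d_out);
  return 0;
}
```

A round-trip check like this (store through the helper, read back normally) is also the kind of test that would have caught the load/store mix-up, since the function is reportedly not exercised by real kernels.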