weigao95 / surfelwarp

SurfelWarp: Efficient Non-Volumetric Dynamic Reconstruction
https://sites.google.com/view/surfelwarp/home
BSD 3-Clause "New" or "Revised" License

Compiling failed in the environment of (cuda11.3, arch_sm=86) #83

Open henry123-boy opened 1 year ago

henry123-boy commented 1 year ago

Hi! Thank you for your excellent work and code. I recently ran into a compilation problem on my server, whose environment is CUDA 11.3 with arch sm_86. The error is reported as follows: "ptxas /tmp/tmpxft_0006c5ba_00000000-6_block6x6_pcg_weber.ptx, line 4136; error : Instruction 'shfl' without '.sync' is not supported on .target sm_70 and higher from PTX ISA version 6.4" I hope to get a reply.

henry123-boy commented 1 year ago

Update: I have solved this problem by replacing

asm volatile (
        "{.reg .f32 r0;"
                ".reg .pred p;"
                "shfl.up.b32 r0|p, %1, %2, 0;"
                "@p add.f32 r0, r0, %1;"
                "mov.f32 %0, r0;}"
        : "=f"(result) : "f"(x), "r"(offset));

with

asm volatile (
        "{.reg .f32 r0;"
                ".reg .pred p;"
                "shfl.sync.up.b32 r0|p, %1, %2, 0, -1;"
                "@p add.f32 r0, r0, %1;"
                "mov.f32 %0, r0;}"
        : "=f"(result) : "f"(x), "r"(offset));

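The plain shfl instruction (without .sync) is not accepted for targets sm_70 and higher from PTX ISA version 6.4 onward, so the .sync form is required. The extra last operand, -1, is the member mask and selects all 32 lanes of the warp.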
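As an alternative to patching the inline PTX, the same step can probably be written with the __shfl_up_sync intrinsic, which has been available since CUDA 9. The following is only a minimal sketch and not the repository's code: the names shfl_add, warp_scan_kernel and run_warp_scan_demo are hypothetical, and it assumes a 1D thread block so that threadIdx.x & 31 gives the lane index.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper (not from surfelwarp): same effect as the inline PTX
// above, written with the warp shuffle intrinsic instead of asm.
__device__ __forceinline__ float shfl_add(float x, int offset) {
    const unsigned full_mask = 0xffffffffu;  // same as the -1 member mask
    // Read x from the lane that is `offset` positions below this one.
    const float up = __shfl_up_sync(full_mask, x, offset);
    // The PTX predicate p is true only when the source lane is in range.
    // Assumption: 1D block, so the lane index is threadIdx.x & 31.
    const int lane = threadIdx.x & 31;
    return (lane >= offset) ? up + x : x;
}

// Inclusive prefix sum over one warp, built from the helper above.
__global__ void warp_scan_kernel(const float* in, float* out) {
    float v = in[threadIdx.x];
    for (int offset = 1; offset < 32; offset <<= 1) {
        v = shfl_add(v, offset);
    }
    out[threadIdx.x] = v;
}

// Host-side driver: scans 32 ones and copies the result into host_out[32].
// Returns false if an allocation or the final copy fails.
bool run_warp_scan_demo(float* host_out) {
    float host_in[32];
    for (int i = 0; i < 32; ++i) host_in[i] = 1.0f;

    float* d_in = nullptr;
    float* d_out = nullptr;
    if (cudaMalloc(&d_in, sizeof(host_in)) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, sizeof(host_in)) != cudaSuccess) {
        cudaFree(d_in);
        return false;
    }
    cudaMemcpy(d_in, host_in, sizeof(host_in), cudaMemcpyHostToDevice);
    warp_scan_kernel<<<1, 32>>>(d_in, d_out);
    // This blocking copy also waits for the kernel to finish.
    const cudaError_t err =
        cudaMemcpy(host_out, d_out, sizeof(host_in), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return err == cudaSuccess;
}
```

Calling run_warp_scan_demo from a main function and compiling with nvcc for the target architecture (for example -arch=sm_86) should leave i + 1 in element i of the output, since the input is all ones. The intrinsic avoids hand-written PTX, so it does not depend on which PTX ISA version the toolkit emits.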

ghost commented 4 months ago

Hi, I am now running into the same problem. Could you tell me where to modify the inline assembly code?