NVIDIA-Genomics-Research / GenomeWorks

SDK for GPU accelerated genome assembly and analysis
https://clara-parabricks.github.io/GenomeWorks/
Apache License 2.0

[cudapoa/cudaaligner] fix compute version to 60 #566

Closed: tijyojwad closed this 3 years ago

tijyojwad commented 3 years ago

Because of the performance issue observed in cudapoa and cudaaligner, the highest compute version that gives the best numbers is compute 60. Update the nvcc flags for cudapoa and cudaaligner to compile for compute 60 only. Accordingly, update the GW README to state that only Pascal and newer architectures are supported.

r-mafi commented 3 years ago

@tijyojwad as you pointed out, the highest compute version that disables yield instructions (the culprit for the regression) is compute_60. I just ran benchmark_cudapoa with -arch=compute_35 and -arch=compute_60 and the performance remained the same, although this is by no means thorough benchmarking. Is supporting only Pascal and newer architectures a serious limitation? If not, it makes sense to switch to compute_60.
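
For anyone wanting to repeat the comparison outside the project build, the two settings could be sketched roughly as below. This is only an illustration: `poa_bench.cu` and the output names are hypothetical placeholders, not files in GenomeWorks, and the real benchmark is built through the project's own build system. Only the two `-arch` values come from this thread.

```shell
# Sketch only: poa_bench.cu and the output names are hypothetical placeholders.
# The two -arch values are the ones compared in this thread.
for arch in compute_35 compute_60; do
    cmd="nvcc -arch=${arch} -o poa_bench_${arch} poa_bench.cu"
    # Print the command instead of running it, so the sketch works without nvcc installed.
    echo "${cmd}"
done
```

Passing a virtual architecture such as `compute_60` to `-arch` embeds PTX for that target, which the driver then JIT-compiles for the GPU actually present, so a binary built this way will not load on pre-Pascal devices.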

tijyojwad commented 3 years ago

@r-mafi, yes, it's okay to update to compute_60 only. For the next release we were thinking of dropping support for GPUs older than Pascal.