RWilton / Arioc

Arioc: GPU-accelerated DNA short-read alignment
BSD 3-Clause "New" or "Revised" License

Hard-coded architecture #13

Closed: r-barnes closed this issue 3 years ago

r-barnes commented 3 years ago

The Linux makefile currently hardcodes the architecture:

-gencode arch=compute_37,code=\"compute_37,sm_37\"

This is problematic for two reasons:

  1. The 3.7 compute capability is deprecated in CUDA 11.
  2. More seriously, this doesn't offer forward compatibility with most GPUs. From the docs:

    The NVIDIA CUDA C++ compiler, nvcc, can be used to generate both architecture-specific cubin files and forward-compatible PTX versions of each kernel. Each cubin file targets a specific compute-capability version and is forward-compatible only with GPU architectures of the same major version number. For example, cubin files that target compute capability 3.0 are supported on all compute-capability 3.x (Kepler) devices but are not supported on compute-capability 5.x (Maxwell) or 6.x (Pascal) devices. For this reason, to ensure forward compatibility with GPU architectures introduced after the application has been released, it is recommended that all applications include PTX versions of their kernels.

RWilton commented 3 years ago

Thank you for describing your difficulty with building Arioc from source.

In fact, the target microarchitecture ("compute capability") is not "hard-coded" in the Linux makefile. It is specified in the makefile in a user-configurable variable named CUDA_CC, as you will see by examining the makefile and by following the example in the Arioc User Guide.
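As a rough illustration of overriding such a variable, the sketch below converts a dotted compute capability (for example 6.1) to the digit-only form seen in the -gencode line quoted in the issue, and prints a make invocation. It assumes that CUDA_CC takes the digit-only form and that it can be overridden on the make command line (ordinary GNU make behavior for plain variable assignments). Neither assumption is confirmed in this thread, so check the makefile and the User Guide. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Arioc.

# Convert a dotted compute capability such as "6.1" to the digit-only form "61".
cc_to_digits() {
    printf '%s\n' "$1" | tr -d '.'
}

# Print (do not run) a make command that overrides CUDA_CC.
# Assumption: the makefile expects the digit-only form, as in compute_37.
make_command() {
    printf 'make CUDA_CC=%s\n' "$(cc_to_digits "$1")"
}

# Example: make_command 6.1
```

On sufficiently recent drivers, `nvidia-smi --query-gpu=compute_cap --format=csv,noheader` should report the dotted value for the installed GPU, which can be passed to `make_command`.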

Furthermore, the Linux makefile, which by default specifies a "compute capability" of 3.7, does generate a valid executable that runs on devices with more recent microarchitectures.
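The reason can be read off the -gencode option quoted in the issue: in nvcc's code= list, an sm_XX entry embeds a cubin for that architecture, while a compute_XX entry embeds PTX, which the driver can JIT-compile for newer devices. The following is a minimal sketch that labels the entries of such a list. It is plain string handling rather than an nvcc feature, and the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: label each entry of a -gencode "code=" list.
#   sm_XX      -> cubin (machine code for that architecture)
#   compute_XX -> PTX (can be JIT-compiled by the driver for newer GPUs)
classify_code_list() {
    # Split the comma-separated list into positional parameters.
    old_ifs=$IFS
    IFS=,
    set -- $1
    IFS=$old_ifs
    for entry in "$@"; do
        case $entry in
            sm_*)      printf '%s: cubin\n' "$entry" ;;
            compute_*) printf '%s: PTX\n' "$entry" ;;
            *)         printf '%s: unknown\n' "$entry" ;;
        esac
    done
}

# Example: classify_code_list "compute_37,sm_37"
```

For the list in the quoted makefile line, the sketch labels compute_37 as PTX and sm_37 as cubin, so the default build carries both forms.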

Your comment suggests that the User Guide and the makefile are not sufficiently clear about how to compile Arioc to target a specific microarchitecture and how this might contribute to performance optimization. We will revisit that documentation and do our best to ensure that it is adequate.