Open hillct opened 1 year ago
The result is identical when specifying the architecture with either -D CUDA_ARCHITECTURES=87 or -D CMAKE_CUDA_ARCHITECTURES=87.
FasterTransformer does not currently support Orin.
When can we expect this to be addressed? Is there a roadmap we should be referring to in this regard?
The build system behavior is a separate problem: it should at least report an unsupported architecture at build time, rather than ignoring the input, allowing a full (seemingly successful) build, and relying on the JIT compiler to report the failure at runtime:
[FT][ERROR] CUDA runtime error: the provided PTX was compiled with an unsupported toolchain.
It seems the Triton Server component has a build for AGX Orin, or at least JetPack 5.1 (https://github.com/triton-inference-server/server/releases/tag/v2.31.0), although the corresponding tritonserver docker image includes a CUDA runtime version and other components that are specifically not compatible with JetPack 5.1. The compatibility drag seems to be CUDA Driver 520, which is clearly due for an update.
OK. So 8.6 (A100) is the highest compute capability we can compile against. What is the lowest compute capability we can compile against? I had a potential vendor offer me an environment involving GPU compute capability 3.7. Is it even worth trying, or is this a waste of my time?
The lowest compute capability we have tested is 6.0. We cannot guarantee that it works on lower compute capabilities.
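For anyone who wants the fail-fast behavior requested earlier in this thread, here is a minimal sketch of a pre-build check that could run before invoking cmake. It is not part of FasterTransformer. The function names (parse_arch, check_archs) are hypothetical, and the bounds are assumptions taken only from this thread: 6.0 as the lowest tested capability and 8.6 as the highest one mentioned as buildable.

```python
import re
from typing import List, Tuple

# Assumed bounds, taken from this thread and not from official documentation:
# 6.0 is the lowest capability reported as tested, 8.6 the highest mentioned.
MIN_TESTED = (6, 0)
MAX_SUPPORTED = (8, 6)


def parse_arch(value: str) -> Tuple[int, int]:
    """Turn a CMake-style architecture such as '87' or '86-real' into (major, minor)."""
    match = re.fullmatch(r"(\d+)(\d)(?:-(?:real|virtual))?", value.strip())
    if match is None:
        raise ValueError(f"unrecognized CUDA architecture: {value!r}")
    return int(match.group(1)), int(match.group(2))


def check_archs(requested: str) -> List[Tuple[int, int]]:
    """Validate a semicolon-separated architecture list before any compilation starts."""
    archs = [parse_arch(item) for item in requested.split(";") if item.strip()]
    if not archs:
        raise ValueError("no CUDA architecture given")
    for major, minor in archs:
        if not (MIN_TESTED <= (major, minor) <= MAX_SUPPORTED):
            raise ValueError(
                f"compute capability {major}.{minor} is outside the assumed "
                f"range {MIN_TESTED[0]}.{MIN_TESTED[1]} to "
                f"{MAX_SUPPORTED[0]}.{MAX_SUPPORTED[1]}"
            )
    return archs


if __name__ == "__main__":
    # Values mirror the cases discussed in this thread: a supported list,
    # Orin (87), and the compute capability 3.7 environment.
    for value in ("80;86", "87", "37"):
        try:
            print(value, "->", check_archs(value))
        except ValueError as err:
            print(value, "-> rejected:", err)
```

With these assumed bounds, both 87 (Orin) and 37 are rejected up front instead of surfacing later as a PTX error at runtime. If the supported range changes, only the two constants need adjusting.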