NVIDIA / TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/index.html
Apache License 2.0

AssertionError: CublasLt version 12.1.3.x or higher required for FP8 execution on Ada. #955

Closed: saurabh-kataria closed this issue 3 months ago

saurabh-kataria commented 3 months ago

What does this mean? My CUDA version is 12.1. Is it too old?

  File "/home/uname/anaconda3/envs/tmp3/lib/python3.8/site-packages/transformer_engine/pytorch/fp8.py", line 560, in fp8_autocast
    FP8GlobalStateManager.fp8_autocast_enter(enabled=enabled,
  File "/home/uname/anaconda3/envs/tmp3/lib/python3.8/site-packages/transformer_engine/pytorch/fp8.py", line 412, in fp8_autocast_enter
    assert fp8_available, reason_for_no_fp8
AssertionError: CublasLt version 12.1.3.x or higher required for FP8 execution on Ada.
[rank0]: Traceback (most recent call last):
[rank0]:   File "pretrain_iter.py", line 410, in <module>
[rank0]:     fire.Fire()   # enables easy commonda line interface
[rank0]:   File "/home/uname/anaconda3/envs/tmp3/lib/python3.8/site-packages/fire/core.py", line 143, in Fire
[rank0]:     component_trace = _Fire(component, args, parsed_flag_args, context, name)
[rank0]:   File "/home/uname/anaconda3/envs/tmp3/lib/python3.8/site-packages/fire/core.py", line 477, in _Fire
[rank0]:     component, remaining_args = _CallAndUpdateTrace(
[rank0]:   File "/home/uname/anaconda3/envs/tmp3/lib/python3.8/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
[rank0]:     component = fn(*varargs, **kwargs)
[rank0]:   File "train.py", line 300, in do_pretrain
[rank0]:     out = model(x)
[rank0]:   File "/home/uname/anaconda3/envs/tmp3/lib/python3.8/contextlib.py", line 74, in inner
[rank0]:     with self._recreate_cm():
[rank0]:   File "/home/uname/anaconda3/envs/tmp3/lib/python3.8/contextlib.py", line 113, in __enter__
[rank0]:     return next(self.gen)
[rank0]:   File "/home/uname/anaconda3/envs/tmp3/lib/python3.8/site-packages/transformer_engine/pytorch/fp8.py", line 560, in fp8_autocast
[rank0]:     FP8GlobalStateManager.fp8_autocast_enter(enabled=enabled,
[rank0]:   File "/home/uname/anaconda3/envs/tmp3/lib/python3.8/site-packages/transformer_engine/pytorch/fp8.py", line 412, in fp8_autocast_enter
[rank0]:     assert fp8_available, reason_for_no_fp8
[rank0]: AssertionError: CublasLt version 12.1.3.x or higher required for FP8 execution on Ada.
aurianer commented 3 months ago

What cuBLAS version are you running? If you are running 12.1.0, it doesn't meet the 12.1.3 requirement.
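One way to check is to query the loaded cublasLt library directly. `cublasLtGetVersion()` is a real cuBLAS API, but the shared-library soname, and the assumption that the returned integer is packed as major*10000 + minor*100 + patch (the same encoding as `CUBLAS_VERSION`), may need adjusting for your install. A minimal sketch:

```python
import ctypes

def decode_cublas_version(ver: int) -> tuple:
    """Decode a packed cuBLAS version integer.

    Assumes the major*10000 + minor*100 + patch encoding used by
    the CUBLAS_VERSION macro, e.g. 120103 -> (12, 1, 3).
    """
    return ver // 10000, (ver % 10000) // 100, ver % 100

def get_cublaslt_version() -> tuple:
    # The exact soname varies by install (e.g. libcublasLt.so.12);
    # adjust the name or set LD_LIBRARY_PATH as needed.
    lib = ctypes.CDLL("libcublasLt.so.12")
    lib.cublasLtGetVersion.restype = ctypes.c_size_t
    return decode_cublas_version(lib.cublasLtGetVersion())

if __name__ == "__main__":
    try:
        print("cublasLt version:", get_cublaslt_version())
    except OSError:
        print("libcublasLt not found; is the CUDA toolkit on your library path?")
```

Anything at or above (12, 1, 3) should satisfy the assertion in the error message.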

saurabh-kataria commented 3 months ago

Let me try with a newer CUDA. I think the issue is that when searching for CUDA 12.1.x on the NVIDIA website, it defaults to the 12.1.0 download, so it is a little tricky for the user to meet the 12.1.3 requirement. https://developer.nvidia.com/cuda-12-1-0-download-archive?target_os=Linux&target_arch=x86_64&Distribution=Debian&target_version=11&target_type=runfile_local

P.S. CUDA 12.3 solved the issue. Thanks!
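For reference, the comparison behind the assertion can be sketched as a single threshold check. The 120103 floor is taken from the "12.1.3" in the error message, and the packed encoding (major*10000 + minor*100 + patch) is an assumption; this is not TransformerEngine's actual code:

```python
# Hypothetical sketch of the version gate behind the AssertionError.
# 12.1.3 packed as major*10000 + minor*100 + patch -> 120103 (assumed encoding).
REQUIRED_FOR_ADA = 12 * 10000 + 1 * 100 + 3

def fp8_cublas_ok(packed_version: int) -> bool:
    """True if the packed cublasLt version meets the FP8-on-Ada floor."""
    return packed_version >= REQUIRED_FOR_ADA

print(fp8_cublas_ok(120100))  # cuBLAS from the CUDA 12.1.0 installer -> False
print(fp8_cublas_ok(120300))  # a 12.3-era cuBLAS -> True
```

This explains the report above: the CUDA 12.1.0 installer's cuBLAS falls just below the floor, while any 12.3 install clears it.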