Alpha-VLLM / LLaMA2-Accessory

An Open-source Toolkit for LLM Development
https://llama2-accessory.readthedocs.io/

ModuleNotFoundError: No module named 'fused_layer_norm_cuda' when used in the project #8

Closed poonehmousavi closed 1 year ago

poonehmousavi commented 1 year ago

I have installed apex, but when running the fine-tuning I get the error in the title. nvidia-smi reports: NVIDIA-SMI 510.47.03, Driver Version: 510.47.03, CUDA Version: 11.6

linziyi96 commented 1 year ago

Could you please provide the installation command you used? Please note that CUDA and CPP modules (including fused_layer_norm_cuda) in apex are only installed when explicitly specified in the command arguments (see https://github.com/nvidia/apex#linux).
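
A quick way to confirm whether the extension was actually built is to ask Python whether the module can be located on the import path. The following is a minimal sketch, not part of apex or LLaMA2-Accessory: the helper name `missing_extensions` is hypothetical, and the only module it checks by default is the one named in the error message.

```python
import importlib.util


def missing_extensions(names=("fused_layer_norm_cuda",)):
    """Return the names from `names` that cannot be located on the import path.

    `missing_extensions` is a hypothetical helper for illustration only.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_extensions()
    if missing:
        # The compiled extension was not installed into this environment.
        print("Not found:", ", ".join(missing))
    else:
        print("All checked extension modules were found.")
```

This only checks that the module can be found, not that it loads correctly, so a module built against a different CUDA or PyTorch version could still fail at import time.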

poonehmousavi commented 1 year ago

I used these commands to install:

    git clone https://github.com/NVIDIA/apex
    cd apex
    pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --global-option="--cpp_ext" --global-option="--cuda_ext" ./

kriskrisliu commented 1 year ago

I faced the same error. This seems to be a CUDA compatibility issue. To resolve it, you can try the following steps.

  1. Install CUDA 11.7 (download the *.run file from NVIDIA).
  2. Set PATH, LD_LIBRARY_PATH, and CUDA_HOME correctly, for instance:
    export PATH=/usr/local/cuda-11.7/bin${PATH:+:${PATH}}  # use the CUDA path installed on your machine
    export LD_LIBRARY_PATH=/usr/local/cuda-11.7/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
    export CUDA_HOME=/usr/local/cuda-11.7

    You can double-check by running:

    echo $PATH
    echo $LD_LIBRARY_PATH
    echo $CUDA_HOME
  3. Go to the apex directory and install with pip; the official instructions are helpful:
    # if pip >= 23.1 (ref: https://pip.pypa.io/en/stable/news/#v23-1) which supports multiple `--config-settings` with the same key... 
    pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./
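
Before rebuilding, it may help to confirm that the `nvcc` found through CUDA_HOME reports the same CUDA release that PyTorch was built with. The sketch below assumes that a mismatch between the two is what breaks the build, which this thread suggests but does not confirm. The helper names `parse_nvcc_release` and `toolkit_matches` are hypothetical; `torch.version.cuda` is the version string PyTorch itself reports.

```python
import os
import re
import shutil
import subprocess


def parse_nvcc_release(text):
    """Extract (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def toolkit_matches(nvcc_text, torch_cuda):
    """True if nvcc's release equals the major.minor that PyTorch reports."""
    release = parse_nvcc_release(nvcc_text)
    match = re.match(r"(\d+)\.(\d+)", torch_cuda or "")
    if release is None or match is None:
        return False
    return release == (int(match.group(1)), int(match.group(2)))


if __name__ == "__main__":
    # Prefer the nvcc under CUDA_HOME; otherwise fall back to the one on PATH.
    cuda_home = os.environ.get("CUDA_HOME")
    nvcc = os.path.join(cuda_home, "bin", "nvcc") if cuda_home else shutil.which("nvcc")
    if not nvcc or not os.path.exists(nvcc):
        print("nvcc not found; check CUDA_HOME and PATH")
    else:
        out = subprocess.run([nvcc, "--version"], capture_output=True, text=True).stdout
        try:
            import torch
        except ImportError:
            print("PyTorch is not installed in this environment")
        else:
            print("nvcc release:", parse_nvcc_release(out))
            print("torch.version.cuda:", torch.version.cuda)
            print("match:", toolkit_matches(out, torch.version.cuda))
```

If the two versions differ, pointing CUDA_HOME at a toolkit that matches the PyTorch build (as in step 2) is the first thing to try.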

Z-MU-Z commented 1 year ago

I hit the same error when installing with pip. I then ran `python setup.py install --cuda_ext --cpp_ext`, and this worked for me.