unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

Conda installation detailed instructions #73

Closed NasonZ closed 10 months ago

NasonZ commented 10 months ago

I'm trying to follow the instructions for installing unsloth in a conda environment, but conda gets stuck when running the install line.

I've tried running it twice; both times it got stuck solving the environment and I stopped after 30 minutes.

$ conda install cudatoolkit xformers bitsandbytes pytorch pytorch-cuda=12.1 -c pytorch -c nvidia -c xformers -c conda-forge -y
Collecting package metadata (current_repodata.json): \ WARNING conda.models.version:get_matcher(546): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.7.1.*, but conda is ignoring the .* and treating it as 1.7.1
done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): - WARNING conda.models.version:get_matcher(546): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.8.0.*, but conda is ignoring the .* and treating it as 1.8.0
WARNING conda.models.version:get_matcher(546): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.9.0.*, but conda is ignoring the .* and treating it as 1.9.0
WARNING conda.models.version:get_matcher(546): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.6.0.*, but conda is ignoring the .* and treating it as 1.6.0
done
Solving environment: | 

Additional system info:

$ nvidia-smi
Mon Jan  8 20:28:55 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A10G                    Off | 00000000:00:1E.0 Off |                    0 |
|  0%   28C    P8              16W / 300W |      4MiB / 23028MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
danielhanchen commented 10 months ago

Do you have mamba?

Maybe try mamba install cudatoolkit xformers bitsandbytes pytorch pytorch-cuda=12.1 -c pytorch -c nvidia -c xformers -c conda-forge -y

danielhanchen commented 10 months ago

Mamba can help with environments that take a long time to solve.
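
If installing mamba isn't an option, a related workaround is conda's libmamba solver, which speeds up dependency resolution; a minimal sketch, assuming a conda release recent enough to ship conda-libmamba-solver:

# Install the libmamba solver into the base environment and make it the default.
$ conda install -n base conda-libmamba-solver
$ conda config --set solver libmamba

# Then retry the original command with the faster solver.
$ conda install cudatoolkit xformers bitsandbytes pytorch pytorch-cuda=12.1 -c pytorch -c nvidia -c xformers -c conda-forge -y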

NasonZ commented 10 months ago

No, I have a miniconda/anaconda installation that was set up via oobabooga.

(base) ubuntu@awsec2:~$ conda activate model_train_env
(model_train_env) ubuntu@awsec2:~$ mamba install cudatoolkit xformers bitsandbytes pytorch pytorch-cuda=12.1 -c pytorch -c nvidia -c xformers -c conda-forge -y
Command 'mamba' not found, did you mean:
  command 'samba' from deb samba (2:4.15.13+dfsg-0ubuntu1.5)
Try: sudo apt install <deb name>
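
For reference, mamba isn't bundled with a standard miniconda/anaconda install; a minimal sketch of one way to add it, assuming the base environment can be modified:

# mamba is not included with miniconda/anaconda by default; install it into base from conda-forge.
$ conda install -n base -c conda-forge mamba

# Afterwards the suggested mamba install command above should resolve much faster.
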
danielhanchen commented 10 months ago

@NasonZ hmmmm, another approach is to install the packages one by one, ignoring pytorch:

conda install cudatoolkit xformers bitsandbytes -c nvidia -c xformers -c conda-forge
NasonZ commented 10 months ago

TLDR:

These are the steps I took to get my unsloth conda env working

$ conda create --name <your_unsloth_env> python=<3.10/3.9>

$ conda install pytorch torchvision torchaudio pytorch-cuda=<12.1/11.8> -c pytorch -c nvidia

$ conda install xformers -c xformers -y

$ pip install bitsandbytes

$ pip install "unsloth[conda] @ git+https://github.com/unslothai/unsloth.git"

So I tried installing the packages one by one, which raised a few issues that I was able to work around.

1) xformers needs Python 3.9 or 3.10 (I had 3.11, as the readme.md didn't specify which Python version was needed).

(model_train_env) ubuntu@awsec2:~/dmyzer/dmyzer-data-generator$ conda install xformers -c xformers -y        
Collecting package metadata (current_repodata.json): done                                                             
Solving environment: failed with initial frozen solve. Retrying with flexible solve.                                  
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.           
Collecting package metadata (repodata.json): done                                                                     
Solving environment: failed with initial frozen solve. Retrying with flexible solve.                                  
Solving environment: -                                                                                                
Found conflicts! Looking for incompatible packages.                                                                   
This can take several minutes.  Press CTRL-C to abort.                                                                
failed                                                                                                                

UnsatisfiableError: The following specifications were found                                                           
to be incompatible with the existing python installation in your environment:                                         

Specifications:

  - xformers -> python[version='>=3.10,<3.11.0a0|>=3.9,<3.10.0a0']

Your python: python=3.11

If python is on the left-most side of the chain, that's the version you've asked for.
When python appears to the right, that indicates that the thing on the left is somehow
not available for the python version you are constrained to. Note that conda will not
change your python version to a different minor version unless you explicitly specify
that.

The following specifications were found to be incompatible with your system:

  - feature:/linux-64::__cuda==12.2=0
  - feature:/linux-64::__glibc==2.35=0
  - feature:|@/linux-64::__glibc==2.35=0
  - python=3.11 -> libgcc-ng[version='>=11.2.0'] -> __glibc[version='>=2.17']
  - xformers -> pytorch=2.0.1 -> __cuda[version='>=11.8']

Your installed version is: 2.35

2) Installing cudatoolkit separately led to issues when installing pytorch afterwards; cudatoolkit is pulled in by pytorch-cuda, so specifying it separately was redundant in my case.

3) Installing bitsandbytes via conda install bitsandbytes -c conda-forge -y led to the same frozen-solve issue outlined originally. Installing via conda install conda-forge::bitsandbytes also didn't work: bitsandbytes threw a load of errors when running from unsloth import FastLanguageModel. I eventually got it running by using the method mentioned in the bitsandbytes repo - pip install bitsandbytes.

I verified that my environment was working by running the TinyLlama notebook.
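
For anyone following these steps, a quick sanity check of the finished environment (a minimal sketch; version numbers will differ) is:

# Confirm torch sees the GPU, and that xformers, bitsandbytes, and unsloth import cleanly.
$ python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
$ python -c "import xformers; print(xformers.__version__)"
$ python -m bitsandbytes
$ python -c "from unsloth import FastLanguageModel"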

danielhanchen commented 10 months ago

Oh my! Thanks so so much for the detailed instructions - I'll be pinning this if you don't mind :) Glad it finally was able to work!!

NasonZ commented 10 months ago

No worries, happy to help others get on board with what looks to be a really useful package :)

findalexli commented 9 months ago

hi there, still getting the following error when running (unsloth) (base) ubuntu@ip-172-31-34-94:~$ mamba install cudatoolkit xformers bitsandbytes pytorch pytorch-cuda=12.1 -c pytorch -c nvidia -c xformers -c conda-forge -y (I checked that I have CUDA 12.1 installed):

Could not solve for environment specs
The following packages are incompatible
└─ xformers is installable with the potential options
   ├─ xformers [0.0.16|0.0.17|...|0.0.24] would require
   │  └─ python >=3.10,<3.11.0a0 , which can be installed;
   ├─ xformers [0.0.16|0.0.17|...|0.0.24] would require
   │  └─ python >=3.9,<3.10.0a0 , which can be installed;
   └─ xformers [0.0.16|0.0.20|0.0.21] conflicts with any installable versions previously reported.
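
The conflict above is the same Python-version constraint described earlier: xformers only resolves against Python 3.9 or 3.10. A minimal sketch of the workaround, assuming a fresh environment is acceptable:

# Create the env with a Python version that xformers supports before installing anything else.
$ conda create --name unsloth_env python=3.10 -y
$ conda activate unsloth_env
$ mamba install cudatoolkit xformers bitsandbytes pytorch pytorch-cuda=12.1 -c pytorch -c nvidia -c xformers -c conda-forge -y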

Gene-Weaver commented 8 months ago

I ran into this error:

Exception has occurred: RuntimeError

        CUDA Setup failed despite GPU being available. Please run the following command to get more information:

        python -m bitsandbytes

And used this combination of the approaches listed above to get things working:

conda create --name unsloth_env python=3.10
conda activate unsloth_env
mamba install xformers pytorch pytorch-cuda=12.1 -c pytorch -c nvidia -c xformers -c conda-forge -y
pip install bitsandbytes
pip install "unsloth[conda] @ git+https://github.com/unslothai/unsloth.git"

felipepenhorate commented 8 months ago

Just a heads up to anyone trying to install the package in a miniconda env and getting errors during the xformers installation because of conflicts: it turns out conda now installs pytorch==2.2.1, which is not compatible with xformers. You need to pin the pytorch version to 2.2.0 to make the installation work properly.

This is what I used:

conda create --name unsloth_env python=3.10
conda activate unsloth_env
conda install pytorch==2.2.0 cudatoolkit torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
conda install xformers -c xformers
pip install bitsandbytes
pip install "unsloth[conda] @ git+https://github.com/unslothai/unsloth.git"
NasonZ commented 8 months ago

@felipepenhorate Thanks for the quick fix, just encountered this issue with conda.

danielhanchen commented 8 months ago

@felipepenhorate Yes thanks so much! I'll update the readme!!

Gene-Weaver commented 8 months ago

Also, because of the triton package requirements, this only works on Linux systems (without compiling your own triton workaround 😬). You can train on Linux and then deploy on other systems using regular Hugging Face workflows. Thanks @danielhanchen!

ppaartha commented 6 months ago

(Quoted @NasonZ's conda setup steps and notes from earlier in this thread.)

Getting this error after trying every type of unsloth env setup. Got stuck on this issue:

/tmp/tmpmemclhbv/main.c: In function ‘list_to_cuuint64_array’:
/tmp/tmpmemclhbv/main.c:354:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
 for (Py_ssize_t i = 0; i < len; i++) {
 ^
/tmp/tmpmemclhbv/main.c:354:3: note: use option -std=c99 or -std=gnu99 to compile your code
/tmp/tmpmemclhbv/main.c: In function ‘list_to_cuuint32_array’:
/tmp/tmpmemclhbv/main.c:365:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
 for (Py_ssize_t i = 0; i < len; i++) {

subprocess.CalledProcessError: Command '['/usr/bin/gcc', '/tmp/tmporgwe35u/main.c', '-O3', '-I/miniconda3/envs/LLM/lib/python3.10/site-packages/triton/common/../third_party/cuda/include', '-I/miniconda3/envs/LLM/include/python3.10', '-I/tmp/tmporgwe35u', '-shared', '-fPIC', '-lcuda', '-o', '/tmp/tmporgwe35u/cuda_utils.cpython-310-x86_64-linux-gnu.so', '-L/lib64', '-L/lib', '-L/lib64', '-L/lib']' returned non-zero exit status 1.

danielhanchen commented 6 months ago

Oh maybe outdated gcc?

ppaartha commented 6 months ago

Oh maybe outdated gcc?

My gcc version is gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44). Could that be the actual problem?

danielhanchen commented 6 months ago

@ppaartha Ye I think that's wayyy too old!!
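
gcc 4.8 predates default C99 support, which matches the compile error above. A minimal sketch of one possible workaround (an assumption, not a step confirmed in this thread) is to pull a newer GCC from conda-forge into the env instead of upgrading the system compiler:

# Assumed fix: install a newer GCC/G++ from conda-forge inside the active env.
$ conda install -c conda-forge gcc gxx
# Point Triton's JIT at the conda compiler if it still falls back to /usr/bin/gcc.
$ export CC="$CONDA_PREFIX/bin/gcc"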

Lipapaldl commented 5 months ago

from triton.common.build import libcuda_dirs

ModuleNotFoundError: No module named 'triton.common'

danielhanchen commented 5 months ago

@Lipapaldl Is your Triton version 3.0.0?
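
For context, triton.common appears to have been removed in Triton 3.0, so that import only exists on the 2.x series; a minimal sketch for checking and, if needed, pinning the version (an assumed fix, not one confirmed in this thread):

# Check the installed Triton version.
$ python -c "import triton; print(triton.__version__)"
# If it prints 3.x, pinning below 3.0 restores the triton.common module.
$ pip install "triton<3.0"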

ArrangingFear56 commented 4 months ago

I ran pip install xformers==0.0.24 to keep my existing torch version, since the latest xformers requires torch==2.3.0, and conda install xformers -c xformers doesn't seem to work anymore.
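
A quick way to confirm the torch/xformers pairing actually matches (a minimal sketch; xformers ships this diagnostic module):

# Print the installed torch and xformers versions side by side.
$ python -c "import torch, xformers; print(torch.__version__, xformers.__version__)"
# xformers' built-in report shows whether its CUDA kernels were built against this torch.
$ python -m xformers.info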

danielhanchen commented 4 months ago

I'm planning to write a better guide for conda installs in the near future

WasamiKirua commented 4 months ago

I ran pip install xformers==0.0.24 to keep my existing torch version, since the latest xformers requires torch==2.3.0, and conda install xformers -c xformers doesn't seem to work anymore.

oh my god. I'm trying hard to use unsloth locally but it's a pain. I followed the conda instructions but was forced to downgrade xformers; I've tried the version printed in the error as well as yours, but no luck, the conflict that triggered the error still seems to be there. I'm done

ArrangingFear56 commented 4 months ago

@WasamiKirua Are you running Windows? If so, you may need to run it in WSL 2 instead; that's what eventually worked for me.
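
For anyone in the same boat, a minimal sketch of getting WSL 2 running (assuming Windows 10 21H2 or later, or Windows 11, with a recent NVIDIA driver so the GPU is visible inside WSL):

# From an elevated PowerShell or Command Prompt: install WSL 2 with the default Ubuntu distro.
wsl --install
# After rebooting, the conda/mamba steps from this thread apply unchanged inside the Ubuntu shell;
# nvidia-smi should work there if the Windows-side NVIDIA driver is new enough.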

richardxoldman commented 4 months ago

conda create --name unsloth_env python=3.10
conda activate unsloth_env
conda install pytorch==2.2.0 cudatoolkit torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
pip install xformers==0.0.24
pip install bitsandbytes
pip install "unsloth[conda] @ git+https://github.com/unslothai/unsloth.git"

It works for me (Linux)

danielhanchen commented 4 months ago

Apologies for the Conda issues - I do know it can sometimes be painful - another option is to copy-paste our Kaggle install instructions here: https://www.kaggle.com/danielhanchen/kaggle-gemma2-9b-unsloth-notebook which might work

Seneko commented 3 months ago

On Windows, does this only work in WSL?

ArrangingFear56 commented 3 months ago

On Windows, does this only work in WSL?

In my experience, yes. There would be unresolvable dependencies otherwise.

Seneko commented 3 months ago

Thanks @ArrangingFear56 Will try later in WSL. Currently fine tuning in colab.

danielhanchen commented 3 months ago

I did make the installation somewhat better in https://github.com/unslothai/unsloth?tab=readme-ov-file#-installation-instructions! Hope this makes things better!

OFouda1 commented 3 months ago

@NasonZ Thanks Thanks Thanks