chaidiscovery / chai-lab

Chai-1, SOTA model for biomolecular structure prediction
https://www.chaidiscovery.com

CPU only #19

Open stianale opened 1 week ago

stianale commented 1 week ago

Hi, congratulations on this nice model.

I am wondering if it is possible to run the model using CPU only and if this has been tested?

navvye commented 1 week ago

Was wondering this as well. I think the problem would be that the model would be too slow to run on a normal (consumer grade) CPU.

arogozhnikov commented 1 week ago

and if this has been tested?

it wasn't :)

I think the problem would be that the model would be too slow to run on a normal (consumer grade) CPU.

That's my expectation as well: it would be faster and cheaper to rent an A100 in the cloud than to use a CPU (even a server CPU).

stianale commented 1 week ago

I see. :/ However, RoseTTAFold2NA finishes a run in about 10 mins for me using CPU only on my desktop, which has an AMD Cezanne, so I have a hard time seeing why Chai-1 would struggle so much more.

guruace commented 1 week ago

Thanks, Alex, for the great work! I tried a single protein with 359 aa and a ligand with formula "C88H138N8O65" and molecular weight 2347. It took 2 hrs 40 min, which seems far too long. While it was running, I checked with nvidia-smi: GPU usage was about 17%, and VRAM usage was only 1.4 GB out of 48 GB total. Something must be going wrong, no?

arogozhnikov commented 1 week ago

Hi Robert, that sounds insanely long, and yes, I'd expect higher mem consumption.

  1. Please share your setup (python, pytorch, GPU).
  2. Other thing to check: did you change params, maybe too many trunk iterations?
  3. Also, we report time taken by trunk and denoising, does it take too long?
guruace commented 1 week ago

Dear Alex, here is my setup:

  1. conda create -n chai-lab python=3.11
  2. pip install git+https://github.com/chaidiscovery/chai-lab.git
  3. python >>> import torch (this raised an error)
  4. conda install pytorch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 pytorch-cuda=12.1 -c pytorch -c nvidia
  5. python >>> import torch; torch.cuda.is_available() (OK)
  6. python examples/predict_structure.py (OK with your FASTA content; it took a few minutes)
  7. python examples/predict_structure.py with my 359 aa protein and oligosialic acid (DP8). It took 2 hrs 40 min on my 13900K CPU, 64 GB main memory, one NVIDIA GA102GL RTX A6000 GPU, Ubuntu 22.04.5. I am now running with oligosialic acid DP12 and expect it to take more than 3 hrs!
  8. I did not change any of your params. The only things I changed were the protein sequence and the ligand SMILES (converted from PDB to SMILES by cheminfo.org).
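
A quick way to confirm which interpreter and which torch build are actually in use (relevant when both pip and conda installs are involved) is a short report like the one below. This is only a sketch: the function name `environment_report` is made up for illustration, and it relies only on standard `sys`, `importlib`, and `torch.cuda` calls.

```python
import importlib.util
import sys


def environment_report() -> dict:
    """Collect the interpreter path and, if importable, torch/CUDA details."""
    report = {
        "python": sys.version.split()[0],
        # Shows whether the intended conda/virtual env is the one running.
        "executable": sys.executable,
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }
    if report["torch_installed"]:
        try:
            import torch
        except Exception as exc:  # a broken install can fail at import time
            report["torch_error"] = repr(exc)
            return report
        report["torch"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
        if report["cuda_available"]:
            report["gpu"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

If `executable` does not point inside the chai-lab environment, the environment was probably not activated.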
guruace commented 1 week ago

This is a screenshot after running for 50 minutes (python examples/predict_structure.py): [image: after_50_minutes]

This is the GPU usage: [image: GPU usage]

stianale commented 1 week ago

Giving up on CPU for now. I tried Colab (free version), but the run seems to get interrupted when I try to run the model:

!git clone https://github.com/chaidiscovery/chai-lab.git

%cd chai-lab/

!pip install git+https://github.com/chaidiscovery/chai-lab.git

!python examples/predict_structure.py

/usr/lib/python3.10/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
  self.pid = os.fork()
/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py:1601: FutureWarning: clean_up_tokenization_spaces was not set. It will be set to True by default. This behavior will be depracted in transformers v4.45, and will be then set to False by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
  warnings.warn(
Loading checkpoint shards: 100% 2/2 [00:00<00:00, 4.18it/s]
^C

arogozhnikov commented 1 week ago

@guruace

Thanks for details, that's very helpful.

arogozhnikov commented 1 week ago

@stianale did you use T4 or A100? T4 won't work (ESM can't fit, and no bfloat16 support)

stianale commented 1 week ago

@stianale did you use T4 or A100? T4 won't work (ESM can't fit, and no bfloat16 support)

I only have access to T4, I don't have Colab Pro. :(

I also tried to make Chai-1 work in Kaggle, but the error "RuntimeError (cutlassF: no kernel found to launch)" puts an end to that. In Kaggle, 2 x T4 and P100 GPUs are available.

guruace commented 1 week ago

Quoting arogozhnikov's reply:

  • Reference numbers from an A100 for the default example: around a minute of setup, plus 3 sec/iteration for the trunk and 0.3 sec/iteration for diffusion.
  • A6000: I'm looking at the spec from NVIDIA and I don't see bfloat16 FLOPS. My guess: it exists for compatibility, but throughput is too poor to include in the specs.
  • FASTA issue: please open a separate issue for this. U+FEFF may just be part of your input.
  • GPU memory: I don't see a Python process with Chai-1, which looks wrong.
  • torch setup: the pip version should just work; there is no need to install anything with conda. Could you have forgotten to activate the conda env?

Here are the details. I think something must be wrong with my ligand representation (SMILES string): it took 12 hours and 20 minutes to produce the predicted protein structures, but with no ligands in them:

Screenshot from 2024-09-14 01-50-44
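
For a rough sense of scale, the A100 reference numbers quoted in this comment can be turned into a back-of-the-envelope estimate. The sketch below is only arithmetic on those three figures; the iteration counts passed in (3 trunk iterations, 200 diffusion iterations) are placeholder assumptions, not values confirmed in this thread, so substitute whatever your own run uses.

```python
# Reference figures quoted in the thread for an A100 on the default example.
SETUP_SECONDS = 60.0                   # "around a minute" of setup
TRUNK_SECONDS_PER_ITERATION = 3.0      # 3 sec/iteration for the trunk
DIFFUSION_SECONDS_PER_ITERATION = 0.3  # 0.3 sec/iteration of diffusion


def estimate_a100_seconds(trunk_iterations: int, diffusion_iterations: int) -> float:
    """Estimate wall-clock seconds from the quoted per-iteration timings."""
    return (
        SETUP_SECONDS
        + TRUNK_SECONDS_PER_ITERATION * trunk_iterations
        + DIFFUSION_SECONDS_PER_ITERATION * diffusion_iterations
    )


def slowdown(observed_seconds: float, expected_seconds: float) -> float:
    """How many times slower an observed run is than the estimate."""
    return observed_seconds / expected_seconds


if __name__ == "__main__":
    # Placeholder iteration counts (assumptions); adjust to your settings.
    expected = estimate_a100_seconds(trunk_iterations=3, diffusion_iterations=200)
    observed = 2 * 3600 + 40 * 60  # the 2 hrs 40 min run reported above
    print(f"estimate: {expected:.0f} s, slowdown: {slowdown(observed, expected):.1f}x")
```

The reference figures are for the default example, so a larger input will legitimately take longer; the comparison only indicates whether a run is off by an order of magnitude.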

guruace commented 1 week ago

I am now trying to test rdkit to see if it can read my ligand string in SMILES

arogozhnikov commented 1 week ago

2 x T4 and P100 GPU are available.

Neither of them has bfloat16 :/
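
The bfloat16 point can be checked from a GPU's CUDA compute capability. The sketch below assumes that native bfloat16 arithmetic starts at compute capability 8.0 (Ampere); the capability table reflects NVIDIA's published values for the cards mentioned in this thread, and the helper names are made up for illustration. It says nothing about bfloat16 throughput on a given card.

```python
# Compute capabilities (major, minor) as published by NVIDIA for these GPUs.
KNOWN_CAPABILITIES = {
    "P100": (6, 0),
    "T4": (7, 5),
    "A100": (8, 0),
    "RTX A6000": (8, 6),
}


def has_native_bf16(capability: tuple) -> bool:
    """Assumption: native bfloat16 requires compute capability >= 8.0."""
    return tuple(capability) >= (8, 0)


def current_gpu_capability():
    """Return (major, minor) for GPU 0, or None if torch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)
```

Calling `has_native_bf16(current_gpu_capability())` on a machine with a CUDA GPU gives a quick yes/no before starting a long run; guard against a `None` result on machines without torch or CUDA.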

guruace commented 1 week ago

Hi Alex:

1. Regarding your comment "FASTA issue: please open separate issue for this. U+FEFF may be just a part of your input": the U+FEFF problem occurred when I downloaded the xxx.smiles file from the cheminfo.org conversion, opened it with Ubuntu's built-in text editor gedit, copied the SMILES string, and pasted it into predict_structure.py, which was also open in gedit. The problem went away when I opened xxx.smiles in 010 Editor instead and copied the SMILES string from there into predict_structure.py in gedit. This fix was suggested by GPT-4o.

2. I checked my ligand SMILES string in RDKit and confirmed that RDKit could not create a 3D conformer from my ligand (because it is too big or too complex?). This is worth noting, since it will happen from time to time. When I used RDKit's ETKDG method, it successfully created a 3D molecule from my ligand:

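from rdkit import Chem
from rdkit.Chem import AllChem

# Placeholder SMILES (ethanol); substitute the real ligand string here.
ligand_smiles = "CCO"
mol = Chem.MolFromSmiles(ligand_smiles)
if mol is None:
    raise ValueError("RDKit could not parse the SMILES string.")
mol_with_hs = Chem.AddHs(mol)  # explicit hydrogens are needed for 3D embedding

# Try embedding with the ETKDG method, which often works better for complex molecules
params = AllChem.ETKDGv3()
result = AllChem.EmbedMolecule(mol_with_hs, params)  # 0 on success, -1 on failure

if result == 0:
    print("3D conformer successfully generated with ETKDG.")
else:
    print("Failed to generate 3D conformer with ETKDG.")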
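
On the U+FEFF problem from point 1: rather than switching editors, the byte order mark can be stripped in code before the SMILES string is used. This is a minimal sketch using only the Python standard library; the helper names `clean_smiles` and `read_smiles_file` are made up for illustration.

```python
BOM = "\ufeff"  # U+FEFF, the byte order mark some tools prepend to text files


def clean_smiles(raw: str) -> str:
    """Remove U+FEFF characters and surrounding whitespace from a pasted string."""
    return raw.replace(BOM, "").strip()


def read_smiles_file(path: str) -> str:
    """Read the first line of a .smiles file, tolerating a leading BOM."""
    # The "utf-8-sig" codec decodes UTF-8 and skips a leading BOM if present.
    with open(path, encoding="utf-8-sig") as handle:
        return clean_smiles(handle.readline())
```

Passing a pasted string through `clean_smiles` before handing it to the prediction script avoids the invisible character regardless of which editor was used.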