newtonjoo / deepfold

Protein 3D Structure Prediction with DeepFold
Apache License 2.0

Not able to run run_from_fasta.py #2

Open rmagesh148 opened 4 months ago

rmagesh148 commented 4 months ago

I am running into a "could not find CIFs" error when running run_from_fasta.py; please find the command below.

Command I am running:

python run_from_fasta.py --fasta_paths ./example_data/fasta/1aac_1_A.fasta --model_names model1 --model_paths params/model1.npz --data_dir /data-dir/ --output_dir ./out

E0530 15:38:56.284935 140079475242816 templates.py:849] Could not find CIFs in ./pdb_mmcif/mmcif_files
Traceback (most recent call last):
  File "run_from_fasta.py", line 269, in <module>
    app.run(main)
  File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "run_from_fasta.py", line 186, in main
    obsolete_pdbs_path=FLAGS.obsolete_pdbs_path)
  File "/app/deepfold/deepfold/data/templates.py", line 850, in __init__
    raise ValueError(f'Could not find CIFs in {self._mmcif_dir}')
ValueError: Could not find CIFs in ./pdb_mmcif/mmcif_files
root@24862c4675ea:/app/deepfold# 
cy3 commented 4 months ago

Thank you for reporting this error. We have addressed the issue with the path settings and have committed the updated version. Please review the changes and let us know if you encounter any further issues.

rmagesh148 commented 4 months ago

Thank you for the response, but I am seeing the same error again. I pulled the latest code, rebuilt the image, and tried running it again. Please see the attached screenshot.

rmagesh148 commented 4 months ago
(Screenshot of the same error attached.)
cy3 commented 4 months ago

You need to replace /path/to/database with your actual database path.

rmagesh148 commented 4 months ago

I tried with the complete local path, /Users/rmagesh/GradSchool/Research-Phd/deepfold/data_dir, and I also tried with the actual path inside the Docker container, which is `/app/deepfold/data_dir`.

Unfortunately, both runs are failing. Thanks!
rmagesh148 commented 4 months ago

python run_from_fasta.py --fasta_paths ./example_data/fasta/1aac_1_A.fasta --model_names model1 --model_paths params/model1.npz --data_dir /app/deepfold/data_dir --output_dir ./out

python run_from_fasta.py --fasta_paths ./example_data/fasta/aa/1aac_1_A.fasta --model_names model1 --model_paths params/model1.npz --data_dir /Users/rmagesh/GradSchool/Research-Phd/deepfold/data_dir --output_dir ./out

cy3 commented 4 months ago

What is the output of python run_from_fasta.py --fasta_paths ./example_data/fasta/aa/1aac_1_A.fasta --model_names model1 --model_paths params/model1.npz --data_dir /app/deepfold/data_dir --output_dir ./out?

rmagesh148 commented 4 months ago
(Screenshot of the output attached.)
cy3 commented 4 months ago

Could you please check whether the database files are properly downloaded and accessible inside the Docker container?

rmagesh148 commented 4 months ago
(Screenshot attached.)

I don't see any files inside the data_dir folder. All I did was build the Docker image and start it with docker run, without --gpus, since I am trying it on my local machine.

(Screenshot attached.)

I see that all the database files are properly loaded.

cy3 commented 4 months ago

Your local machine should have the following database files:

$DOWNLOAD_DIR/                             # Total: ~ 2.62 TB (download: 556 GB)
    bfd/                                   # ~ 1.8 TB (download: 271.6 GB)
        # 6 files.
    mgnify/                                # ~ 120 GB (download: 67 GB)
        mgy_clusters_2022_05.fa
    pdb70/                                 # ~ 56 GB (download: 19.5 GB)
        # 9 files.
    pdb_mmcif/                             # ~ 238 GB (download: 43 GB)
        mmcif_files/
            # About 199,000 .cif files.
        obsolete.dat
    pdb_seqres/                            # ~ 0.2 GB (download: 0.2 GB)
        pdb_seqres.txt
    small_bfd/                             # ~ 17 GB (download: 9.6 GB)
        bfd-first_non_consensus_sequences.fasta
    uniref30/                              # ~ 206 GB (download: 52.5 GB)
        # 7 files.
    uniprot/                               # ~ 105 GB (download: 53 GB)
        uniprot.fasta
    uniref90/                              # ~ 67 GB (download: 34 GB)
        uniref90.fasta

(Check the download_all_data.sh script to download these files.)

You can use the -v option to mount this folder to your Docker container.
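
For example, if the databases were downloaded to a folder on the host (the host path, container path, and image tag below are placeholders for your own setup):

docker run -it -v /path/to/downloaded/data:/app/deepfold/data deepfold:latest bash

and then pass --data_dir /app/deepfold/data to run_from_fasta.py inside the container.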

rmagesh148 commented 4 months ago

May I know why I need to download these database files onto my machine/server just to get the features of a FASTA file?

cy3 commented 4 months ago

The standard pipeline requires an extensive database search to obtain the features of a FASTA file.

rmagesh148 commented 2 months ago

Hi again! I am running into an issue where I am not able to install the mpi-jax dependency; the installation is failing. Could you please help me with that?

(Screenshot of the installation error attached.)

Thanks!

rmagesh148 commented 1 month ago

Hi @cy3: Could you please respond? I ran the download_all_data.sh script and these are the folders I have. I believe some of the folders are missing. Could you please take a look? Thanks!

(Screenshot of the downloaded folders attached.)
rmagesh148 commented 1 month ago

Command Executed:

python run_from_fasta.py --fasta_paths ./example_data/fasta/aa/1aac_1_A.fasta --model_names model1 --model_paths ./data/params/model1.npz --data_dir ./data/ --output_dir ./out

I ran the above command and it failed with the error below. Please take a look and help me out with this.

root@597d3d2c0192:/app/deepfold# python run_from_fasta.py --fasta_paths ./example_data/fasta/aa/1aac_1_A.fasta --model_names model1 --model_paths ./data/params/model1.npz --data_dir ./data/ --output_dir ./out
/opt/conda/lib/python3.7/site-packages/absl/flags/_validators.py:233: UserWarning: Flag --data_dir has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
  mark_flag_as_required(flag_name, flag_values)
/opt/conda/lib/python3.7/site-packages/absl/flags/_validators.py:233: UserWarning: Flag --uniref90_database_path has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
  mark_flag_as_required(flag_name, flag_values)
/opt/conda/lib/python3.7/site-packages/absl/flags/_validators.py:233: UserWarning: Flag --mgnify_database_path has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
  mark_flag_as_required(flag_name, flag_values)
/opt/conda/lib/python3.7/site-packages/absl/flags/_validators.py:233: UserWarning: Flag --pdb70_database_path has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
  mark_flag_as_required(flag_name, flag_values)
/opt/conda/lib/python3.7/site-packages/absl/flags/_validators.py:233: UserWarning: Flag --template_mmcif_dir has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
  mark_flag_as_required(flag_name, flag_values)
/opt/conda/lib/python3.7/site-packages/absl/flags/_validators.py:233: UserWarning: Flag --max_template_date has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
  mark_flag_as_required(flag_name, flag_values)
/opt/conda/lib/python3.7/site-packages/absl/flags/_validators.py:233: UserWarning: Flag --obsolete_pdbs_path has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
  mark_flag_as_required(flag_name, flag_values)
I0819 18:26:00.632040 139699742209856 templates.py:869] Using precomputed obsolete pdbs ./data/pdb_mmcif/obsolete.dat.
E0819 18:26:00.635931 139699742209856 hhblits.py:82] Could not find HHBlits database ./data/uniclust30/UniRef30_2020_06/UniRef30_2020_06
Traceback (most recent call last):
  File "run_from_fasta.py", line 280, in <module>
    app.run(main)
  File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "run_from_fasta.py", line 210, in main
    use_small_bfd=False)
  File "/app/deepfold/deepfold/data/pipeline.py", line 110, in __init__
    databases=[bfd_database_path, uniclust30_database_path])
  File "/app/deepfold/deepfold/data/tools/hhblits.py", line 83, in __init__
    raise ValueError(f'Could not find HHBlits database {database_path}')
ValueError: Could not find HHBlits database ./data/uniclust30/UniRef30_2020_06/UniRef30_2020_06
root@597d3d2c0192:/app/deepfold# 
rmagesh148 commented 1 month ago

@newtonjoo @cy3 Could you please take a look at it and help me with it as soon as possible? Thanks!

cy3 commented 1 month ago

Could you verify the subfolder name under data/uniclust30? It may have been updated to a more recent version of the database. If that's the case, please adjust the following line to reflect the correct folder path:

Line to be updated: run_from_fasta.py#L73

rmagesh148 commented 1 month ago

Thank you for the response. This is what the uniclust30 subfolder looks like; there is no UniRef30_2020_06/UniRef30_2020_06 in it:

(deepfold-2) magesh@lambda-ai:~$ cd /media/exxact1/deepfold/data/
(deepfold-2) magesh@lambda-ai:/media/exxact1/deepfold/data$ cd uniclust30/
(deepfold-2) magesh@lambda-ai:/media/exxact1/deepfold/data/uniclust30$ ls -lrt
total 4
drwxr-xr-x 2 root root 4096 Aug 20 15:17 uniclust30_2018_08
(deepfold-2) magesh@lambda-ai:/media/exxact1/deepfold/data/uniclust30$ cd uniclust30_2018_08/
(deepfold-2) magesh@lambda-ai:/media/exxact1/deepfold/data/uniclust30/uniclust30_2018_08$ ls -lrth
total 87G
-rw------- 1 528745 9100 3.6G Oct 11  2018 uniclust30_2018_08_cs219.ffdata
-rw------- 1 528745 9100 341M Oct 11  2018 uniclust30_2018_08_cs219.ffindex
-rw------- 1 528745 9100  65G Oct 11  2018 uniclust30_2018_08_a3m.ffdata
-rw------- 1 528745 9100 359M Oct 11  2018 uniclust30_2018_08_a3m.ffindex
-rw------- 1 528745 9100  14G Oct 11  2018 uniclust30_2018_08_hhm.ffdata
-rw------- 1 528745 9100 7.8M Oct 11  2018 uniclust30_2018_08_hhm.ffindex
-rw------- 1 528745 9100   19 Oct 11  2018 uniclust30_2018_08.cs219.sizes
-rw------- 1 528745 9100 3.8G Oct 11  2018 uniclust30_2018_08.cs219
-rw------- 1 528745 9100 417M Oct 11  2018 uniclust30_2018_08_a3m_db.index
lrwxrwxrwx 1 528745 9100   29 Oct 11  2018 uniclust30_2018_08_a3m_db -> uniclust30_2018_08_a3m.ffdata
-rw------- 1 528745 9100 9.0M Oct 11  2018 uniclust30_2018_08_hhm_db.index
lrwxrwxrwx 1 528745 9100   29 Oct 11  2018 uniclust30_2018_08_hhm_db -> uniclust30_2018_08_hhm.ffdata
-rw------- 1 528745 9100  767 Oct 11  2018 uniclust30_2018_08_md5sum
(deepfold-2) magesh@lambda-ai:/media/exxact1/deepfold/data/uniclust30/uniclust30_2018_08$ 
cy3 commented 1 month ago

Changing UniRef30_2020_06/UniRef30_2020_06 to uniclust30_2018_08/uniclust30_2018_08 will resolve this issue.
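
In other words, the hard-coded uniclust30 path in the script has to match the folder you actually downloaded. A hypothetical sketch of the change around run_from_fasta.py#L73 (the exact code in the script may differ):

import os

data_dir = './data'  # placeholder for the --data_dir value

# Path the script currently expects (not present in this download):
old_path = os.path.join(data_dir, 'uniclust30', 'UniRef30_2020_06', 'UniRef30_2020_06')

# Path matching the uniclust30_2018_08 folder listed above:
new_path = os.path.join(data_dir, 'uniclust30', 'uniclust30_2018_08', 'uniclust30_2018_08')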

rmagesh148 commented 1 month ago

Hi @cy3: Thank you for your inputs. I am now able to get features.pkl for a given FASTA file, and I would like to understand a few more things about the output.

root@a88ef311693f:/app/deepfold# cd ./out/
root@a88ef311693f:/app/deepfold/out# ls
prot_00000
root@a88ef311693f:/app/deepfold/out# cd prot_00000/
root@a88ef311693f:/app/deepfold/out/prot_00000# ls
features.pkl  msas  ranked_0.pdb  ranking_debug.json  relaxed_model1.pdb  res_plddt_model1.txt  result_model1.pkl  timings.json  unrelaxed_model1.pdb
root@a88ef311693f:/app/deepfold/out/prot_00000# cd msas/
root@a88ef311693f:/app/deepfold/out/prot_00000/msas# ls
bfd_uniclust_hits.a3m  mgnify_hits.sto  pdb70_hits.hhr  uniref90_hits.sto
root@a88ef311693f:/app/deepfold/out/prot_00000/msas# cd ..
root@a88ef311693f:/app/deepfold/out/prot_00000# python   

>>> import pandas as pd
>>> a = pd.read_pickle('features.pkl')

>>> type(a)
<class 'dict'>

>>> a.keys()
dict_keys(['aatype', 'between_segment_residues', 'domain_name', 'residue_index', 'seq_length', 'sequence', 'deletion_matrix_int', 'msa', 'num_alignments', 'template_aatype', 'template_all_atom_masks', 'template_all_atom_positions', 'template_domain_names', 'template_sequence', 'template_sum_probs'])

>>> atom_pos = a['template_all_atom_positions']

>>> atom_pos.shape
(20, 105, 37, 3) 

In this features.pkl output, I am not sure which key gives me the embeddings of the protein. I assume template_all_atom_positions is the one that would give me the embeddings. If so, when I check the shape of template_all_atom_positions it is (20, 105, 37, 3), and I am not sure where this 20 is coming from. Could you please help me with that? Thank you! Much appreciated!

cy3 commented 1 month ago

The features.pkl file is used as the input for the model (the 20 is the number of templates). For embeddings, you can use result_model1.pkl.
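
To make the 20 concrete, the shape can be unpacked like this (a sketch; 37 is the fixed atom37 set of heavy-atom positions per residue, and 105 is the length of this particular chain):

>>> import pandas as pd
>>> a = pd.read_pickle('features.pkl')
>>> num_templates, num_res, num_atoms, num_coords = a['template_all_atom_positions'].shape
>>> (num_templates, num_res, num_atoms, num_coords)
(20, 105, 37, 3)
>>> # 20 templates, 105 residues, 37 atom positions per residue, xyz coordinates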

rmagesh148 commented 1 month ago
>>> a = pd.read_pickle('result_model1.pkl')
>>> a.shape
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'dict' object has no attribute 'shape'
>>> type(a)
<class 'dict'>
>>> a.keys()
dict_keys(['distogram', 'experimentally_resolved', 'masked_msa', 'predicted_lddt', 'structure_module', 'plddt'])

Which key gives me the representation/embeddings of the protein? Thanks!

cy3 commented 1 month ago

In this case, use data['representations']['structure_module'], which contains a 384-dimensional vector representation per amino acid.

rmagesh148 commented 1 month ago
>>> data = pd.read_pickle('result_model1.pkl')
>>> data.keys()
dict_keys(['distogram', 'experimentally_resolved', 'masked_msa', 'predicted_lddt', 'structure_module', 'plddt'])

There is no 'representations' key for the embeddings; I only see the keys shown above.

>>> data['structure_module'].keys()
dict_keys(['final_atom_mask', 'final_atom_positions', 'sidechains'])
cy3 commented 1 month ago

We have made a slight update to the FASTA run file to return representations as well. I apologize for the mistake.
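
Once you pull the update and re-run, reading the embeddings should look roughly like this (a sketch; it assumes the updated result_model1.pkl gains a 'representations' entry holding the structure-module output, one 384-dimensional vector per residue):

>>> import pandas as pd
>>> data = pd.read_pickle('result_model1.pkl')
>>> emb = data['representations']['structure_module']  # assumed location after the update
>>> emb.shape
(105, 384)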

rmagesh148 commented 1 month ago

Thank you for the response. When I try to run DeepFold, it runs on the CPU instead of my GPU servers.

I0822 21:53:06.209507 139760043013952 xla_bridge.py:353] Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker: 
2024-08-22 21:53:06.237648: E external/org_tensorflow/tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:267] failed call to cuInit: CUDA_ERROR_SYSTEM_NOT_READY: system not yet initialized
I0822 21:53:06.238188 139760043013952 xla_bridge.py:353] Unable to initialize backend 'cuda': FAILED_PRECONDITION: No visible GPU devices.
I0822 21:53:06.238405 139760043013952 xla_bridge.py:353] Unable to initialize backend 'rocm': NOT_FOUND: Could not find registered platform with name: "rocm". Available platform names are: Interpreter Host CUDA
I0822 21:53:06.238757 139760043013952 xla_bridge.py:353] Unable to initialize backend 'tpu': module 'jaxlib.xla_extension' has no attribute 'get_tpu_client'
I0822 21:53:06.238858 139760043013952 xla_bridge.py:353] Unable to initialize backend 'plugin': xla_extension has no attributes named get_plugin_device_client. Compile TensorFlow with //tensorflow/compiler/xla/python:enable_plugin_device set to true (defaults to false) to enable this.
W0822 21:53:06.238930 139760043013952 xla_bridge.py:360] No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
/app/deepfold/deepfold/model/mapping.py:49: FutureWarning: jax.tree_flatten is deprecated, and will be removed in a future release. Use jax.tree_util.tree_flatten instead.
  values_tree_def = jax.tree_flatten(values)[1]
/app/deepfold/deepfold/model/mapping.py:53: FutureWarning: jax.tree_unflatten is deprecated, and will be removed in a future release. Use jax.tree_util.tree_unflatten instead.
  return jax.tree_unflatten(values_tree_def, flat_axes)
/app/deepfold/deepfold/model/mapping.py:124: FutureWarning: jax.tree_flatten is deprecated, and will be removed in a future release. Use jax.tree_util.tree_flatten instead.
  flat_sizes = jax.tree_flatten(in_sizes)[0]
2024-08-22 21:57:20.910848: E external/org_tensorflow/tensorflow/compiler/xla/service/slow_operation_alarm.cc:65] 
********************************
[Compiling module jit_apply_fn] Very slow compile?  If you want to file a bug, run with envvar XLA_FLAGS=--xla_dump_to=/tmp/foo and attach the results.

Do you have any suggestions as to why it is not picking up my GPUs?

cy3 commented 1 month ago
  1. Did you run Docker with GPU support using commands such as docker run --gpus all?

  2. What is the result of running nvidia-smi? Is the GPU properly recognized?
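
For a quick check, running nvidia-smi through Docker with GPU support should list your GPUs (the image tag here is a placeholder for whatever you built):

docker run --rm --gpus all deepfold:latest nvidia-smi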

rmagesh148 commented 1 month ago

Hey @cy3: I was able to fix the GPU issue and it is using the GPU now. But I have a few questions.

I0822 21:26:18.952993 139760043013952 inference_pipeline.py:54] processing file ./example_data/fasta/aa/1aac_1_A.fasta...
I0822 21:26:18.953632 139760043013952 jackhmmer.py:130] Launching subprocess "jackhmmer -o /dev/null -A /tmp/tmpzdixdawj/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 ./example_data/fasta/aa/1aac_1_A.fasta ./data/uniref90/uniref90.fasta"
I0822 21:26:18.974498 139760043013952 utils.py:36] Started Jackhmmer (uniref90.fasta) query
I0822 21:36:51.096411 139760043013952 utils.py:40] Finished Jackhmmer (uniref90.fasta) query in 632.121 seconds

The above run with 8 CPUs took ~600 seconds to finish, and when I ran it with 64 CPUs it took almost the same time; I am not sure how to improve the performance.

I0830 05:49:55.808584 139870974736192 run_from_fasta.py:244] Using random seed 181129 for the data pipeline
I0830 05:49:55.809070 139870974736192 inference_pipeline.py:54] processing file ./example_data/fasta/aa/ASDSF.fasta...
I0830 05:49:55.809468 139870974736192 jackhmmer.py:130] Launching subprocess "jackhmmer -o /dev/null -A /tmp/tmpvljhqrgk/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 64 -N 1 ./example_data/fasta/aa/ASDSF.fasta ./data/uniref90/uniref90.fasta"
I0830 05:49:55.828207 139870974736192 utils.py:36] Started Jackhmmer (uniref90.fasta) query
I0830 06:00:16.147896 139870974736192 utils.py:40] Finished Jackhmmer (uniref90.fasta) query in 620.319 seconds

So the run over these two FASTA files took about an hour to finish even with GPUs. Is there any way to improve the performance further?

rmagesh148 commented 1 month ago

Is DeepFold CPU-bound? Why are these subprocesses running on the CPU instead of the GPU? @cy3

cy3 commented 1 month ago

Jackhmmer, HHblits, and HHsearch are part of the feature-search pipeline for FASTA files. They run on the CPU, and these searches take a lot of time. The process can be improved by generating the features.pkl file on a separate CPU machine and then providing it to the pipeline.

rmagesh148 commented 1 month ago

@cy3: I have about 20k proteins to run; how long do you think it might take just to create the features.pkl files? In order to create those files, I need the ~5 TB database to be set up, right?

cy3 commented 1 month ago

If the pickle file is ready, the GPU machine doesn't need the entire database (only the PDB files for templates are required). Additionally, use as many CPU cores as possible; with a single process, processing 20k proteins would take approximately 15k hours. Another workaround is to utilize OpenProteinSet, which provides precomputed MSAs and templates.
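
If you do run the searches yourself, one way to use many cores is to process several FASTA files concurrently. Below is a rough sketch (the FASTA directory, worker count, and flag values are placeholders based on the commands earlier in this thread; each concurrent job needs enough RAM and I/O bandwidth for the databases):

import glob
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor

FASTA_DIR = "./fastas"  # placeholder: directory holding the FASTA files
N_PARALLEL = 8          # number of pipelines to run at once

def run_pipeline(fasta_path):
    # One CPU-bound feature-search run per FASTA file, each with its own output directory.
    name = os.path.splitext(os.path.basename(fasta_path))[0]
    cmd = [
        "python", "run_from_fasta.py",
        "--fasta_paths", fasta_path,
        "--model_names", "model1",
        "--model_paths", "./data/params/model1.npz",
        "--data_dir", "./data/",
        "--output_dir", f"./out/{name}",
    ]
    return subprocess.run(cmd).returncode

if __name__ == "__main__":
    fastas = sorted(glob.glob(os.path.join(FASTA_DIR, "*.fasta")))
    with ThreadPoolExecutor(max_workers=N_PARALLEL) as pool:
        for fasta, rc in zip(fastas, pool.map(run_pipeline, fastas)):
            print(fasta, "exit code:", rc)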