PaddlePaddle / PaddleHelix

Bio-Computing Platform Featuring Large-Scale Representation Learning and Multi-Task Deep Learning (“螺旋桨” bio-computing toolkit)

demo_6zcy: ccd prediction (demo_6zcy.json) by helixfold3 is ok, but smiles (demo_6zcy_smiles.json) failed #337

Open Samuel-gwb opened 1 week ago

Samuel-gwb commented 1 week ago

Great work! I've successfully predicted the protein-SM complex structure using demo_6zcy.json, and the result is similar to the predicted structure provided in demo_output. However, running with demo_6zcy_smiles.json failed. Could you please help me find a solution? I ran "sh run_infer.sh", and the error message is:
##################################################################

PaddlePaddle commit: fbf852dd832bc0e63ae31cd4aa37defd829e4c03
FLAGS_new_einsum: True
args: Namespace(bf16_infer=False, seed=None, logging_level='DEBUG', model_name='allatom_demo', init_model='init_models/HelixFold3-240814.pdparams', precision='fp32', amp_level='O1', infer_times=1, diff_batch_size=1, input_json='data/demo_6zcy_smiles.json', output_dir='./output', ccd_preprocessed_path='/data/database/ccd_preprocessed_etkdg.pkl.gz', jackhmmer_binary_path='/home/gwb/miniconda3/envs/helixfold/bin/jackhmmer', hhblits_binary_path='/home/gwb/miniconda3/envs/helixfold/bin/hhblits', hhsearch_binary_path='/home/gwb/miniconda3/envs/helixfold/bin/hhsearch', kalign_binary_path='/home/gwb/miniconda3/envs/helixfold/bin/kalign', hmmsearch_binary_path='/home/gwb/miniconda3/envs/helixfold/bin/hmmsearch', hmmbuild_binary_path='/home/gwb/miniconda3/envs/helixfold/bin/hmmbuild', nhmmer_binary_path='/home/gwb/miniconda3/envs/helixfold/bin/nhmmer', uniprot_database_path='/data/database/uniprot/uniprot.fasta', pdb_seqres_database_path='/data/database/pdb_seqres/pdb_seqres.txt', uniref90_database_path='/data/database/uniref90/uniref90.fasta', mgnify_database_path='/data/database/mgnify/mgy_clusters_2022_05.fa', bfd_database_path='/data/database/small_bfd/bfd-first_non_consensus_sequences.fasta', small_bfd_database_path='/data/database/small_bfd/bfd-first_non_consensus_sequences.fasta', uniclust30_database_path='/data/database/uniclust30/uniclust30_2018_08/uniclust30_2018_08', rfam_database_path='/data/database/Rfam-14.9_rep_seq.fasta', template_mmcif_dir='/data/database/pdb_mmcif/mmcif_files', max_template_date='2020-05-14', obsolete_pdbs_path='/data/database/pdb_mmcif/obsolete.dat', preset='reduced_dbs', maxit_binary='/home/gwb/RationalDesign/helixfold3/maxit-v11.100-prod-src/bin/maxit')
[OBABEL] Temporary file created: /tmp/tmpjptejgmh.mol2
Failed to convert ligand entity 1: {'type': 'ligand', 'smiles': 'CNC(=O)c1nn(C)c2ccc(Nc3nccc(n3)n4cc(N[C@@H]5CCNC5)c(C)n4)cc12', 'count': 1}, Python argument types in
    rdkit.Chem.rdmolops.RemoveAllHs(NoneType)
did not match C++ signature:
    RemoveAllHs(RDKit::ROMol mol, bool sanitize=True)
Traceback (most recent call last):
  File "/home/gwb/RationalDesign/helixfold3/inference.py", line 637, in <module>
    main(args)
  File "/home/gwb/RationalDesign/helixfold3/inference.py", line 442, in main
    all_entitys = preprocess_json_entity(args.input_json, args.output_dir)
  File "/home/gwb/RationalDesign/helixfold3/inference.py", line 87, in preprocess_json_entity
    all_entitys = preprocess.online_json_to_entity(json_path, out_dir)
  File "/home/gwb/RationalDesign/helixfold3/infer_scripts/preprocess.py", line 290, in online_json_to_entity
    raise RuntimeError(f'[Error] Failed to convert {len(error_ids)}/{len(entities)} entities')
RuntimeError: [Error] Failed to convert 1/2 entities
(helixfold) gwb@node01:~/RationalDesign/helixfold3$ vi /home/gwb/RationalDesign/helixfold3/inference.py
(helixfold) gwb@node01:~/RationalDesign/helixfold3$ vi /home/gwb/RationalDesign/helixfold3/infer_scripts/preprocess.py
(helixfold) gwb@node01:~/RationalDesign/helixfold3$ python
Python 3.9.19 | packaged by conda-forge | (main, Mar 20 2024, 12:50:21)
[GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from rdkit import Chem
>>> smiles = 'CNC(=O)c1nn(C)c2ccc(Nc3nccc(n3)n4cc(N[C@@H]5CCNC5)c(C)n4)cc12'
>>> mol = Chem.MolFromSmiles(smiles)
>>> if mol is None:
...     print("Failed to create molecule from SMILES")
... else:
...     print("Molecule created successfully")
...
Molecule created successfully

########################################################
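
For anyone hitting the same trace: the `RemoveAllHs(NoneType)` message means a `None` was passed where an RDKit molecule was expected, and `None` is what RDKit's string and file parsers return when they fail. Below is a minimal sketch of failing early with a clearer message. The names `parse_or_raise` and `LigandParseError` are hypothetical and not part of HelixFold3.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class LigandParseError(ValueError):
    """Raised when a parsing stage yields None instead of a molecule (hypothetical name)."""


def parse_or_raise(parser: Callable[[str], Optional[T]], source: str, stage: str) -> T:
    """Call parser(source) and raise a descriptive error if it returns None.

    `parser` is any callable that returns None on failure, which is how
    RDKit readers such as Chem.MolFromSmiles behave.
    """
    result = parser(source)
    if result is None:
        raise LigandParseError(f"{stage} returned None for {source!r}")
    return result
```

With RDKit installed, a call like `parse_or_raise(Chem.MolFromMol2File, "/tmp/ligand.mol2", "mol2 read")` would report which stage produced the `None`. Whether the pipeline reads the temporary mol2 file this way is an assumption on my part.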

RyanGarciaLI commented 1 week ago

Hi,

It seems that your program failed to generate the ligand conformation from SMILES via openbabel or rdkit, so the model hasn't started running yet. May I ask which openbabel version you are using? Also, tools like openbabel and rdkit generate conformations randomly. Did you try multiple times?
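
The retry suggestion can be sketched as a small loop over several seeds. This is only an illustration: `generate` stands in for whatever conformer routine is used (hypothetical, assumed to return `None` on failure), and nothing here is HelixFold3's own code.

```python
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


def first_successful_seed(
    generate: Callable[[int], Optional[T]],
    attempts: int = 5,
    base_seed: int = 0,
) -> Optional[Tuple[int, T]]:
    """Try generate(seed) for consecutive seeds; return (seed, result) or None."""
    for offset in range(attempts):
        seed = base_seed + offset
        result = generate(seed)
        if result is not None:
            return seed, result
    return None


# Stand-in generator for demonstration: fails for seeds 0 and 1.
def flaky(seed: int) -> Optional[str]:
    return "conformer" if seed >= 2 else None


print(first_successful_seed(flaky))  # (2, 'conformer')
```

If every seed fails with an identical message, the failure is probably deterministic (for example a parsing or version problem) rather than an unlucky random draw.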

If the conformation is generated successfully, you should see logs like the following.

[image]

Samuel-gwb commented 1 week ago

Yes, ligand conformation generation from SMILES failed. I tried quite a few times, with the same failure message. The openbabel/rdkit versions are:
openbabel 3.1.1 py39h2d01fe1_9 conda-forge
rdkit 2024.3.5 pypi_0 pypi
rdkit-pypi 2022.9.5 pypi_0 pypi

BTW: my CUDA is 11.8 rather than 12.0, with cuDNN 8.4.0 and Paddle 2.6.1.
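
The version listing above shows both `rdkit` and `rdkit-pypi` installed from PyPI. Both ship the same `rdkit` import package, so having the two side by side may be worth ruling out. Here is a rough sketch for checking this from Python using only the standard library. The helper names are made up for illustration, and conda-installed packages may not always be visible through `importlib.metadata`.

```python
from importlib import metadata
from typing import Dict, Iterable


def installed_versions(names: Iterable[str]) -> Dict[str, str]:
    """Return {distribution name: version} for the names that are installed."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # not installed, or not visible to importlib.metadata
    return found


def has_rdkit_conflict(found: Dict[str, str]) -> bool:
    """True when both PyPI distributions that provide `rdkit` are present."""
    return "rdkit" in found and "rdkit-pypi" in found


found = installed_versions(["rdkit", "rdkit-pypi", "openbabel"])
print(found)
print("possible rdkit conflict:", has_rdkit_conflict(found))
```

If the conflict flag is true, one option is to uninstall both distributions and reinstall a single one, then rerun the SMILES demo to see whether the error changes.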