sokrypton / ColabFold

Making Protein folding accessible to all!
MIT License

AF2 colab error using template #218

Open mahdikalhori opened 2 years ago

mahdikalhori commented 2 years ago

Expected Behavior

Successful structure prediction by AF2 using only one specific template structure (representing one known conformational state, e.g. the open state, of an ortholog) to bias the prediction toward a single functional conformational state. Ion channels, for example, can adopt open, closed, and desensitized states.

Current Behavior

The run fails with the following error:

WARNING: found GPU Tesla K80: limited to total length < 1000
Downloading alphafold2 weights to .: 100%|██████████| 3.82G/3.82G [00:39<00:00, 103MB/s]
2022-04-26 09:11:23,125 Running colabfold 1.3.0 (d6b0670552ce3f39e20b69f9e01521ca39562336)
2022-04-26 09:11:23,128 Found 8 citations for tools or databases
2022-04-26 09:11:31,246 Query 1/1: ASIC_open_AF_69ea9 (length 527)
COMPLETE: 100%|██████████| 150/150 [elapsed: 00:02 remaining: 00:00]
2022-04-26 09:11:39,516 Sequence 0 found templates: [b'wtn4_A' b'wtn4_A' b'wtn4_B' b'wtn4_C']
2022-04-26 09:11:40,506 Running model_1

KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/ml_collections/config_dict/config_dict.py in __getitem__(self, key)
    902     try:
--> 903       field = self._fields[key]
    904       if isinstance(field, FieldReference):

KeyError: 'data'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
6 frames
/usr/local/lib/python3.7/dist-packages/ml_collections/config_dict/config_dict.py in __getattr__(self, attribute)
    826     try:
--> 827       return self[attribute]
    828     except KeyError as e:

/usr/local/lib/python3.7/dist-packages/ml_collections/config_dict/config_dict.py in __getitem__(self, key)
    908     except KeyError as e:
--> 909       raise KeyError(self._generate_did_you_mean_message(key, str(e)))
    910

KeyError: "'data'"

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
in ()
     51     pair_mode=pair_mode,
     52     stop_at_score=float(100),
---> 53     prediction_callback=prediction_callback,
     54 )

/usr/local/lib/python3.7/dist-packages/colabfold/batch.py in run(queries, result_dir, num_models, num_recycles, model_order, is_complex, model_type, msa_mode, use_templates, custom_template_path, use_amber, keep_existing_results, rank_by, pair_mode, data_dir, host_url, stop_at_score, recompile_padding, recompile_all_models, zip_results, prediction_callback, save_single_representations, save_pair_representations, training, use_gpu_relax, stop_at_score_below)
   1185     stop_at_score_below=stop_at_score_below,
   1186     prediction_callback=prediction_callback,
-> 1187     use_gpu_relax=use_gpu_relax,
   1188 )
   1189 except RuntimeError as e:

/usr/local/lib/python3.7/dist-packages/colabfold/batch.py in predict_structure(prefix, result_dir, feature_dict, is_complex, use_templates, sequences_lengths, crop_len, model_type, model_runner_and_params, do_relax, rank_by, random_seed, stop_at_score, stop_at_score_below, prediction_callback, use_gpu_relax)
    254     model_name,
    255     crop_len,
--> 256     use_templates,
    257 )
    258 else:

/usr/local/lib/python3.7/dist-packages/colabfold/batch.py in batch_input(input_features, model_runner, model_name, crop_len, use_templates)
    186 ) -> model.features.FeatureDict:
    187     model_config = model_runner.config
--> 188     eval_cfg = model_config.data.eval
    189     crop_feats = {k: [None] + v for k, v in dict(eval_cfg.feat).items()}
    190

/usr/local/lib/python3.7/dist-packages/ml_collections/config_dict/config_dict.py in __getattr__(self, attribute)
    827       return self[attribute]
    828     except KeyError as e:
--> 829       raise AttributeError(e)
    830
    831   def __setitem__(self, key, value):

AttributeError: "'data'"

Steps to Reproduce (for bugs)

Please make sure to reproduce the issue after a "Factory Reset" in Colab. If running locally, update your local installation of colabfold_batch to the newest version. Please provide your input if you can share it.

ColabFold Output (for bugs)

Please make sure to also post the complete ColabFold output. You can use gist.github.com for large output.

Context

I am trying to obtain full-length predicted structures for each functional state of some targeted ion channels, forcing the prediction toward each state with defined structures that represent only the desired state.

Your Environment

Include as many relevant details about the environment you experienced the bug in.

mahdikalhori commented 2 years ago

I would truly appreciate your help in solving this issue with templates.

Thanks, Mahdi

milot-mirdita commented 2 years ago

Could you please upload the query and all settings used when you ran into this error?

mahdikalhori commented 2 years ago

Thank you so much for the reply!

Query sequence (P78348|ASIC1_HUMAN Acid-sensing ion channel 1 OS=Homo sapiens OX=9606 GN=ASIC1):

MELKAEEEEVGGVQPVSIQAFASSSTLHGLAHIFSYERLSLKRALWALCFLGSLAVLLCVCTERVQYYFHYHHVTKLDEVAASQLTFPAVTLCNLNEFRFSQVSKNDLYHAGELLALLNNRYEIPDTQMADEKQLEILQDKANFRSFKPKPFNMREFYDRAGHDIRDMLLSCHFRGEVCSAEDFKVVFTRYGKCYTFNSGRDGRPRLKTMKGGTGNGLEIMLDIQQDEYLPVWGETDETSFEAGIKVQIHSQDEPPFIDQLGFGVAPGFQTFVACQEQRLIYLPPPWGTCKAVTMDSDLDFFDSYSITACRIDCETRYLVENCNCRVHMPGDAPYCTPEQYKECADPALDFLVEKDQEYCVCEMPCNLTRYGKELSMVKIPSKASAKYLAKKFNKSEQYIGENILVLDIFFEVLNYETIEQKKAYEIAGLLGDIGGQMGLFIGASILTVLELFDYAYEVIKHKLCRRGKCQKEAKRSSADKGVALSLDDVKRHNPCESLRGHPAGMTYAANILPHHPARGTFEDFTC

Template structure: the chicken isoform representing the conformational state of interest, attached as 4ntw.zip, which contains 4ntw.cif downloaded from the RCSB PDB. (Note that the ion channel protein to be used as the template is chain A; the other two sequences in the file are toxin peptide modulators of the ion channel.)

The only settings changed from the defaults of the ColabFold: AlphaFold2 using MMseqs2 notebook are:

checking the use_amber 4ntw.zip

choosing Alphafold2_multimer_v2

tubiana commented 2 years ago

I got the same problem, but not with templates. If you choose Alphafold2_multimer_v2, the software expects two sequences in the FASTA, separated by ":". For a single sequence you should use ptm or auto.

I haven't managed to make multimer_v2 work with templates. If you did in the meantime, could you share your feedback? :)
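The rule of thumb above (multimer only when the query has several chains separated by ":", otherwise ptm or auto) can be sketched as a small pre-flight check. This is only an illustration: `choose_model_type` is a hypothetical helper, not part of ColabFold, and the strings it returns are placeholder labels rather than verified notebook option names.

```python
def choose_model_type(query_sequence: str) -> str:
    """Suggest a model family from the number of chains in the query.

    Assumption: chains are separated by ':' as described in this thread.
    The return values are illustrative labels, not ColabFold option names.
    """
    # Split on ':' and ignore empty pieces such as a trailing separator.
    chains = [chain.strip() for chain in query_sequence.split(":") if chain.strip()]
    if not chains:
        raise ValueError("query sequence is empty")
    # More than one chain means a complex; a single chain means a monomer.
    return "multimer" if len(chains) > 1 else "ptm"


if __name__ == "__main__":
    print(choose_model_type("MELKAEEEEVGGVQPVSIQAF"))        # single chain
    print(choose_model_type("MELKAEEEEVGG:VQPVSIQAFASSS"))   # two chains
```

Running a check like this before starting the job would flag a single-chain query paired with a multimer model, which is the combination reported in this issue.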