paulaberry / beta_mining

A Python package to mine the AlphaFold2 monomer database for secondary structure predictions.
GNU General Public License v3.0

AlphaFold predictions produce a blank header #1

Open jmrussell opened 2 years ago

jmrussell commented 2 years ago

Hi Paula!

I'm trying to run beta mining for a project here, but when create_model_metainfo is run, prody returns an empty "pdb_head" variable. These are predictions I generated with our in-house AF2 install. Any idea why that might be the case?

Jonathon

Traceback (most recent call last):
  File "/n/core/Bioinformatics/analysis/Zanders/sez/cbio.sez.100/results/beta_mining/bin/./beta_mining", line 124, in <module>
    main()
  File "/n/core/Bioinformatics/analysis/Zanders/sez/cbio.sez.100/results/beta_mining/bin/./beta_mining", line 112, in main
    beta_mining_algorithm.main(settings)
  File "/n/core/Bioinformatics/analysis/Zanders/sez/cbio.sez.100/results/beta_mining/bin/../beta_mining/beta_mining_algorithm.py", line 205, in main
    analyze_structure(file, config_settings, target_parameters, output_files)
  File "/n/core/Bioinformatics/analysis/Zanders/sez/cbio.sez.100/results/beta_mining/bin/../beta_mining/beta_mining_algorithm.py", line 44, in analyze_structure
    model, meta_dictionary, af_object, af_sequence = beta_mining_functions.create_model_metainfo(filename)
  File "/n/core/Bioinformatics/analysis/Zanders/sez/cbio.sez.100/results/beta_mining/bin/../beta_mining/beta_mining_functions.py", line 222, in create_model_metainfo
    "database": pdb_head["polymers"][0].dbrefs[0].database,
TypeError: 'NoneType' object is not subscriptable

paulaberry commented 2 years ago

Hi Jonathon, can you give me an example of the .pdb prediction file header, or an entire file? I made the scripts to work on the EMBL AlphaFold2 database files, and the various methods of extracting the metadata from the headers are finicky. With an example I can try to make it more robust or put in a bypass for this step.
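
For illustration only, here is a minimal sketch of what such a bypass could look like. It is not the package's actual code: the helper name first_dbref_database and the "unknown" fallback are hypothetical, and the only thing taken from the traceback above is the lookup chain pdb_head["polymers"][0].dbrefs[0].database. The sketch assumes pdb_head is either None or a dict-like object, and uses SimpleNamespace objects as stand-ins for the parsed header entries.

```python
from types import SimpleNamespace


def first_dbref_database(pdb_head, default="unknown"):
    """Return the database name of the first DBREF entry, or `default`.

    Hypothetical helper: it walks the same chain that fails in the
    traceback, but falls back to `default` at every step that is missing.
    """
    if not pdb_head:
        # No header at all (None or empty dict).
        return default
    polymers = pdb_head.get("polymers") or []
    if not polymers:
        return default
    dbrefs = getattr(polymers[0], "dbrefs", None) or []
    if not dbrefs:
        return default
    return getattr(dbrefs[0], "database", None) or default


# Header-less file: the lookup falls back instead of raising TypeError.
print(first_dbref_database(None))

# Stand-in for a parsed header with one polymer and one DBREF entry.
fake_head = {
    "polymers": [SimpleNamespace(dbrefs=[SimpleNamespace(database="UniProt")])]
}
print(first_dbref_database(fake_head))
```

The same pattern would have to be repeated for every other metadata field that is read from the header, so a single early check for a missing header may turn out to be simpler.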

jmrussell commented 1 year ago

Hi Paula,

The PDB files I'm using have no header information (they're from ESMFold). Is a header required to run the script at all?
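
A quick way to confirm that a file has no header is to look for header records before the first coordinate record. The sketch below uses only the standard library and relies on the PDB convention that the record name occupies the first six columns of each line. The function name has_header_records and the specific list of record names checked are assumptions made for this example, not part of beta_mining.

```python
# Record names treated as "header" information in this sketch (an assumption;
# extend the tuple if other records matter for your files).
HEADER_RECORDS = ("HEADER", "TITLE", "COMPND", "SOURCE", "DBREF", "SEQRES")
# Records that mark the start of the coordinate section.
COORDINATE_RECORDS = ("ATOM", "HETATM", "MODEL")


def has_header_records(lines):
    """Return True if a header record appears before the first coordinate record."""
    for line in lines:
        record = line[:6].strip()  # PDB record name lives in columns 1-6
        if record in COORDINATE_RECORDS:
            return False
        if record in HEADER_RECORDS:
            return True
    return False


# Usage with in-memory lines; an open file handle can be passed the same way.
with_header = [
    "HEADER    EXAMPLE",
    "ATOM      1  N   MET A   1       0.000   0.000   0.000  1.00 50.00           N",
]
coordinates_only = [
    "ATOM      1  N   MET A   1       0.000   0.000   0.000  1.00 50.00           N",
]
print(has_header_records(with_header))
print(has_header_records(coordinates_only))
```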