LPDI-EPFL / masif

MaSIF- Molecular surface interaction fingerprints. Geometric deep learning to decipher patterns in molecular surfaces.
Apache License 2.0

Cannot download large PDB structures #30

Open tristanbrown opened 3 years ago

tristanbrown commented 3 years ago

Bio.PDB.PDBList() cannot download structures with more than 62 chains or more than 99,999 ATOM lines in the 'pdb' (.ent) format. Attempting this fails with a "Desired structure doesn't exist" error.

A couple of the other file_format options do allow these downloads, but it is not clear how to use those formats in the downstream data_preparation steps. It would be very helpful to be able to use one of them in the MaSIF pipeline.
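To make the request concrete, here is a rough sketch of the kind of fallback I have in mind: try the 'pdb' format first and switch to 'mmCif' when no file arrives. The names fetch_with_fallback, biopython_download and load_structure are hypothetical helpers, not part of MaSIF or Biopython. The only Biopython calls used are PDBList.retrieve_pdb_file, PDBParser and MMCIFParser. Whether the rest of data_preparation accepts a structure loaded this way is exactly the open question, so this covers only the download and parsing step.

```python
import os


def biopython_download(pdb_id, out_dir, file_format):
    """Download one entry with Biopython and return the path it reports."""
    # Imported lazily so the fallback logic below can be used without Biopython.
    from Bio.PDB import PDBList

    return PDBList().retrieve_pdb_file(
        pdb_id, pdir=out_dir, file_format=file_format
    )


def fetch_with_fallback(pdb_id, out_dir, download=biopython_download,
                        formats=("pdb", "mmCif")):
    """Try each format in order; return (path, format) for the first success.

    `download` is any callable taking (pdb_id, out_dir, file_format).
    Assumption: a failed download leaves no file at the returned path
    (or returns nothing), so success is checked on disk rather than trusted.
    """
    os.makedirs(out_dir, exist_ok=True)
    for fmt in formats:
        path = download(pdb_id, out_dir, fmt)
        if path and os.path.isfile(path):
            return path, fmt
    raise FileNotFoundError(
        f"{pdb_id}: no file obtained for formats {list(formats)}"
    )


def load_structure(path, fmt, structure_id):
    """Parse the downloaded file with the parser matching its format."""
    from Bio.PDB import MMCIFParser, PDBParser

    parser = MMCIFParser(QUIET=True) if fmt == "mmCif" else PDBParser(QUIET=True)
    return parser.get_structure(structure_id, path)
```

Usage would be something like path, fmt = fetch_with_fallback(pdb_id, out_dir) followed by load_structure(path, fmt, pdb_id). Small entries would keep going through the existing 'pdb' path, and only the oversized ones would take the mmCIF route.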