Open KevinCrp opened 1 year ago
Hi @KevinCrp
PDB is the primary format for protein graphs, as it is what the community typically uses. Molecules are comparatively simpler to parse: PDB files record a lot of protein-specific fields that I don't believe are explicit in Mol2 files.
For example, in a Mol2 file, chains and residue types have to be inferred from atom types and connectivity. I believe that if you had an ALA with an unresolved/missing Cb atom, it would not be possible to distinguish it from a GLY. Furthermore, Mol2 files don't contain B-factors, occupancy, etc.
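To make the difference concrete, here is a minimal Python sketch, not code from this library. It reads the fixed-width columns of a PDB ATOM record, where chain, residue name, occupancy and B-factor are explicit, and then shows why inferring a residue type from atom names alone is ambiguous. The names `parse_atom_record`, `candidate_residues` and `HEAVY_ATOMS` are hypothetical, the sample line is made up for illustration, and the inference step assumes PDB-style atom names are available, which a Mol2 file may not guarantee.

```python
# Heavy-atom names for two residue types only (illustrative, not a full table).
HEAVY_ATOMS = {
    "GLY": frozenset({"N", "CA", "C", "O"}),
    "ALA": frozenset({"N", "CA", "C", "O", "CB"}),
}

# A made-up ATOM record laid out in the standard fixed-width PDB columns.
SAMPLE = "ATOM      2  CA  ALA A   1      11.104   6.134  -6.504  1.00 12.50           C"


def parse_atom_record(line: str) -> dict:
    """Slice one fixed-width PDB ATOM record into its named fields."""
    return {
        "atom_name": line[12:16].strip(),     # columns 13-16
        "residue_name": line[17:20].strip(),  # columns 18-20
        "chain_id": line[21],                 # column 22
        "residue_number": int(line[22:26]),   # columns 23-26
        "occupancy": float(line[54:60]),      # columns 55-60
        "b_factor": float(line[60:66]),       # columns 61-66
    }


def candidate_residues(observed_atoms: set) -> list:
    """Return residue types whose heavy-atom set contains every observed atom.

    Atoms may be missing from the structure, so any residue type whose
    template is a superset of the observed atoms remains a candidate.
    """
    return sorted(
        name for name, atoms in HEAVY_ATOMS.items() if observed_atoms <= atoms
    )


record = parse_atom_record(SAMPLE)
print(record["residue_name"], record["chain_id"], record["b_factor"])

# Backbone atoms only: could be a GLY, or an ALA whose CB is unresolved.
print(candidate_residues({"N", "CA", "C", "O"}))
```

With the PDB record, the residue name is read directly from the line. With atom names alone, a backbone-only residue matches both templates, which is the ALA/GLY ambiguity described above.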
This is not to say it's impossible; it's certainly doable. However, I don't have bandwidth to implement this myself. If you want to make a PR to implement this I'd be more than happy to support you.
Files used to construct protein graphs must be in PDB format, whereas molecular graphs may also be constructed from MOL2 or SDF files.
Is it possible to add a protein graph constructor that takes a MOL2 file? I don't understand why protein graphs are limited to PDB files.
I have tested converting MOL2 to PDB to construct a protein graph, but the conversion does not always work well.