uibcdf / MolSysMT

Open source library to work with molecular systems
https://www.uibcdf.org/MolSysMT/

Proteins, small molecules and entities names #25

Open dprada opened 3 years ago

dprada commented 3 years ago

MolSysMT has to give a name to proteins, small molecules, and entities. These names can be extracted from an MMTF file, but what happens with other forms? We probably need to implement tools in Sabueso to find these names, together with other attributes, in files and databases.
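One way to picture the lookup described above is a chain of name sources tried in order: first whatever the file itself carries, then external lookups. The sketch below is only an illustration under stated assumptions. `resolve_name`, `name_from_file_metadata`, and `name_from_local_table` are hypothetical names, not MolSysMT or Sabueso functions, and the lookup table is a toy stand-in for a real database query.

```python
from typing import Callable, Optional, Sequence

# Hypothetical type: a name source takes an identifier (for example a
# ligand code) and returns a name, or None when it has nothing to offer.
NameSource = Callable[[str], Optional[str]]


def resolve_name(identifier: str, sources: Sequence[NameSource]) -> Optional[str]:
    """Return the first non-empty name found, trying sources in order."""
    for source in sources:
        name = source(identifier)
        if name and name.strip():
            return name.strip()
    # No source produced a name: report that explicitly instead of guessing.
    return None


# Illustrative stand-ins, not real MolSysMT or Sabueso calls.
def name_from_file_metadata(identifier: str) -> Optional[str]:
    # Assumption: this form carries no name metadata, so nothing is returned.
    return None


def name_from_local_table(identifier: str) -> Optional[str]:
    # Toy table standing in for a database query.
    table = {"ATP": "ADENOSINE-5'-TRIPHOSPHATE", "HOH": "WATER"}
    return table.get(identifier.upper())


sources = [name_from_file_metadata, name_from_local_table]
print(resolve_name("atp", sources))
print(resolve_name("XYZ", sources))
```

Returning `None` when every source comes up empty keeps an unreliable guess from being stored as if it were a real name.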

dprada commented 3 years ago

We should turn off entity name extraction for now, since the names cannot be extracted reliably and this is causing problems. Let's keep it that way until Sabueso can solve this problem.

LMMV commented 3 years ago

It would be worth exploring these resources:

DrugBank is an annotated database of drug and drug target information. It contains extensive data on the nomenclature, ontology, chemistry, structure, function, action, pharmacology, pharmacokinetics, metabolism, and pharmaceutical properties of both small molecule and large molecule (biotech) drugs.

Ligand Expo contains chemical and structural information about small molecules within the structure entries of the Protein Data Bank.

. . .

Drug- and compound-centered analyses often need to map attributes from multiple drug databases, a need that various groups have faced.

The CompoundDB4j Neo4j repository integrates two of the most prominent open-source drug databases, DrugBank and ChEMBL, with the goal of providing an integrated data visualization and analysis tool for drug discovery studies (https://github.com/ambf0632/CompoundDB4j).

This very interesting article, although from 2015, lists several chemical compound databases:

https://www.sciencedirect.com/science/article/pii/S1740674915000062