Open dprada opened 3 years ago
We should turn off the option to extract a reliable name for entities. It is problematic right now. Let's keep it turned off until Sabueso can solve this problem.
It would be convenient to explore these websites:
DrugBank is an annotated database of drug and drug target information. It contains extensive data on the nomenclature, ontology, chemistry, structure, function, action, pharmacology, pharmacokinetics, metabolism, and pharmaceutical properties of both small molecule and large molecule (biotech) drugs.
Ligand Expo contains chemical and structural information about small molecules within the structure entries of the Protein Data Bank.
https://www.ebi.ac.uk/pdbe-srv/pdbechem/ PDBeChem provides the chemical components (ligands, small molecules, and monomers) referred to in PDB entries.
. . .
Drug- and compound-centered analyses often need to map attributes from multiple drug databases, and various groups have faced this problem.
CompoundDB4j is a Neo4j repository that integrates two of the most prominent open-source drug databases, DrugBank and ChEMBL, with the goal of establishing an integrated data visualization and analysis tool for drug discovery studies (https://github.com/ambf0632/CompoundDB4j).
This very interesting article, although from 2015, lists several chemical compound databases:
https://www.sciencedirect.com/science/article/pii/S1740674915000062
MolSysMT has to give a name to proteins, small molecules, and entities. These names can be extracted from an MMTF file, but what happens with other forms? We probably need to implement in Sabueso the tools to retrieve these names, together with other attributes, from files and databases.
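To make the idea concrete, here is a rough sketch of how such a lookup could fall back from one source to the next. Nothing here is existing MolSysMT or Sabueso API: `resolve_entity_name`, `NameSource`, and the two dictionaries are hypothetical names invented for illustration, and the dictionaries stand in for "names read from a file" and "names fetched from a database".

```python
from typing import Callable, Optional, Sequence

# A name source maps a component identifier (for example a PDB chemical
# component code) to a name, or returns None when it has nothing to offer.
NameSource = Callable[[str], Optional[str]]


def resolve_entity_name(component_id: str,
                        sources: Sequence[NameSource]) -> Optional[str]:
    """Return the first non-blank name offered by the sources, in order.

    Returns None when no source knows the identifier, so the caller can
    leave the entity unnamed instead of guessing.
    """
    for source in sources:
        name = source(component_id)
        if name and name.strip():
            return name.strip()
    return None


# Hypothetical stand-ins for real sources (assumptions, not real data feeds):
# names found in the input file, and a small local table used as a fallback.
names_from_file = {"ATP": "ADENOSINE-5'-TRIPHOSPHATE"}
names_from_table = {"HOH": "water", "ATP": "ATP"}

sources = [names_from_file.get, names_from_table.get]

print(resolve_entity_name("ATP", sources))  # name taken from the file
print(resolve_entity_name("HOH", sources))  # falls back to the table
print(resolve_entity_name("XXX", sources))  # unknown everywhere: None
```

Returning `None` rather than a guessed name matches the proposal above: if no reliable name is available, the entity simply stays unnamed until a better source exists.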