jacobwindsor / pubchem-ranker

Ranks compounds by number of BioAssays or BioSystems in PubChem
MIT License

Data set manipulation #11

Open DeniseSl22 opened 7 years ago

DeniseSl22 commented 7 years ago

Hi Jacob/Egon,

The dataset from de Lacy Costello was manipulated first, by checking which metabolites were already in HMDB and/or ChEBI. Perhaps we should include a similar feature, where the data is compared with Wikidata, to find out whether certain compounds have already been entered there?
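A rough sketch of what such a check could look like, assuming the compounds are identified by PubChem CID and that Wikidata is queried through its public SPARQL endpoint using the PubChem CID property (P662). The function names and the User-Agent string are made up for illustration and are not part of pubchem-ranker:

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata SPARQL endpoint.
WIKIDATA_SPARQL = "https://query.wikidata.org/sparql"


def build_query(cids):
    """Build a SPARQL query that looks up Wikidata items by PubChem CID (P662)."""
    # int() rejects anything that is not a plain number before it reaches the query.
    values = " ".join(f'"{int(cid)}"' for cid in cids)
    return (
        "SELECT ?cid ?item WHERE { "
        f"VALUES ?cid {{ {values} }} "
        "?item wdt:P662 ?cid . }"
    )


def parse_bindings(response):
    """Map each CID found in a SPARQL JSON response to its Wikidata item URI."""
    found = {}
    for row in response["results"]["bindings"]:
        found[row["cid"]["value"]] = row["item"]["value"]
    return found


def split_known_unknown(cids, found):
    """Split the CIDs into those already in Wikidata and those that are not."""
    known = [cid for cid in cids if str(cid) in found]
    unknown = [cid for cid in cids if str(cid) not in found]
    return known, unknown


def fetch_wikidata_items(cids):
    """Query Wikidata over the network (hypothetical helper, needs internet access)."""
    params = urllib.parse.urlencode({"query": build_query(cids), "format": "json"})
    request = urllib.request.Request(
        WIKIDATA_SPARQL + "?" + params,
        # Placeholder User-Agent; a real tool should identify itself properly.
        headers={"User-Agent": "pubchem-ranker-sketch/0.1"},
    )
    with urllib.request.urlopen(request) as resp:
        return parse_bindings(json.load(resp))
```

Calling `split_known_unknown(cids, fetch_wikidata_items(cids))` would then give the compounds that already have a Wikidata entry and the ones that are still missing. For a long list the CIDs would probably need to be sent in batches.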

DeniseSl22 commented 7 years ago

Perhaps this can be done by including Wikidata + WPs as databases :)? If so, we can merge this issue with the "include other databases" issue.