RECETOX / galaxytools

Set of Galaxy tool wrappers developed at RECETOX
MIT License

automate obtaining pathways from PathBank #85

Open xtracko opened 3 years ago

xtracko commented 3 years ago

Make a tool which downloads pathways from https://pathbank.org/downloads, filters out the pathways which don't include any compound from our database (reduced PubChem), and produces a pathway-compound pairing.

Most likely, the PathBank-PubChem pairing of compounds could be done by InChI. As of now, PathBank contains 23 compounds with no assigned InChI. For the time being, my wild guess is to ignore them, but please write their PathBank IDs explicitly to the standard output to notify the user. In this phase, don't worry about whether the pairing is correct or not.

The resulting output should be a CSV file with the following columns: recetox_pathway_id, pathbank_id, recetox_cid. The recetox_cid is the foreign key to our compound database, pathbank_id is the original pathway ID (note that it may be null), and recetox_pathway_id is our own generated ID of the pathway, since we might have an aggregated pathway database from multiple sources.
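
To make the requested behaviour concrete, here is a minimal Python sketch of the pairing and output steps. It is only an illustration under assumptions, not the actual tool: the input keys `pathway_id`, `compound_id` and `inchi`, the `inchi_to_cid` lookup, and the sequential integer scheme for `recetox_pathway_id` are all hypothetical, and downloading and parsing the PathBank files is left out entirely.

```python
import csv
import sys
from typing import Dict, Iterable, List, Optional, TextIO, Tuple


def pair_pathways(
    pathbank_rows: Iterable[Dict[str, str]],
    inchi_to_cid: Dict[str, str],
    log: Optional[TextIO] = None,
) -> List[Tuple[int, str, str]]:
    """Return (recetox_pathway_id, pathbank_id, recetox_cid) rows.

    pathbank_rows: one dict per pathway-compound record, with the
        hypothetical keys "pathway_id", "compound_id" and "inchi".
    inchi_to_cid: hypothetical lookup from InChI to recetox_cid.
    """
    log = log if log is not None else sys.stdout
    pathway_ids: Dict[str, int] = {}  # pathbank_id -> generated ID
    seen = set()                      # avoid duplicate (pathway, cid) pairs
    pairs: List[Tuple[int, str, str]] = []

    for row in pathbank_rows:
        inchi = (row.get("inchi") or "").strip()
        if not inchi:
            # Compounds without an InChI are ignored but reported.
            print(f"No InChI, skipping compound {row['compound_id']}", file=log)
            continue

        cid = inchi_to_cid.get(inchi)
        if cid is None:
            continue  # compound is not in our database

        pathbank_id = row["pathway_id"]
        # An ID is generated only once a pathway has a matching compound,
        # so pathways without any match never reach the output.
        recetox_pathway_id = pathway_ids.setdefault(
            pathbank_id, len(pathway_ids) + 1
        )
        if (pathbank_id, cid) not in seen:
            seen.add((pathbank_id, cid))
            pairs.append((recetox_pathway_id, pathbank_id, cid))
    return pairs


def write_pairs(pairs: Iterable[Tuple[int, str, str]], out: TextIO) -> None:
    """Write the pairing as CSV with the three requested columns."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["recetox_pathway_id", "pathbank_id", "recetox_cid"])
    writer.writerows(pairs)
```

Matching here is by exact InChI string equality, which fits the "don't worry about correctness yet" scope of this phase; any normalisation of the InChI strings would be a later refinement.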

ElliottJP commented 3 years ago

I've not looked into PathBank, but I would only need human pathways. Major alternatives (complementary databases) are: Reactome (https://reactome.org/); ConsensusPathDB (http://cpdb.molgen.mpg.de/); RaMP-DB (https://github.com/Mathelab/RaMP-DB)

hechth commented 3 years ago

@xtracko @ElliottJP What is the current state of this?