mkuhn / sider

SIDER – Side Effect Resource
Creative Commons Zero v1.0 Universal

ERRORs in CID identifiers #10

Open Jordi-Valls opened 2 years ago

Jordi-Valls commented 2 years ago

Dear SIDER community,

I am trying to retrieve the InChIKeys of compounds using the STITCH database. I have read in the documentation that the SIDER IDs correspond to STITCH IDs ("flat/stereo"). However, when I download the chemical.inchikeys file from the STITCH database (http://stitch.embl.de/cgi/download.pl?UserId=iUMs7fqvlJcS&sessionId=AZh6cVRwv0Lv), the flat and stereo IDs do not match those reported in SIDER. In STITCH, the flat IDs start with the prefix CIDm and the stereo IDs with CIDs, whereas the SIDER IDs start with CID followed directly by digits. In addition, the first CID reported by SIDER, CID100000085, corresponds to carnitine, but this ID is not included in STITCH, so I cannot find its InChIKey.

Is there any way to obtain the correct IDs from SIDER and STITCH?

Thanks

raphaelobinna commented 2 months ago

I think it's been a while since SIDER updated their data. To recover InChIKeys, try using the compound names and calling PubChem's PUG REST. The CID you posted matches carnitine. With PubChem you can make a GET request to https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/carnitine.
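As a rough sketch of the name-based lookup, something like the following could work. It assumes PUG REST's `property/InChIKey/JSON` operation and its `PropertyTable` / `Properties` response layout; the function names (`inchikey_url`, `extract_inchikeys`, `fetch_inchikeys`) are made up for this example, and only the standard library is used.

```python
import json
import urllib.parse
import urllib.request

PUG_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def inchikey_url(name):
    """Build a PUG REST URL asking for the InChIKey of a compound name."""
    # Percent-encode the name so spaces and slashes do not break the path.
    quoted = urllib.parse.quote(name, safe="")
    return f"{PUG_BASE}/compound/name/{quoted}/property/InChIKey/JSON"


def extract_inchikeys(payload):
    """Map PubChem CID -> InChIKey from a decoded JSON response.

    Assumes the layout {"PropertyTable": {"Properties": [{"CID": ..., "InChIKey": ...}]}}.
    """
    rows = payload.get("PropertyTable", {}).get("Properties", [])
    return {row["CID"]: row["InChIKey"] for row in rows if "InChIKey" in row}


def fetch_inchikeys(name, timeout=30):
    """Perform the GET request (needs network access) and parse the result."""
    with urllib.request.urlopen(inchikey_url(name), timeout=timeout) as resp:
        return extract_inchikeys(json.load(resp))
```

Calling `fetch_inchikeys("carnitine")` should then return a small dictionary keyed by PubChem CID. A name can match more than one compound, so it is worth checking which CID you actually want before using the key.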

To read more about PubChem's PUG REST and more streamlined use cases, check their website.
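On the ID mismatch itself, here is a small sketch based on an assumption that should be checked against the SIDER and STITCH README files: that SIDER uses the older STITCH convention, where `CID1` followed by eight digits marks a flat compound, `CID0` followed by eight digits marks a stereo-specific one, and the eight digits are the zero-padded PubChem CID. Under that reading, CID100000085 would be flat compound 85. The helper name `sider_to_pubchem_cid` is invented for this example.

```python
import re

# Assumed layout: "CID", then 1 (flat) or 0 (stereo), then an 8-digit PubChem CID.
_SIDER_ID = re.compile(r"^CID([01])(\d{8})$")


def sider_to_pubchem_cid(sider_id):
    """Split a SIDER-style compound ID into (kind, PubChem CID).

    Raises ValueError if the ID does not follow the assumed layout.
    """
    match = _SIDER_ID.match(sider_id.strip())
    if match is None:
        raise ValueError(f"unexpected compound ID format: {sider_id!r}")
    kind = "flat" if match.group(1) == "1" else "stereo"
    return kind, int(match.group(2))
```

If the assumption holds, the resulting number could be compared with the numeric part of the CIDm (flat) or CIDs (stereo) IDs in the STITCH download, or passed straight to PubChem as a CID.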