CCMS-UCSD / GNPS_Workflows

Public Workflows at GNPS
https://gnps.ucsd.edu/

[feature request] Spectra metadata completeness #747

Closed Adafede closed 1 year ago

Adafede commented 3 years ago

Hi, it seems that you back-calculate some metadata (chemical classes, InChIKey, InChIKey-Planar) from the user input (SMILES only?) for a given spectrum ID, which is really nice! However, these metadata values are not always filled in. They seem to rely mainly on the SMILES, but many spectrum IDs have only an InChI as structural identifier. Some InChIs have their InChIKeys (full and planar) linked to them; others do not.

Would it be possible to do something like:

  1. if(is.null(SMILES)){calculate SMILES(InChI)}
  2. calculate InChIKey+Classification(SMILES)
  3. add 2 additional (or at least 1) metadata fields derived from the SMILES: SMILES-planar, InChI-planar?
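
To make steps 1 and 2 a bit more concrete, here is a rough Python sketch of the fallback logic. It is only an illustration, not the actual GNPS code: the dictionary keys (`SMILES`, `InChI`, `InChIKey`, `InChIKey-Planar`) and the function `fill_structure_metadata` are hypothetical, and the two converters are passed in as plain callables so that any backend (a web API, or a toolkit such as RDKit) could be plugged in. The planar key is taken as the first block of the InChIKey, which is the block that encodes connectivity.

```python
from typing import Callable, Optional


def fill_structure_metadata(
    record: dict,
    inchi_to_smiles: Callable[[str], Optional[str]],
    smiles_to_inchikey: Callable[[str], Optional[str]],
) -> dict:
    """Return a copy of `record` with missing structure fields filled where possible.

    The field names and both converter callables are assumptions made
    for this sketch; they do not mirror the real GNPS schema or API.
    """
    filled = dict(record)
    smiles = (filled.get("SMILES") or "").strip()
    inchi = (filled.get("InChI") or "").strip()

    # Step 1: fall back to the InChI when no SMILES was supplied.
    if not smiles and inchi:
        smiles = inchi_to_smiles(inchi) or ""
        filled["SMILES"] = smiles

    # Step 2: derive the InChIKey from the (possibly recovered) SMILES.
    if smiles and not filled.get("InChIKey"):
        filled["InChIKey"] = smiles_to_inchikey(smiles) or ""

    # The first block of an InChIKey encodes connectivity only,
    # so it can serve as the planar key.
    inchikey = filled.get("InChIKey") or ""
    if inchikey and not filled.get("InChIKey-Planar"):
        filled["InChIKey-Planar"] = inchikey.split("-")[0]

    return filled
```

Records that already carry a SMILES or an InChIKey are left untouched, so the function should be safe to run over every library hit.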

Best,

mwang87 commented 3 years ago

Do you mean in GNPS Networking or in the GNPS Library pages?

Adafede commented 3 years ago

Wow, that was quick!

GNPS Networking: after submitting a job, when looking at the table block=main&file=DB_result

mwang87 commented 3 years ago

Yes, that makes sense. I think it would be pretty straightforward. Do you think you'd be able to submit a quick PR for it? Then we could test it out together.

Best,

Ming

Adafede commented 3 years ago

Huh, will have to dive into some funny parts of GNPS!

I'll do my best and hopefully submit one soon!

Thank you again for the responsiveness!

mwang87 commented 3 years ago

Definitely, it's kind of a mess and a rabbit hole, so apologies. I think the business end of the code you're looking for is here:

https://github.com/CCMS-UCSD/GNPS_Workflows/blob/master/metabolomics-snets-v2/tools/metabolomicsnetsv2/scripts/getGNPS_library_annotations.py#L143

Adafede commented 3 years ago

You were right, more than straightforward! I opened the PR for the SMILES <-> InChI conversion (5b5c044); it was easy thanks to the API!

Do you have a similar service for generating planar SMILES and InChIs, or do you think the existing one could be easily adapted? Otherwise it might need a bit more work!
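
For the planar InChI at least, a service may not be strictly needed. Below is a rough sketch, assuming that "planar" means the InChI without its stereochemistry layers (`/b`, `/t`, `/m`, `/s`). The function name `planar_inchi` is hypothetical, and the result is a plain string edit: I have not checked that it always matches the InChI regenerated from a stereo-free structure. A planar SMILES probably still needs a cheminformatics toolkit (with RDKit, `Chem.RemoveStereochemistry` would be one candidate).

```python
# Layer prefixes that carry stereochemistry in an InChI string:
# /b (double bond), /t (tetrahedral), /m and /s (stereo type flags).
STEREO_LAYER_PREFIXES = ("b", "t", "m", "s")


def planar_inchi(inchi: str) -> str:
    """Drop the stereo layers from an InChI string (string-level sketch only)."""
    # The first segment is the version prefix (e.g. "InChI=1S"); the rest
    # are layers. The formula layer starts with an uppercase letter, so it
    # cannot collide with the lowercase stereo prefixes.
    head, *layers = inchi.strip().split("/")
    kept = [layer for layer in layers if not layer.startswith(STEREO_LAYER_PREFIXES)]
    return "/".join([head, *kept])


if __name__ == "__main__":
    l_alanine = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
    print(planar_inchi(l_alanine))
```

An InChI without stereo layers passes through unchanged, so this could be applied to every record regardless of whether stereochemistry was defined.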