MassBank / RMassBank

Playground for experiments on the official http://bioconductor.org/packages/devel/bioc/html/RMassBank.html

Add SPLASHing to record generation #136

Closed sneumann closed 8 years ago

sneumann commented 8 years ago

Hi, to help the adoption of SPLASH, we can either 1) use the R package splashR from https://github.com/berlinguyinca/spectra-hash/tree/master/splashR or 2) do it the dirty way and copy https://github.com/berlinguyinca/spectra-hash/blob/master/splashR/R/getSplash.R into RMassBank/R. Option 1) adds a dependency that is not (yet) in BioC; option 2) adds a (low) maintenance overhead in case of changes to getSplash.R.

Question: how to add the SPLASH to a MB record ?

COMMENT: splash10-00zj890000-2aff56dd047fb3dedfff

Alternatively, I could imagine

PK$SPLASH: splash10-00zj890000-2aff56dd047fb3dedfff

but that would require changing the format ... For now, we go with COMMENT.

schymane commented 8 years ago

Great, I agree with all that esp. re option (2) and COMMENT field. What about retro-fitting current MassBank records? Can we use Erik’s parser to add SPLASH comment fields to existing records and update all MassBank? Ideally massbank.jp and massbank.eu.

Another thought: if these are a CH$LINK field, we can even make this hyperlink to MoNA records, similar to:

CH$LINK: PUBCHEM CID:343616 (hyperlinked to http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=343616)

The SPLASH link could then become:

CH$LINK: SPLASH: splash10-1zuc000000-0a09603890001265dcbc (hyperlinked to http://mona.fiehnlab.ucdavis.edu/#/spectra/splash/splash10-1zuc000000-87d07ddd2ed24b9598d7)

This seems even cooler to me, right?

sneumann commented 8 years ago

Hi Erik, could you try to read and write all of http://www.massbank.jp/SVN/OpenData/record and, for each file, check with e.g. diff or svn diff that really only one line has changed? Please script this, as we might have more iterations later when the manuscript is accepted/modified.
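A minimal sketch of such a check, written in Python rather than with the diff or svn diff tooling mentioned above. The function name is hypothetical, and the strict comparison (line endings count as differences) is an assumption about what "only one line has changed" should mean here.

```python
from typing import Optional


def single_added_line(original: str, rewritten: str) -> Optional[str]:
    """Return the one line that was added to `rewritten`, or None.

    None is returned whenever the two texts differ by anything other
    than exactly one inserted line. Line endings are kept during the
    comparison, so an LF -> CRLF change counts as a difference.
    """
    old = original.splitlines(keepends=True)
    new = rewritten.splitlines(keepends=True)
    if len(new) != len(old) + 1:
        return None
    # Walk forward to the first position where the texts disagree.
    i = 0
    while i < len(old) and old[i] == new[i]:
        i += 1
    # Everything after the candidate line must match the original again.
    if old[i:] == new[i + 1:]:
        return new[i].rstrip("\r\n")
    return None
```

Looping this over every original/rewritten file pair and collecting the files that return None would give the list of records that need a manual look.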

Hi Emma, yes, COMMENT was just a straw man. Linking out would be great. But why only MoNA? Why not GNPS/HMDB? Or link to a Google search? If Google, query the full SPLASH or just up to the histogram (or two separate links)? CH$LINK is a slight abuse of the CH section. PK$LINK would be interesting for that in the future. For now I'd propose to add CH$LINK and link to Google with the full SPLASH.

schymane commented 8 years ago

https://www.google.ch/search?q=splash10-1zuc000000-87d07ddd2ed24b9598d7 works for me. PK$LINK doesn’t exist and I doubt there’ll be another MassBank record format? As far as I’m aware the CH$LINK entries are the only ones we can hyperlink.
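The two link variants discussed above (full SPLASH, or only up to the histogram) could be built roughly as follows. This is a sketch under assumptions: the function name is hypothetical, and it assumes that the last dash-separated block of a SPLASH is the hash and that everything before it runs up to the histogram block.

```python
from urllib.parse import urlencode

# Search endpoint taken from the link posted above.
GOOGLE_SEARCH = "https://www.google.ch/search"


def splash_search_links(splash: str) -> dict:
    """Build Google search URLs for a full SPLASH and for its prefix.

    Assumption: the final dash-separated block is the hash, so dropping
    it leaves the part of the SPLASH "up to the histogram".
    """
    prefix = splash.rsplit("-", 1)[0]
    return {
        "full": GOOGLE_SEARCH + "?" + urlencode({"q": splash}),
        "histogram": GOOGLE_SEARCH + "?" + urlencode({"q": prefix}),
    }
```

Searching for the prefix alone would be expected to return more, and less specific, hits than the full SPLASH, which is the trade-off behind offering two separate links.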

sneumann commented 8 years ago

While checking how to add that linking to CH$LINK, I found that we have MS\$RELATED_MS: PREVIOUS_SPECTRUM', 'Dispatcher.jsp?type=disp&id=%s

in https://github.com/MassBank/MassBank-web/blob/81d0039a1af66c031db3cc1dadeea828c4eecf09/modules/apache/htdocs/MassBank/cgi-bin/Disp.cgi#L54 and https://github.com/MassBank/MassBank-web/blob/81d0039a1af66c031db3cc1dadeea828c4eecf09/modules/apache/htdocs/MassBank/cgi-bin/Disp.cgi#L616

which link to related spectra. So far this was intended for ion trap spectral trees (I could not find an example, so it was never used), but I think we can safely add MS$RELATED_MS: SPLASH and the above google link. And I also know where to change MassBank-web :-)

schymane commented 8 years ago

Sounds good. It’s not a compulsory field, so should get through validation without any trouble.

sneumann commented 8 years ago

Done in ae608736dadec8fcc2691f9f3a0ebab3fed13b27 What is also cool is that with MassBank/MassBank-web@62a679ff2f13aa3ea35bed1bec7750fcef015212 MassBank can now display and link to a google search (see status at the bottom in screenshot).

[Image: screenshot from 2015-12-18 13 51 34]

ermueller commented 8 years ago

> Hi Erik, could you try to read and write all of http://www.massbank.jp/SVN/OpenData/record and for each file check with e.g. diff or svn diff that really only one line has changed?

Will do that. My laptop is broken right now though, so I can't do it today and likely not tomorrow :(

EDIT: Also, I can't write since I don't have the svn password. I'll have the records on the IPB server at the very least.

tsufz commented 8 years ago

IMHO, PK$LINK is not touched by the validator. We already have undocumented links in this section, e.g. the InChIKey.

tsufz commented 8 years ago

I will add the dispatcher settings of MB.eu to the repo.

tsufz commented 8 years ago

Use with caution, the UFZ records are not updated properly. They still include the CC BY SA license.

ermueller commented 8 years ago

> IMHO, PK$LINK is not touched by the validator. We already have undocumented links in this section, e.g. the InChIKey.

That is in CH$LINK, isn't it?

sneumann commented 8 years ago

Hi,

I use MS$RELATED_MS which is also part of the specification.

Yours Steffen


tsufz commented 8 years ago

yes

tsufz commented 8 years ago

> I use MS$RELATED_MS which is also part of the specification.

I am not sure whether the dispatcher touches this field as well, or only CH$LINK?

uchem-massbank commented 8 years ago

If someone can send an example record with both fields we can upload and just try it out! Easy enough to delete one record again...can add to one of the tentative spectra databases.

sneumann commented 8 years ago

Hi, just check the screenshot in the GitHub issue I sent around in the first mail; the link is there. Regards, Steffen


ermueller commented 8 years ago

Where in the specification (www.massbank.jp/manuals/MassBankRecord_en.pdf) is MS$RELATED_MS? I can't find it.

That isn't in v1, is it? EDIT: I mean, you looked at the specifications for version 2.0, right?

EDIT2: @sneumann Looking at the code you linked, MS$RELATED_MS should be used for internal linking between MassBank records. I honestly don't think it's a good idea to use it in any other way... And I also don't feel that tags outside of the official specification should be used?

uchem-massbank commented 8 years ago

I agree with Erik, these ideas were discussed after that first mail? For an upload to work we need the various fields present in one example record, preferably delivered as a recdata.zip for easy upload, and the dispatcher needs to be updated to see if we can link any or all of the fields. Once all criteria are met I'm happy to try an upload...


schymane commented 8 years ago

Oh, hang on, I misunderstood. Was the related MS field just to define the merged spectra, and hence a special case?


ermueller commented 8 years ago

Just for completeness' sake: I'm currently writing a function that adds whatever kind of line you want to the appropriate section of a record (according to "COMMENT", "CH", "AC" or "MS"), just in case something similar to this crops up in the future. Also, we'll be able to use it for this.
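To illustrate the idea, here is a rough Python sketch. It is not the function described above; the name, the signature and the rule "insert after the last existing line of that section" are all assumptions made for the example.

```python
def add_record_line(lines, section, new_line):
    """Return a copy of `lines` with `new_line` added to `section`.

    `section` is one of "COMMENT", "CH", "AC" or "MS". The new line is
    placed directly after the last existing line of that section
    (an assumed placement rule for this sketch).
    """
    if section not in ("COMMENT", "CH", "AC", "MS"):
        raise ValueError("unknown section: " + section)
    # COMMENT lines start with "COMMENT:", the others with e.g. "CH$".
    prefix = "COMMENT:" if section == "COMMENT" else section + "$"
    last = None
    for i, line in enumerate(lines):
        if line.startswith(prefix):
            last = i
    if last is None:
        raise ValueError("record has no " + section + " lines")
    return lines[: last + 1] + [new_line] + lines[last + 1:]
```

For a record read as a list of lines, `add_record_line(record, "COMMENT", "COMMENT: splash10-00zj890000-2aff56dd047fb3dedfff")` would place the SPLASH after the existing COMMENT lines and leave every other line untouched.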

ermueller commented 8 years ago

So, I added the SPLASHes to the OpenData files, and it works very well :) I did not push to the repo, because we're still discussing things here and I don't have the password anyway.

So, can I just send one of these records (as a zip) to you, @uchem-massbank @meowcat @schymane? I'm not sure, since they aren't "ours" per se.

EDIT: Eawag records would be fine though, I reckon?

One thing to note: reading and rewriting the records changed the line endings of a few records from LF to CRLF (it seems some people generated these under Linux). Is that a problem?
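A small sketch for spotting and undoing such line-ending changes, so that they do not show up as a difference on every line. Both function names are hypothetical, and records are treated as raw bytes so that nothing is converted silently while reading.

```python
def line_ending_style(data: bytes) -> str:
    """Classify the line endings of a file read in binary mode."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf  # bare LF, not part of a CRLF pair
    if crlf and lf:
        return "mixed"
    if crlf:
        return "CRLF"
    if lf:
        return "LF"
    return "none"


def to_lf(data: bytes) -> bytes:
    """Convert CRLF line endings back to LF."""
    return data.replace(b"\r\n", b"\n")
```

Comparing `line_ending_style` before and after rewriting a record shows whether the rewrite altered the line endings; applying `to_lf` to the rewritten bytes restores LF where that is the case.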

tsufz commented 8 years ago

Would be nice to get them as well!

We never tried to update the repo by committing.

The sync service between massbank.eu and massbank.jp is based on an svn service. As far as we understand it, massbank.eu commits its local DBs to massbank.jp and clones those from massbank.jp into local DBs. However, an sql comment is always delivered as well, so that the worker can update the databases.

Therefore, a pure commit might not be successful.

schymane commented 8 years ago

Yes sure, send to Emma or Tobias, either of us can try ;)

