Open helenamrusso opened 2 months ago
Thanks! Will take a look and let you know
On Sat, May 11, 2024 at 10:30 AM helenamrusso @.***> wrote:
I would like to report what I believe to be a bug in the metabolomics spectrum resolver. I’m using it to retrieve the cosine similarity of a list of USIs, which overall has been working well and is providing me what I need. However, today I noticed cases in which the JSON file does not show the correct cosine similarity.
Example: Dash interface - cos 0.7962, and is indeed a good match: https://metabolomics-usi.gnps2.org/dashinterface/?usi1=mzspec:MSV000085142:vehicle_LI_C_Sept_m2:scan:137&usi2=mzspec:GNPS:TASK-2f93c302650d4d928740b85da2aca965-spectra/specs_ms.mgf:scan:106&width=10.0&height=6.0&mz_min=None&mz_max=None&max_intensity=125&annotate_precision=4&annotation_rotation=90&cosine=standard&fragment_mz_tolerance=0.1&grid=False
JSON - cos 0.01299: https://metabolomics-usi.gnps2.org/json/mirror/?usi1=mzspec:MSV000085142:vehicle_LI_C_Sept_m2:scan:137&usi2=mzspec:GNPS:TASK-2f93c302650d4d928740b85da2aca965-spectra/specs_ms.mgf:scan:106&width=10.0&height=6.0&mz_min=None&mz_max=None&max_intensity=125&annotate_precision=4&annotation_rotation=90&cosine=standard&fragment_mz_tolerance=0.1&grid=True&annotate_peaks=%5B%5B95.08549499511719%5D%2C%20%5B%5D%5D
I manually checked many, and overall these values match exactly. But with big lists, I’m wondering how many will be an example like this one.
I did more investigation into this issue and I have some more information.
Please consider this USI as an example: mzspec:MSV000085142:vehicle_LI_C_Sept_m2:scan:137
In the web interface, the precmz is 188.1761; in the JSON file, the precmz is 709.1234.
I checked this dataset in MassIVE and filtered for the filename (https://massive.ucsd.edu/ProteoSAFe/dataset_files.jsp?task=a1375e1eca11456f9bed4b71c3f12f8d#%7B%22table_sort_history%22%3A%22main.collection_asc%22%2C%22main.file_descriptor_input%22%3A%22vehicle_LI_C_Sept_m2%22%7D), and there are two files with the same filename but in different folders (one with negative-mode data and the other with positive-mode data).
I downloaded both files and inspected scan 137. In positive mode: m/z 188.1761. In negative mode: m/z 709.1235.
Therefore, in this case, the Dash interface is showing the positive-mode data and the JSON is showing the negative-mode data.
PS: As background, I got this USI (mzspec:MSV000085142:vehicle_LI_C_Sept_m2:scan:137) from fastMASST searches, and the fastMASST result points to this USI with a precmz of 188.
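For anyone checking a large USI list for the same problem, a minimal sketch of how such filename collisions could be flagged from a dataset file listing. The function name `find_ambiguous_runs`, the example folder layout, and the `.mzXML` extension are assumptions for illustration and are not part of the resolver.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def find_ambiguous_runs(paths):
    """Group file paths by run name (file name without extension) and
    return only the run names that appear in more than one location."""
    by_run = defaultdict(list)
    for path in paths:
        by_run[PurePosixPath(path).stem].append(path)
    return {run: sorted(found) for run, found in by_run.items() if len(found) > 1}


# Hypothetical listing: the same run name stored in two different folders.
files = [
    "pos-mzXML/vehicle_LI_C_Sept_m2.mzXML",
    "LI carnitine treatment_Yiming/neg-mzXML/vehicle_LI_C_Sept_m2.mzXML",
    "pos-mzXML/another_run.mzXML",
]

ambiguous = find_ambiguous_runs(files)
for run, locations in ambiguous.items():
    print(f"{run} is ambiguous: {locations}")
```

Any USI whose run name shows up in this result could resolve to either file, so its cosine values would be worth checking by hand.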
Thanks for the detailed investigation. This is an interesting edge case. The USI standard details how to distinguish multiple runs with the same file name in a single dataset, using the subfolder mechanism in section 3.6.1.
So in this case, the unique USIs would be:
mzspec:MSV000085142:[pos-mzXML]vehicle_LI_C_Sept_m2:scan:137
mzspec:MSV000085142:[LI carnitine treatment_Yiming/neg-mzXML]vehicle_LI_C_Sept_m2:scan:137
However, our resolver doesn't seem to support this format, nor does the general MassIVE resolver. It does seem to return all matching files though.
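As a rough illustration of what supporting this format could involve, here is a sketch that splits a USI into its parts and pulls out the optional bracketed subfolder. It is not the resolver's actual code: `ParsedUsi` and `parse_usi` are hypothetical names, and it assumes the run name contains no colons and that any trailing interpretation field can be ignored.

```python
import re
from typing import NamedTuple, Optional


class ParsedUsi(NamedTuple):
    collection: str
    subfolder: Optional[str]  # None when no [subfolder] prefix is present
    run_name: str
    index_type: str
    index: str


# Optional "[subfolder]" prefix followed by the run name.
_RUN_RE = re.compile(r"^(?:\[(?P<subfolder>[^\]]+)\])?(?P<name>.+)$")


def parse_usi(usi: str) -> ParsedUsi:
    """Split a USI on colons (assumption: the run name has no colons)."""
    parts = usi.split(":")
    if len(parts) < 5 or parts[0] != "mzspec":
        raise ValueError(f"Not a recognizable USI: {usi!r}")
    _, collection, run, index_type, index = parts[:5]
    match = _RUN_RE.match(run)
    if match is None:
        raise ValueError(f"Empty run name in USI: {usi!r}")
    return ParsedUsi(
        collection, match.group("subfolder"), match.group("name"), index_type, index
    )


pos = parse_usi("mzspec:MSV000085142:[pos-mzXML]vehicle_LI_C_Sept_m2:scan:137")
neg = parse_usi(
    "mzspec:MSV000085142:[LI carnitine treatment_Yiming/neg-mzXML]"
    "vehicle_LI_C_Sept_m2:scan:137"
)
plain = parse_usi("mzspec:MSV000085142:vehicle_LI_C_Sept_m2:scan:137")
```

With the subfolder separated out, a resolver could filter the matching files by folder when the prefix is given, and report an ambiguity when it is absent and more than one file matches.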
So it seems like the solution must be two-fold:
And maybe: