CCMS-UCSD / GNPS_Workflows

Public Workflows at GNPS
https://gnps.ucsd.edu/

[GC] Creating a GNPS export for MZmine/ADAP #232

Closed robinschmid closed 4 years ago

robinschmid commented 4 years ago

Hey everyone,

What has already been accomplished in MZmine?

And what do I need to do for the ADAP integration? I can easily build an export on Sunday / start of next week.

Please provide sample files and a description of everything that is needed. To make it easy for users, the export will be similar to the GNPS-FBMN export in one module.

@lfnothias @aaksenov1

mwang87 commented 4 years ago

The proposed endpoint for automatic submission will be similar to the feature based one:

gnps-quickstart.ucsd.edu/uploadanalyzegcnetworking

Tracking issue here: https://github.com/mwang87/GNPS_quickstart/issues/11

lfnothias commented 4 years ago

@robinschmid, we could imagine something similar to the GNPS Export/Submit module, but sending 3 files (1 optional) to the GC workflow through the quickstart interface: https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=9f6fd12a260744b29e9440a69bc22e73

I propose the name "Export/Submit to GNPS (GC-MS with ADAP processing)", maybe renaming the other one to "Export/Submit to GNPS (LC-MS/MS in DDA)"

These are the input files used for the format: stilton_job.zip

lfnothias commented 4 years ago

@robinschmid note that the new address for the documentation will be https://ccms-ucsd.github.io/GNPSDocumentation/gc-ms-documentation/

robinschmid commented 4 years ago

@mwang87 great that you can provide this endpoint for direct submission.

I initially created this issue as a first step towards a controlled and easy export for the MZmine GC GNPS workflow. However, adding the direct submission is going to boost this drastically. We will definitely update the module names.

robinschmid commented 4 years ago

@lfnothias Can you provide a test dataset (on MassIVE)?

How do you actually create/pick the Kovats times? I could create something for this, maybe with the help of Ansgar (annexhc, he works on GC modules for MZmine and GC data processing).

lfnothias commented 4 years ago

Yes, check out this page: https://ccms-ucsd.github.io/GNPSDocumentation/gc-ms-deconvolution/

The carbon marker file has to be provided by the user. For the GC processing part, we will rely on the ADAP-GC workflow that we have already tested.

robinschmid commented 4 years ago

The Kovats standard run (mzML) is not uploaded to the MassIVE dataset MSV000084226 used in the documentation. It would be nice to include it, and I need it to enable creation of these files.

lfnothias commented 4 years ago

Currently, users have to build the Kovats table themselves. Do you want to automate that process?

lfnothias commented 4 years ago

Here are some files: https://ucsdcloud-my.sharepoint.com/:f:/g/personal/lnothiasscaglia_ucsd_edu/EonwJcS91DZFgx-HgVD22RYB8qxbNh4agh2D6mOJPMeYwQ?e=YhpVI0 I added a representative batch file. Note that it was processed with ADAP-in-MZmine, version 2.23 I believe: https://github.com/du-lab/ADAP-in-MZmine2

robinschmid commented 4 years ago

Why did you use a special MZmine version? I can also use the current branch, right?

I will try to build something quick for Kovats.

lfnothias commented 4 years ago

At that time, it was the only way to access all the ADAP modules.

robinschmid commented 4 years ago

I have collected some questions and remarks to be 100% sure about the formats, as they are not yet completely described by the documentation.

Quantification table

A question about the file formats. The upper one is the original csv export of MZmine and the lower one is the file that you gave me.

image image

Kovats file

image

MGF file (example that Louis sent)

image

@lfnothias

lfnothias commented 4 years ago

Thanks @robinschmid !

Quantification table:

Kovats:

MGF

robinschmid commented 4 years ago

I got some news.

This is the first test job

https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=7a4fa67fe602484e91d9bb772d112ff2

The MZmine output

20191105_kovats_dro.zip

robinschmid commented 4 years ago

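Does anyone know why the job fails? [image: https://user-images.githubusercontent.com/10366914/68240530-678ba000-000d-11ea-82e8-a4d6d38e9324.png]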

aaksenov1 commented 4 years ago

I wonder if this is because the balance score file is missing. If that's the case, we would need to provide a dummy file for MZmine/MS-DIAL jobs or ask Ming to make that optional. Also, the latest release is 15, so try cloning the latest workflow; maybe that would do it. Here's an example of a job on release 15: https://proteomics2.ucsd.edu/ProteoSAFe/status.jsp?task=5d7f6a60f1a14126bfc8873231f4e7ca

robinschmid commented 4 years ago

Will try that. I have just used the one on the GNPS website.

Here is a video on the Kovats index extraction: https://youtu.be/XodHMJcuwnk

robinschmid commented 4 years ago

It still would not run.

https://proteomics2.ucsd.edu/ProteoSAFe/status.jsp?task=ad46dca041e14f7f9b0c4a069836d199

@mwang87 do we need this balance score file for MZmine / MSDIAL jobs?

lfnothias commented 4 years ago

The MGF can be similar to the one produced by the MGF export in MZmine. For the quant table, it can be the simple table format (enclosed) or the same format as for FBMN. If additional columns should be included, that is also possible, and we will update the parser. MZmine_export_files.zip (https://github.com/CCMS-UCSD/GNPS_Workflows/files/3811878/MZmine_export_files.zip)

aaksenov1 commented 4 years ago

The question is about the balance integrals file that is generated automatically by the MSHub workflow. This would be a question for Ming; I haven't seen what the file looks like yet. Robin, what you should try is to clone and launch this job using your RI markers file: https://proteomics2.ucsd.edu/ProteoSAFe/status.jsp?task=6c1a4ba480134298a5364d61469e1058 The Kovats and DRO inputs that you used were obtained with the protocol of this study.

robinschmid commented 4 years ago

To test the MZmine output, I would need to change all the files. However, there is no way to delete the MGF, quant table, or balance score file.

@lfnothias the export is done - and similar to the FBMN one.

robinschmid commented 4 years ago

See #270 for the latest MZmine export format for testing. Feel free to use it for tests.

Does anyone know what the balance score file is supposed to do?

aaksenov1 commented 4 years ago

OK, after some tinkering, here's a successful job with an MZmine-generated file: https://proteomics2.ucsd.edu/ProteoSAFe/status.jsp?task=0088a7daa8c542c195a6b0968193cf27 Robin, as I mentioned, the commas between names and RT values need to be replaced by semicolons, with no spaces, as in the attached example.
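For anyone who already has a comma-separated marker file, the conversion described above could be scripted roughly as follows. This is only a sketch: the function name `commas_to_semicolons` and the header names `Compound_Name` and `RT_Query` in the sample text are made up for illustration and are not taken from the GNPS parser or the attached example.

```python
import csv
import io


def commas_to_semicolons(text: str) -> str:
    """Rewrite comma-separated rows as semicolon-separated rows without padding spaces."""
    # skipinitialspace drops the space that often follows a comma
    reader = csv.reader(io.StringIO(text), delimiter=",", skipinitialspace=True)
    out = io.StringIO()
    writer = csv.writer(out, delimiter=";", lineterminator="\n")
    for row in reader:
        if not row:
            continue  # skip blank lines
        writer.writerow([cell.strip() for cell in row])
    return out.getvalue()


# Hypothetical sample content (column names are assumptions)
example = "Compound_Name, RT_Query\nC10, 3.52\nC12, 5.10\n"
converted = commas_to_semicolons(example)
print(converted)
```

In practice the text would be read from and written back to the marker file on disk; the in-memory strings here just keep the sketch self-contained.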

lfnothias commented 4 years ago

@aaksenov1 What is the balance score file in that case?

aaksenov1 commented 4 years ago

Automatically generated by the MSHub workflow.

lfnothias commented 4 years ago

@aaksenov1 But that is an MZmine workflow, right? Is the file mandatory for the workflow to run?

mwang87 commented 4 years ago

This file should be ignored by MZmine 2 and other tools. If the workflow fails without it, that is a bug and we will fix it.

robinschmid commented 4 years ago

Then we have an inconsistency with the documentation, which says the Kovats file is a comma-separated CSV.

Is it possible to change the parser on GNPS side?
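To illustrate what a more tolerant parser might look like, here is a rough sketch that accepts either separator. It is not the GNPS parser; the function name `read_marker_table` and the rule of picking whichever separator is more frequent in the first non-empty line are assumptions made for this example.

```python
import csv
import io
from typing import List


def read_marker_table(text: str) -> List[List[str]]:
    """Parse a small marker table that uses either ',' or ';' as separator."""
    # Guess the separator from the first non-empty line (assumed heuristic)
    first_line = next((line for line in text.splitlines() if line.strip()), "")
    delimiter = ";" if first_line.count(";") > first_line.count(",") else ","
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    # Strip padding spaces and drop blank lines
    return [[cell.strip() for cell in row] for row in reader if row]


comma_rows = read_marker_table("name,rt\nC10,3.52\n")
semicolon_rows = read_marker_table("name;rt\nC10;3.52\n")
print(comma_rows == semicolon_rows)
```

With something like this on the server side, the export would not need to care which of the two separators the documentation finally settles on.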

robinschmid commented 4 years ago

Closed as the MZmine export seems to work. See issue #271 for the exception that arises with semicolon-separated Kovats files.

aaksenov1 commented 4 years ago

So, Robin, is the separator now comma or semicolon? Melissa is checking the tutorial to make sure it shows the semicolon separator as in the attached file. The job runs successfully with this one.

robinschmid commented 4 years ago

You just need to specify the separator and implement it in the GNPS workflow, and I will change the export accordingly. See related issues:

#271

#290

The documentation says comma-separated.