PRIDE-Archive / px-submission-tool

ProteomeXchange data submission tool

"title" field required in mzTab files #48

Closed bittremieux closed 6 years ago

bittremieux commented 6 years ago

Validation of mzTab files fails if the title field is missing in the metadata section (for a complete submission). The mzTab specification indicates that this is an optional field though.

The log contains the following message:

2018-05-16 14:30:09,305 ERROR [pool-1-thread-9] u.a.e.p.d.m.m.DefaultMzTabSectionValidator [DefaultMzTabSectionValidator.java:35] Missing title in mzTab metadata section!
2018-05-16 14:30:09,305 ERROR [pool-1-thread-9] u.a.e.p.d.m.p.MzTabParser [MzTabParser.java:118] An error occurred while parsing a section of the mzTab file, 'The current subproduct DOES NOT VALIDATE'
2018-05-16 14:30:09,305 ERROR [pool-1-thread-9] u.a.e.p.g.t.FileScanAndValidationTask [FileScanAndValidationTask.java:366] Invalid mzTab file 'annsolo_oms_b1906_293T_proteinID_01A_QE3_122212.mztab', MAIN ERROR: 'The current subproduct DOES NOT VALIDATE'. PLEASE, REFER TO LOG FILES FOR MORE DETAILED INFORMATION

Progress from step 3 (add files) to step 4 (relationship between files) is impossible if the mzTab files don't contain the title field. Instead a red pop-up is shown with a list of the mzTab files that are supposedly invalid.
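A pre-check along these lines could catch the problem before the files are loaded into the submission tool. This is only a minimal sketch, not the tool's own validator: the function name `missing_metadata_keys` and the example content are hypothetical, and the only thing assumed about the format is that mzTab metadata lines are tab-separated as `MTD`, key, value.

```python
import io


def missing_metadata_keys(lines, required=("title",)):
    """Return the required MTD keys that do not appear in the given mzTab lines."""
    found = set()
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        # Metadata lines have the form: MTD <tab> key <tab> value
        if len(fields) >= 3 and fields[0] == "MTD":
            found.add(fields[1])
    return [key for key in required if key not in found]


# Hypothetical metadata section without a title or an mzTab-ID.
example = io.StringIO(
    "MTD\tmzTab-version\t1.0.0\n"
    "MTD\tmzTab-mode\tSummary\n"
    "MTD\tmzTab-type\tIdentification\n"
    "MTD\tdescription\tSpectral library search results\n"
)
print(missing_metadata_keys(example, required=("title", "mzTab-ID")))
# prints ['title', 'mzTab-ID']
```

For a real file, an open file handle can be passed instead of the `io.StringIO` object.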

Tobias-Ternent commented 6 years ago

Hi @bittremieux,

Yes, the title field is optional according to the mzTab spec, but so is a lot of other information too. Generally speaking, it's a very flexible format.

In order to make a 'complete' submission to PRIDE, we require that certain information is provided, which includes a title.

ypriverol commented 6 years ago

@bittremieux Going forward on this. The title is also mandatory in mzTab. Thanks a lot for this effort, and please let us know if you need any other help. For us it is really important that we move in the mzTab direction. We are not doing anything with the title, so you can provide whatever is most convenient for you.

bittremieux commented 6 years ago

@ypriverol Based on the mzTab v1.0.0 specification linked on the PSI website it seems to be optional though (p. 10)?

I'm not debating the usefulness of having a title. It's just that I'm exporting my own mzTab files, and as I tried to follow the mzTab standard I was a bit surprised that the submission tool initially didn't accept my mzTab files as the requirement of having a title isn't specified in the mzTab specification.

Similarly the submission tool gives a warning if the mzTab-ID field isn't present, although it does let me submit the mzTab files without that one.

Anyway, I'll make sure my tool includes a title field so its mzTab output is fully supported. Thanks for the feedback.
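As an illustration of what that could look like on the export side, here is a minimal sketch that adds a title line when none is present. The helper name `ensure_title` is hypothetical, and placing the new line after the last existing `MTD` line is an assumption made for the example rather than something taken from the specification or the submission tool.

```python
def ensure_title(mztab_text, title):
    """Return the mzTab text with an 'MTD title' line added if it has none."""
    lines = mztab_text.splitlines()
    # Leave the text untouched if a title line already exists.
    if any(line.split("\t")[:2] == ["MTD", "title"] for line in lines):
        return mztab_text
    # Assumption: append the title after the last existing metadata line.
    mtd_indexes = [i for i, line in enumerate(lines) if line.startswith("MTD\t")]
    insert_at = mtd_indexes[-1] + 1 if mtd_indexes else 0
    lines.insert(insert_at, f"MTD\ttitle\t{title}")
    return "\n".join(lines) + "\n"


# Hypothetical export without a title.
exported = (
    "MTD\tmzTab-version\t1.0.0\n"
    "MTD\tdescription\tSpectral library search results\n"
    "\n"
    "PSH\tsequence\tPSM_ID\n"
)
print(ensure_title(exported, "Spectral library search of a HEK293 sample"))
```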

ypriverol commented 6 years ago

thanks for your understanding!!!

ypriverol commented 6 years ago

Hi @bittremieux, we were processing your dataset and realised that it is an mzTab submission with only PSMs, based on a spectral library search. The first implementation of mzTab in PRIDE forces users to also submit the protein section.

I know you don’t have that information, and our curators will probably get back to you about making a partial submission. We will discuss this use case within PRIDE and ProteomeXchange and see whether we can soon support complete submissions with only peptide information.

In the meantime, if you have a downstream analysis that maps those peptides to proteins, you can submit those outputs.

Regards Yasset

bittremieux commented 6 years ago

@ypriverol Indeed, I don't have a direct link to proteins because I'm doing a spectral library search. De novo sequencing would similarly not have any direct protein information, and even though these aren't the most prevalent modes of identification, it might be useful to support those kinds of submissions as well.

At the moment I'm not doing any downstream analysis. Attilla already helped me out and converted my submission to a partial submission.

Thanks for looking into this stuff and for the speedy resolution. 👍
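For the PSM-only situation described in this exchange, a quick way to see which sections an mzTab file actually contains is to count rows by their line prefix. This is a rough sketch under the assumption that data rows start with the mzTab prefixes `PRT`, `PEP`, `PSM`, and `SML`; the function names are hypothetical and this is not how PRIDE decides between complete and partial submissions.

```python
from collections import Counter

# Data-row prefixes mapped to readable section names.
SECTION_PREFIXES = {
    "PRT": "protein",
    "PEP": "peptide",
    "PSM": "PSM",
    "SML": "small molecule",
}


def count_section_rows(lines):
    """Count data rows per mzTab section, based on each line's prefix."""
    counts = Counter()
    for line in lines:
        prefix = line.rstrip("\r\n").split("\t", 1)[0]
        if prefix in SECTION_PREFIXES:
            counts[SECTION_PREFIXES[prefix]] += 1
    return counts


def is_psm_only(lines):
    """True if the lines contain PSM rows but no protein rows."""
    counts = count_section_rows(lines)
    return counts["PSM"] > 0 and counts["protein"] == 0


# Hypothetical file content with a PSM section only.
example = [
    "MTD\tmzTab-version\t1.0.0\n",
    "PSH\tsequence\tPSM_ID\n",
    "PSM\tPEPTIDEK\t1\n",
    "PSM\tANOTHERPEPTIDER\t2\n",
]
print(is_psm_only(example))  # prints True
```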

ypriverol commented 6 years ago

I will take this submission as an example of the new use cases so we can move them forward. I will keep you updated.