PRIDE-Archive / pride-curation-scripts

Useful PRIDE Pipelines curation scripts

IOException (due to missing <Sample> element in .mzid files ?) #23

Closed germa closed 5 years ago

germa commented 5 years ago

more ./log/validation/1-20190403-PXD013343-RESUB-0001-validation.log

qsdb-3_comprehensive-blib_5ppm.mzid:

Encountered an error executing the step
java.io.IOException: ERROR
    at uk.ac.ebi.pride.submission.AssayFileScanner.findAndMapSummary(AssayFileScanner.java:275)
    at uk.ac.ebi.pride.submission.AssayFileScanner.launchJobAndSubscribe(AssayFileScanner.java:189)
    at uk.ac.ebi.pride.submission.AssayFileScanner.scan(AssayFileScanner.java:117)
    at uk.ac.ebi.pride.curation.tasklet.statistic.AssayFileSummaryItemReader.read(AssayFileSummaryItemReader.java:49)
    at uk.ac.ebi.pride.curation.tasklet.statistic.AssayFileSummaryItemReader.read(AssayFileSummaryItemReader.java:22)

I assume the reason is the missing <Sample> element in the .mzid file.

germa commented 5 years ago

This is a little bit weird, since <Sample> is a required child of the optional <AnalysisSampleCollection> element.
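A minimal sketch of how one might check this in a given file, assuming the .mzid content fits in memory as a string. The function name `sample_status` and its return labels are hypothetical, chosen only for illustration; this is not part of the PRIDE pipeline.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    # Drop an XML namespace prefix of the form "{namespace-uri}" if present.
    return tag.rsplit("}", 1)[-1]


def sample_status(mzid_xml):
    """Classify a document by its AnalysisSampleCollection / Sample structure.

    Returns "no collection", "collection without Sample", or "ok".
    """
    root = ET.fromstring(mzid_xml)
    collections = [
        element for element in root.iter()
        if local_name(element.tag) == "AnalysisSampleCollection"
    ]
    if not collections:
        # The collection is optional, so its absence is not an error by itself.
        return "no collection"
    for collection in collections:
        has_sample = any(local_name(child.tag) == "Sample" for child in collection)
        if not has_sample:
            return "collection without Sample"
    return "ok"
```

A result of "collection without Sample" would match the situation described above: the optional collection is present but its required child is not.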

sureshhewabi commented 5 years ago

If you see any IOException in AssayFileScanner.findAndMapSummary, please check the command I highlighted in the attached image. It is most likely related to a peak file mismatch.

Screen Shot 2019-04-11 at 09 07 18

You can check the peaks parameter in that command:

-peaks, /nfs/pride/prod/archive/1-20190403-PXD013343-RESUB-0001/submitted/qExPlus02_00186_OP9 total ex.mgf##/nfs/pride/prod/archive/1-20190403-PXD013343-RESUB-0001/submitted/qExPlus02_00187_OP9 total ex rosi.mgf,

In this case, I see there are spaces in the filenames of the mgf files.

Maybe our pipeline or the PX submission tool should either reject or escape spaces in filenames. I will add this to the backlog. If you are going to correct it, the catch is that you need to correct not only the peak file names but also their references in the mzIdentML file and in submission.px.
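As a rough illustration of the check described above, the sketch below splits a peaks value on "##" (the separator seen in the logged command) and lists the file names that contain whitespace. The function name `peak_files_with_spaces` is hypothetical, and treating "##" as the only separator is an assumption based on this one log line.

```python
import posixpath


def peak_files_with_spaces(peaks_value):
    """Return base names from a '##'-separated peaks value that contain whitespace."""
    flagged = []
    for raw_path in peaks_value.split("##"):
        # The logged value ends with a trailing comma, so strip it along with padding.
        path = raw_path.strip().rstrip(",")
        if not path:
            continue
        name = posixpath.basename(path)
        if any(character.isspace() for character in name):
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    base = "/nfs/pride/prod/archive/1-20190403-PXD013343-RESUB-0001/submitted/"
    value = base + "qExPlus02_00186_OP9 total ex.mgf##" + base + "qExPlus02_00187_OP9 total ex rosi.mgf,"
    for name in peak_files_with_spaces(value):
        print(name)
```

Every name it reports would have to be renamed consistently in three places: on disk, in the mzIdentML references, and in submission.px.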

sureshhewabi commented 5 years ago

You need to look into the above command and check what's wrong with the peak files. Running grep "<SpectraData " qsdb-3_comprehensive-blib_5ppm.mzid shows 7 peak files in the mzIdentML file:

Screen Shot 2019-04-11 at 09 34 39

But if you check the px summary file, it refers to only 2 peak files:
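The same comparison can be sketched in code. The example below mirrors the grep by pulling the `location` attribute out of each `<SpectraData>` element with a regular expression, then reports referenced peak files that are absent from a list of submitted file names. The function names are hypothetical, and the list of submitted names is assumed to be gathered separately (for example by reading submission.px by hand), since its format is not shown in this thread.

```python
import re

# Matches the location attribute of a <SpectraData ...> start tag.
SPECTRA_DATA_LOCATION = re.compile(r'<SpectraData\b[^>]*?\blocation\s*=\s*"([^"]*)"')


def referenced_peak_files(mzid_text):
    """Return the base names of all SpectraData locations, in document order."""
    names = []
    for location in SPECTRA_DATA_LOCATION.findall(mzid_text):
        # Locations may be URIs or Windows paths, so split on both separators.
        names.append(re.split(r"[\\/]", location)[-1])
    return names


def missing_peak_files(mzid_text, submitted_names):
    """Return referenced peak files that are not among the submitted names."""
    submitted = set(submitted_names)
    return [name for name in referenced_peak_files(mzid_text) if name not in submitted]
```

With 7 referenced files and 2 submitted ones, `missing_peak_files` would be expected to return the 5 names that have no matching submitted file, provided the names match exactly (URL-encoded characters in a location are not decoded here).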

Screen Shot 2019-04-11 at 09 06 27

sureshhewabi commented 5 years ago

@germa can we close this?

sureshhewabi commented 5 years ago

@germa I hope this is solved by now, so I am closing the issue.