Nesvilab / MSFragger

Ultrafast, comprehensive peptide identification for mass spectrometry–based proteomics
https://msfragger.nesvilab.org

Error in the use of DIA-Umpire for non-specific search and library free DIA analysis of HLA peptidomes #268

Closed shahbazymoh closed 1 year ago

shahbazymoh commented 1 year ago

Dear MSFragger team,

I am trying to run DIA-Umpire for library-free DIA data processing in HLA peptidomics. I started with a small dataset (three biological replicates of a DIA dataset containing immunopeptidomes). The DIA-Umpire processing seemed smooth and FragPipe completed that step, but in the DB search step MSFragger first showed the message "Not enough data to perform mass calibration. Using the uncalibrated data." during MASS CALIBRATION AND PARAMETER OPTIMIZATION, and after a few minutes it stopped with the error "Process 'MSFragger' finished, exit code: 1 - Process returned non-zero exit code, stopping ~~~ Cancelling 46 remaining tasks". I changed the MS2 error (in the MSFragger tab) from 20 to 200 ppm, but the same error appeared again. Please see the attached log file.

Note 1: The data were acquired on an Orbitrap Fusion.
Note 2: I used the "DIA_DIA-Umpire_SpecLib_Quant" workflow and slightly modified it based on a regular non-specific MSFragger DB search setting that I have used and tested on DDA datasets in immunopeptidomics.

Do I need to change anything in the settings? Any guidance would be appreciated.

Many thanks in advance, Moh

log_2023-07-01_12-20-20.txt

fcyu commented 1 year ago

Hi Moh,

MSFragger showed this message first "Not enough data to perform mass calibration. Using the uncalibrated data." in the MASS CALIBRATION AND PARAMETER OPTIMIZATION step

It is not an error. The mass calibration failed because there were not many IDs from your sample. You probably need to troubleshoot your sample preparation.

after few minutes it was stopped with this error "Process 'MSFragger' finished, exit code: 1 - Process returned non-zero exit code, stopping ~~~ Cancelling 46 remaining tasks".

There was not enough memory: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space. You can solve this problem by increasing MSFragger's split database setting from 1 to, say, 8.
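To tell the two messages above apart when reading a long log, something like the following could help. It is only a minimal sketch: the function name `diagnose_log` and the wording of the returned labels are made up for illustration, and the matched strings are simply the ones quoted in this thread.

```python
def diagnose_log(log_text):
    """Return short labels for the two messages discussed in this thread.

    Hypothetical helper: it only does substring matching on the log text.
    """
    findings = []
    # Informational only: MSFragger continues with uncalibrated data.
    if "Not enough data to perform mass calibration" in log_text:
        findings.append("calibration skipped (not an error)")
    # The actual cause of exit code 1 in this case: the Java heap ran out.
    if "java.lang.OutOfMemoryError" in log_text:
        findings.append("out of memory")
    return findings


if __name__ == "__main__":
    sample = (
        "Not enough data to perform mass calibration. Using the uncalibrated data.\n"
        "java.util.concurrent.ExecutionException: "
        "java.lang.OutOfMemoryError: Java heap space\n"
    )
    print(diagnose_log(sample))
```

If "out of memory" is reported, the options are the ones discussed in this thread: a larger split database value or a machine with more memory.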

Best,

Fengchao

shahbazymoh commented 1 year ago

Hi Fengchao, @fcyu

Thanks for the rapid reply.

1) The dataset is fine. I have tested it many times with a library-based search (3000-4000 HLA peptides were identified when I used a pre-defined spectral library).
2) I have already tried the split database, but it showed the following error once I click Run (I am trying to use only DIA data for a fully library-free, DDA-free search):

[Screenshot: Capture]

What do you think?

Best, Moh

fcyu commented 1 year ago

Hi Moh,

I have tried to use the split database already, but it showed the following error (I am trying to use only DIA data for a fully-library/DDA-free search) once I push Run:

Oh, sorry, I forgot about this limitation. Could you load the _Q1.mzML, _Q2.mzML, and _Q3.mzML files instead of the original mzML files, and set their data type to DDA? With this workaround you won't be able to perform quantification, but you will get a library.tsv file after FragPipe finishes. Then remove all _Qx.mzML files, load the original mzML files, set them to the DIA data type, disable all tools except DIA-NN quantification, and run FragPipe again.

Or, find a computer with more memory. I see that there is only 16 GB of free memory, which seems a little low for a non-specific search.
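For the two-pass workaround, the file selection can be sketched as below. This is only an illustration under assumptions: the helper names `split_inputs` and `manifest_lines` are hypothetical, and the tab-separated layout (path, experiment, bioreplicate, data type) is assumed here for a FragPipe manifest, so please check it against a manifest saved from the FragPipe GUI before relying on it.

```python
import re
from pathlib import PurePath

# DIA-Umpire pseudo-spectra files end in _Q1.mzML, _Q2.mzML, or _Q3.mzML.
PSEUDO_RE = re.compile(r"_Q[123]\.mzML$")


def split_inputs(paths):
    """Separate pseudo-spectra files from the original mzML files."""
    pseudo = sorted(p for p in paths if PSEUDO_RE.search(PurePath(p).name))
    original = sorted(
        p for p in paths if p.endswith(".mzML") and p not in pseudo
    )
    return pseudo, original


def manifest_lines(paths, data_type):
    """Build one manifest row per file.

    Assumed layout: path, experiment, bioreplicate, data type (tab-separated),
    with experiment and bioreplicate left empty.
    """
    return [f"{p}\t\t\t{data_type}" for p in paths]


if __name__ == "__main__":
    files = [
        "data/rep1.mzML",
        "data/rep1_Q1.mzML",
        "data/rep1_Q2.mzML",
        "data/rep1_Q3.mzML",
    ]
    pseudo, original = split_inputs(files)
    # Pass 1: pseudo-spectra as DDA, to build library.tsv.
    print("\n".join(manifest_lines(pseudo, "DDA")))
    # Pass 2: original files as DIA, for DIA-NN quantification only.
    print("\n".join(manifest_lines(original, "DIA")))
```

The first list corresponds to the library-building run and the second to the quantification run described above.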

Best,

Fengchao

shahbazymoh commented 1 year ago

Hi Fengchao,

Thanks for the useful guidance and help.

I will try this and update you. Another question: since I am searching an Orbitrap dataset, do I need to change the advanced settings in the DIA-Umpire tab?

Kind regards, Moh

[Screenshot: Capture2]

fcyu commented 1 year ago

Hi Moh,

I don't think so.

In case I didn't explain it clearly: with the _Qx.mzML files as input, you do not need to run DIA-Umpire again.

Best,

Fengchao

shahbazymoh commented 1 year ago

Hi Fengchao,

No, I understood that and have not re-run DIA-Umpire. What I meant is that, since I got the message "Not enough data to perform mass calibration. Using the uncalibrated data.", it seems DIA-Umpire is not generating pseudo-spectra of sufficient quality to be used as _Qx.mzML input for MSFragger. So slightly modifying the DIA-Umpire settings might resolve the issue. Am I right? Do you have any experience using DIA-Umpire for fully library-free DIA data processing in non-specific HLA peptidomics?

Cheers, Moh

fcyu commented 1 year ago

Hi Moh,

I am not sure that modifying the DIA-Umpire parameters would help, because the default ones have been well tested on Orbitrap data.

Best,

Fengchao

shahbazymoh commented 1 year ago

Hi Fengchao,

Many thanks, @fcyu

Aha, ok. I will also try another dataset (maybe Zeno SWATH from a 7600) to see what happens.

All the best, Moh