bigbio / quantms

Quantitative mass spectrometry workflow. Currently supports proteomics experiments with complex experimental designs for DDA-LFQ, DDA-Isobaric and DIA-LFQ quantification.
https://quantms.org
MIT License

Plasma data LFQ failing #332

Open ypriverol opened 11 months ago

ypriverol commented 11 months ago

Description of the bug

ERROR ~ Error executing process > 'NFCORE_QUANTMS:QUANTMS:LFQ:PROTEOMICSLFQ (PXD002854-serum.sdrf_openms_design)'

Caused by:
  Process `NFCORE_QUANTMS:QUANTMS:LFQ:PROTEOMICSLFQ (PXD002854-serum.sdrf_openms_design)` terminated with an error exit status (11)

Command executed:

  ProteomicsLFQ \
      -threads 12 \
      -in 20140903_QEp1_LC7_PhGe_SA_4_48_Top6_Top20_2.mzML 20140903_QEp1_LC7_PhGe_SA_4_48_Top6_Top20_3.mzML 20141110_QEp1_LC7_PG_4_53_M1_1.mzML 20141110_QEp1_LC7_PG_4_53_M1_2.mzML 20141110_QEp1_LC7_PhGe_4_53_M1_3.mzML 20141110_QEp1_LC7_PhGe_4_53_M2_1.mzML 20141110_QEp1_LC7_PhGe_4_53_M2_2.mzML 20141110_QEp1_LC7_PhGe_4_53_M2_3.mzML 20141110_QEp1_LC7_PhGe_4_53_M3_1.mzML 20141110_QEp1_LC7_PhGe_4_53_M3_2.mzML 20141110_QEp1_LC7_PhGe_4_53_M3_3.mzML 20141110_QEp1_LC7_PhGe_4_53_M4_1.mzML 20141110_QEp1_LC7_PhGe_4_53_M4_2.mzML 20141110_QEp1_LC7_PhGe_4_53_M4_3.mzML 20141110_QEp1_LC7_PhGe_4_53_M5_1.mzML 20141110_QEp1_LC7_PhGe_4_53_M5_2.mzML 20141110_QEp1_LC7_PhGe_4_53_M5_3.mzML 20141110_QEp1_LC7_PhGe_4_53_W1_1.mzML 20141110_QEp1_LC7_PhGe_4_53_W1_2.mzML 20141110_QEp1_LC7_PhGe_4_53_W1_3.mzML 20141110_QEp1_LC7_PhGe_4_53_W2_1.mzML 20141110_QEp1_LC7_PhGe_4_53_W2_2.mzML 20141110_QEp1_LC7_PhGe_4_53_W2_3.mzML 20141110_QEp1_LC7_PhGe_4_53_W3_1.mzML 20141110_QEp1_LC7_PhGe_4_53_W3_2.mzML 20141110_QEp1_LC7_PhGe_4_53_W3_3.mzML 20141110_QEp1_LC7_PhGe_4_53_W4_1.mzML 20141110_QEp1_LC7_PhGe_4_53_W4_2.mzML 20141110_QEp1_LC7_PhGe_4_53_W4_3.mzML 20141110_QEp1_LC7_PhGe_4_53_W5_1.mzML 20141110_QEp1_LC7_PhGe_4_53_W5_2.mzML 20141110_QEp1_LC7_PhGe_4_53_W5_3.mzML 20150202_QEp2_LC11_PhGe_SA_4_57_B1.mzML 20150202_QEp2_LC11_PhGe_SA_4_57_B10.mzML 20150202_QEp2_LC11_PhGe_SA_4_57_B2.mzML 20150202_QEp2_LC11_PhGe_SA_4_57_B3.mzML 20150202_QEp2_LC11_PhGe_SA_4_57_B4.mzML 20150202_QEp2_LC11_PhGe_SA_4_57_B5.mzML 20150202_QEp2_LC11_PhGe_SA_4_57_B6.mzML 20150202_QEp2_LC11_PhGe_SA_4_57_B7.mzML 20150202_QEp2_LC11_PhGe_SA_4_57_B8.mzML 20150202_QEp2_LC11_PhGe_SA_4_57_B9.mzML 20150414_QEp1_LC7_GaPI_SA_Serum_DT_01_150417011711.mzML 20160112_QEp1_LC7_PhGe_SA_F13_1.mzML 20160112_QEp1_LC7_PhGe_SA_F13_2.mzML 20160112_QEp1_LC7_PhGe_SA_F13_3.mzML 20160112_QEp1_LC7_PhGe_SA_F14_1.mzML 20160112_QEp1_LC7_PhGe_SA_F14_2.mzML 20160112_QEp1_LC7_PhGe_SA_F14_3.mzML 20160112_QEp1_LC7_PhGe_SA_F1_1.mzML 20160112_QEp1_LC7_PhGe_SA_F1_2.mzML 
20160112_QEp1_LC7_PhGe_SA_F1_3.mzML 20160112_QEp1_LC7_PhGe_SA_F5_1.mzML 20160112_QEp1_LC7_PhGe_SA_F5_2.mzML 20160112_QEp1_LC7_PhGe_SA_F5_3.mzML 20160112_QEp1_LC7_PhGe_SA_F8_1.mzML 20160112_QEp1_LC7_PhGe_SA_F8_2.mzML 20160112_QEp1_LC7_PhGe_SA_F8_3.mzML 20160112_QEp1_LC7_PhGe_SA_M10_1.mzML 20160112_QEp1_LC7_PhGe_SA_M10_2.mzML 20160112_QEp1_LC7_PhGe_SA_M10_3.mzML 20160112_QEp1_LC7_PhGe_SA_M11_1.mzML 20160112_QEp1_LC7_PhGe_SA_M11_2.mzML 20160112_QEp1_LC7_PhGe_SA_M11_3.mzML 20160112_QEp1_LC7_PhGe_SA_M18_1.mzML 20160112_QEp1_LC7_PhGe_SA_M18_2.mzML 20160112_QEp1_LC7_PhGe_SA_M18_3.mzML 20160112_QEp1_LC7_PhGe_SA_M20_1.mzML 20160112_QEp1_LC7_PhGe_SA_M20_2.mzML 20160112_QEp1_LC7_PhGe_SA_M20_3.mzML 20160112_QEp1_LC7_PhGe_SA_M2_1.mzML 20160112_QEp1_LC7_PhGe_SA_M2_2.mzML 20160112_QEp1_LC7_PhGe_SA_M2_3.mzML \
      -ids 20140903_QEp1_LC7_PhGe_SA_4_48_Top6_Top20_2_consensus_fdr_filter.idXML 20140903_QEp1_LC7_PhGe_SA_4_48_Top6_Top20_3_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PG_4_53_M1_1_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PG_4_53_M1_2_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_M1_3_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_M2_1_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_M2_2_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_M2_3_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_M3_1_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_M3_2_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_M3_3_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_M4_1_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_M4_2_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_M4_3_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_M5_1_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_M5_2_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_M5_3_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_W1_1_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_W1_2_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_W1_3_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_W2_1_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_W2_2_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_W2_3_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_W3_1_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_W3_2_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_W3_3_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_W4_1_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_W4_2_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_W4_3_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_W5_1_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_W5_2_consensus_fdr_filter.idXML 20141110_QEp1_LC7_PhGe_4_53_W5_3_consensus_fdr_filter.idXML 
20150202_QEp2_LC11_PhGe_SA_4_57_B10_consensus_fdr_filter.idXML 20150202_QEp2_LC11_PhGe_SA_4_57_B1_consensus_fdr_filter.idXML 20150202_QEp2_LC11_PhGe_SA_4_57_B2_consensus_fdr_filter.idXML 20150202_QEp2_LC11_PhGe_SA_4_57_B3_consensus_fdr_filter.idXML 20150202_QEp2_LC11_PhGe_SA_4_57_B4_consensus_fdr_filter.idXML 20150202_QEp2_LC11_PhGe_SA_4_57_B5_consensus_fdr_filter.idXML 20150202_QEp2_LC11_PhGe_SA_4_57_B6_consensus_fdr_filter.idXML 20150202_QEp2_LC11_PhGe_SA_4_57_B7_consensus_fdr_filter.idXML 20150202_QEp2_LC11_PhGe_SA_4_57_B8_consensus_fdr_filter.idXML 20150202_QEp2_LC11_PhGe_SA_4_57_B9_consensus_fdr_filter.idXML 20150414_QEp1_LC7_GaPI_SA_Serum_DT_01_150417011711_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_F13_1_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_F13_2_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_F13_3_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_F14_1_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_F14_2_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_F14_3_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_F1_1_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_F1_2_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_F1_3_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_F5_1_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_F5_2_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_F5_3_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_F8_1_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_F8_2_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_F8_3_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_M10_1_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_M10_2_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_M10_3_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_M11_1_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_M11_2_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_M11_3_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_M18_1_consensus_fdr_filter.idXML 
20160112_QEp1_LC7_PhGe_SA_M18_2_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_M18_3_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_M20_1_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_M20_2_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_M20_3_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_M2_1_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_M2_2_consensus_fdr_filter.idXML 20160112_QEp1_LC7_PhGe_SA_M2_3_consensus_fdr_filter.idXML \
      -design PXD002854-serum.sdrf_openms_design.tsv \
      -fasta Homo-sapiens-uniprot-reviewed-entrap-contaminants-decoy-202310.fasta \
      -protein_inference aggregation \
      -quantification_method feature_intensity \
      -targeted_only false \
      -feature_with_id_min_score 0.10 \
      -feature_without_id_min_score 0.75 \
      -mass_recalibration false \
      -Seeding:intThreshold 1000 \
      -protein_quantification shared_peptides \
      -alignment_order star \
      -PeptideQuantification:quantify_decoys \
      -psmFDR 0.01 \
      -proteinFDR 0.01 \
      -picked_proteinFDR true \
      -out_cxml PXD002854-serum.sdrf_openms_design_openms.consensusXML \
      -out PXD002854-serum.sdrf_openms_design_openms.mzTab \
      -out_msstats PXD002854-serum.sdrf_openms_design_msstats_in.csv \
      -out_triqler PXD002854-serum.sdrf_openms_design_triqler_in.tsv \
      -debug 0 \
      2>&1 | tee proteomicslfq.log

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_QUANTMS:QUANTMS:LFQ:PROTEOMICSLFQ":
      ProteomicsLFQ: $(ProteomicsLFQ 2>&1 | grep -E '^Version(.*)' | sed 's/Version: //g' | cut -d ' ' -f 1)
  END_VERSIONS
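
One detail is visible in the command above: the `-in` list has `..._B1.mzML` before `..._B10.mzML`, while the `-ids` list has `..._B10_consensus_fdr_filter.idXML` before `..._B1_consensus_fdr_filter.idXML`. Nothing in this report says whether ProteomicsLFQ relies on the positional order of the two lists, so this may be harmless. The sketch below only makes the mismatch easy to spot. The helper names are hypothetical, and the `_consensus_fdr_filter` suffix is taken from the file names in the command.

```python
from pathlib import Path

# Suffix seen on the idXML names in the command above (assumed constant for this run).
ID_SUFFIX = "_consensus_fdr_filter"


def run_name(filename: str) -> str:
    """Return the run name shared by an mzML file and its idXML file."""
    base = Path(filename).stem  # drops the .mzML / .idXML extension
    return base[: -len(ID_SUFFIX)] if base.endswith(ID_SUFFIX) else base


def mismatched_positions(mzml_files, id_files):
    """List (index, mzML run, idXML run) wherever the two lists disagree by position."""
    return [
        (i, run_name(m), run_name(d))
        for i, (m, d) in enumerate(zip(mzml_files, id_files))
        if run_name(m) != run_name(d)
    ]


# Three consecutive entries copied from the -in and -ids arguments.
mzml = [
    "20150202_QEp2_LC11_PhGe_SA_4_57_B1.mzML",
    "20150202_QEp2_LC11_PhGe_SA_4_57_B10.mzML",
    "20150202_QEp2_LC11_PhGe_SA_4_57_B2.mzML",
]
ids = [
    "20150202_QEp2_LC11_PhGe_SA_4_57_B10_consensus_fdr_filter.idXML",
    "20150202_QEp2_LC11_PhGe_SA_4_57_B1_consensus_fdr_filter.idXML",
    "20150202_QEp2_LC11_PhGe_SA_4_57_B2_consensus_fdr_filter.idXML",
]

for index, mz_run, id_run in mismatched_positions(mzml, ids):
    print(f"position {index}: mzML={mz_run} idXML={id_run}")
```

Both lists are sorted as plain strings, which is why `B1.` sorts before `B10.` but `B10_` sorts before `B1_`.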

Command exit status:
  11

Command output:

    100.00 %               
  -- done [took 09:50 m (CPU), 57.43 s (Wall)] -- 
  Found 102158 feature candidates in total.
  31327 features left after filtering.
  Model fitting: 24124 successes, 7203 failures
  Imputing model failures with a linear model based on log(rawIntensities). Slope: 0.885107, Intercept: 1.19213

  Summary statistics (counting distinct peptides including PTMs):
  3154 peptides identified (3154 internal, 0 additional external)
  3126 peptides with features (3126 internal, 0 external)
  28 peptides without features (28 internal, 0 external)

  Training SVM on 2000 observations. Classes:
  - '0.0': 1000 observations
  - '1.0': 1000 observations
  Optimizing parameters.
  Running cross-validation to find optimal parameters...
  Best cross-validation performance: 0.9315 (ties: 1)
  Best SVM parameters: log2_C = 11, log2_gamma = 0, log2_p = -3.32193
  ... done.
  Number of support vectors in the final model: 374
  Predicting class probabilities:
  Removed quant. targets with id (features with id) because of low quantification score: 100 of 3777     ( 2.6476% )
  Removed quant. targets with id (features without id) because of low quantification score: 8429 of 23259    ( 36.2397% )
  Removed quant. decoys (offset features) because of low quantification score: 4049 of 4291  ( 94.3603% )
  Progress of 'loading mzML':
    Progress of 'loading spectra list':

      8.33 %               
      68.10 %               
    -- done [took 11.41 s (CPU), 1.68 s (Wall)] -- 
    Progress of 'loading chromatogram list':

    -- done [took 0.02 s (CPU), 0.00 s (Wall)] -- 

  -- done [took 11.50 s (CPU), 1.69 s (Wall) @ 64.45 MiB/s] -- 
  Progress of 'picking peaks':

  -- done [took 0.03 s (CPU), 0.04 s (Wall)] -- 
  #Spectra that needed to and could be picked by MS-level:
    MS-level 1: 0 / 3431
    MS-level 2: 0 / 11602
  Correction to the highest intensity peak failed 145 times because of missing peaks in the MS1. No changes were applied in these cases.
  Info: Corrected 11457 precursors.
  Precursor correction:
    median        = 3.618540811542832e-10 ppm  MAD = 0.467633599971845
    median (abs.) = 0.467633600333699 ppm  MAD = 0.467633599539463
  /opt/conda/conda-bld/openms-meta_1697809676979/work/src/topp/ProteomicsLFQ.cpp(736): Exactly one protein identification run must be annotated in 20150202_QEp2_LC11_PhGe_SA_4_57_B10_consensus_fdr_filter.idXML
  ProteomicsLFQ took 07:06 m (wall), 50:41 m (CPU), 6.32 s (system), 50:35 m (user); Peak Memory Usage: 3106 MB.

Work dir:
  /hps/nobackup/juan/pride/reanalysis/absolute-expression/platelet/PXD002854/work/7b/22ba5ee4f67533c21c9d709a143461

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

 -- Check '.nextflow.log' file for details
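
The failing check reports that `20150202_QEp2_LC11_PhGe_SA_4_57_B10_consensus_fdr_filter.idXML` does not contain exactly one protein identification run. The log does not say whether the file has zero runs or several. A minimal sketch for finding out is to count the `<IdentificationRun>` elements in each idXML file in the work directory. It assumes that each such element corresponds to one protein identification run as OpenMS loads it. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def count_identification_runs(path: str) -> int:
    """Count <IdentificationRun> elements in an idXML file."""
    count = 0
    # iterparse streams the file; clearing elements keeps memory use low.
    for _, elem in ET.iterparse(path, events=("end",)):
        tag = elem.tag.rsplit("}", 1)[-1]  # tolerate a namespace prefix, if any
        if tag == "IdentificationRun":
            count += 1
        elem.clear()
    return count


def files_without_single_run(paths):
    """Map each path whose run count is not exactly one to that count."""
    counts = {p: count_identification_runs(p) for p in paths}
    return {p: n for p, n in counts.items() if n != 1}
```

Calling `files_without_single_run(glob.glob("*_consensus_fdr_filter.idXML"))` from inside the work directory would list every input that would trip the same check, not only the first one ProteomicsLFQ reports.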

jpfeuffer commented 11 months ago

A Xmas bug? 😱 Merry Christmas @ypriverol 🎄