Nesvilab / FragPipe

A cross-platform proteomics data analysis suite
http://fragpipe.nesvilab.org

DIA_SpecLib_Quant modifications #1222

Closed grandrea closed 1 year ago

grandrea commented 1 year ago

Hello,

I am trying to run the DDA/DIA workflow "DIA_SpecLib_Quant" on FragPipe 20.0. I am searching both DDA and DIA files, passing the spectral library built from the DDA runs on to DIA-NN. I define some custom modifications present in my sample on the DDA side. The DDA search completes, but then DIA-NN crashes with "unknown modification". I see in the log that DIA-NN gets launched with "--strip-unknown-mods"... is this the reason? Any idea how to do DDA, library generation and DIA with custom mods? I do not need to replace library spectra with predicted ones.

Many thanks in advance!

Info: Library successfully generated.
Done generating spectral library
2023-08-21 18:13:56,001:INFO:took 0:05:26.722944
Process 'SpecLibGen' finished, exit code: 0
DIA-NN [Work dir: ZZZ]
C:\Soft\FragPipe-jre-20.0\fragpipe\tools\diann\1.8.2_beta_8\win\DiaNN.exe --lib library.tsv --threads 6 --verbose 1 --out diann-output\diann-output.tsv --qvalue 0.01 --matrix-qvalue 0.01 --matrices --no-prot-inf --smart-profiling --no-quant-files --peak-center --no-ifs-removal --predictor --dl-no-rt --dl-no-im --strip-unknown-mods --cfg ZZZfilelist_diann.txt
DIA-NN 1.8.2 beta 8 (Data-Independent Acquisition by Neural Networks)
Compiled on Sep 15 2022 18:28:57
Current date and time: Mon Aug 21 18:13:56 2023
CPU: AuthenticAMD AMD EPYC 7763 64-Core Processor
SIMD instructions: AVX AVX2 FMA SSE4.1 SSE4.2 SSE4a
Logical CPU cores: 8
Thread number set to 6
Output will be filtered at 0.01 FDR
Precursor/protein x sample matrices will be filtered at 0.01 precursor & protein-level FDR
Precursor/protein x samples expression level matrices will be saved along with the main report
Protein inference will not be performed
When generating a spectral library, in silico predicted spectra will be retained if deemed more reliable than experimental ones
.quant files will not be saved to the disk
Fixed-width center of each elution peak will be used for quantification
Interference removal from fragment elution curves disabled
Deep learning will be used to generate a new in silico spectral library from peptides provided
RTs will not be predicted using deep learning
IMs will not be predicted using deep learning
DIA-NN will use deep learning to predict spectra/RTs/IMs even for peptides carrying modifications which are not recognised by the deep learning predictor. In this scenario, if also generating a spectral library from the DIA data or using the MBR mode, it might or might not be better (depends on the data) to also use the --out-measured-rt option - it's recommended to test it with and without this option
DIA-NN will optimise the mass accuracy automatically using the first run in the experiment. This is useful primarily for quick initial analyses, when it is not yet known which mass accuracy setting works best for a particular acquisition scheme.

6 files will be processed
[0:00] Loading spectral library library.tsv
[0:00] Finding proteotypic peptides (assuming that the list of UniProt ids provided for each peptide is complete)
[0:00] Spectral library loaded: 25 protein isoforms, 25 protein groups and 7235 precursors in 5460 elution groups.
[0:00] Encoding peptides for spectra and RTs prediction
[0:00] Predicting spectra and IMs
[0:01] Decoding predicted spectra and IMs
ERROR: C:\diann\src\diann.cpp: 2774: unknown modification: 183.08948
Process 'DIA-NN' finished, exit code: -1
Process returned non-zero exit code, stopping
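
For logs this long, it can help to pull out just the masses that DIA-NN rejected, so it is clear which modifications still need to be declared. Below is a minimal Python sketch, assuming the error lines always have the form "unknown modification: <mass>" as in the log above. The helper name `unknown_mod_masses` is hypothetical and not part of DIA-NN or FragPipe.

```python
import re

# Matches the error text seen in the log above, e.g.
# "unknown modification: 183.08948" (assumed to be the only format used).
_UNKNOWN_MOD = re.compile(r"unknown modification:\s*(-?\d+(?:\.\d+)?)")


def unknown_mod_masses(log_text):
    """Return the distinct masses reported as unknown, in order of appearance.

    Masses are kept as strings so the digits match the log exactly.
    """
    seen = []
    for match in _UNKNOWN_MOD.finditer(log_text):
        mass = match.group(1)
        if mass not in seen:
            seen.append(mass)
    return seen
```

Calling `unknown_mod_masses(open("log.txt").read())` on a saved log (the file name is only an example) would list each offending mass once.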

fcyu commented 1 year ago

When there are modifications that DIA-NN can't recognize, you need to specify them using the --var-mod flag: https://github.com/vdemichev/DiaNN#command-line-reference. That said, I am not sure why you still got the error with --strip-unknown-mods. This flag is supposed to let DIA-NN ignore any "unknown modifications".

Best,

Fengchao

grandrea commented 1 year ago

Thanks for the help!

Just to confirm: I ran DIA-NN with the flags

--var-mod custom1,82.0418,KSTY --var-mod custom2:100.0524,KSTY --var-mod custom3,295.0362,KSTY --var-mod custom4,202.107934,KSTY  --var-mods 3  

and there is a mismatch between mod masses in the log

Modification custom1 with mass delta 82.0418 at KSTY will be considered as variable
WARNING: no amino acids to be modified, modification ignored
Modification custom2 with mass delta 295.036 at KSTY will be considered as variable
Modification custom3 with mass delta 202.108 at KSTY will be considered as variable
Maximum number of variable modifications set to 3
DIA-NN will optimise the mass accuracy automatically using the first run in the experiment. This is useful primarily for quick initial analyses, when it is not yet known which mass accuracy setting works best for a particular acquisition scheme.
Cannot find a UniMod modification match for custom1: 2.07547 minimal mass discrepancy; using the original modificaiton name
Cannot find a UniMod modification match for custom2: 46.0068 minimal mass discrepancy; using the original modificaiton name
Cannot find a UniMod modification match for custom3: 138.935 minimal mass discrepancy; using the original modificaiton name

This seems fixed when I add

--original-mods

as a flag. I suppose that's ok?
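
For anyone hitting the same thing: the --var-mod values quoted above follow a name,mass,sites pattern, but the second one has a colon after the name instead of a comma. That may be what triggers the "no amino acids to be modified" warning, though this is a guess from the log and not confirmed. A minimal Python sketch, assuming that three-field comma-separated pattern, can catch such slips before launching DIA-NN. The helper name `check_var_mods` is hypothetical and not part of DIA-NN or FragPipe.

```python
def check_var_mods(command_line):
    """Return (value, reason) pairs for --var-mod values that look malformed.

    Assumes each value should be name,mass,sites (comma-separated), as in
    the flags quoted in this thread. Extra trailing fields are tolerated.
    """
    problems = []
    tokens = command_line.split()  # plain whitespace split; no quoting support
    for flag, value in zip(tokens, tokens[1:]):
        if flag != "--var-mod":
            continue
        fields = value.split(",")
        if len(fields) < 3:
            problems.append((value, "expected name,mass,sites"))
            continue
        mass, sites = fields[1], fields[2]
        try:
            float(mass)
        except ValueError:
            problems.append((value, "mass is not a number"))
            continue
        if not sites:
            problems.append((value, "no sites given"))
    return problems


flags = (
    "--var-mod custom1,82.0418,KSTY --var-mod custom2:100.0524,KSTY "
    "--var-mod custom3,295.0362,KSTY --var-mod custom4,202.107934,KSTY --var-mods 3"
)
print(check_var_mods(flags))
# [('custom2:100.0524,KSTY', 'expected name,mass,sites')]
```

The check is deliberately loose: it only looks at the field count and whether the mass parses as a number, so it will not validate site letters or the modification name.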

fcyu commented 1 year ago

Yes, --original-mods will also solve your problem in another way: --original-mods disables the automatic conversion of known modifications to the UniMod format names.

There is also a --mod flag which seems related: --mod [name],[mass],[optional: 'label'] declares a modification name. Examples: "--mod UniMod:5,43.005814", "--mod SILAC-Lys8,8.014199,label".

To be honest, I can never tell those three flags apart...

Best,

Fengchao