Nesvilab / FragPipe

A cross-platform Graphical User Interface (GUI) for running an MSFragger- and Philosopher-powered pipeline for comprehensive analysis of shotgun proteomics data
http://fragpipe.nesvilab.org

Which UniMod PTMs can FragPipe's DIA-NN handle? #1437

Closed dtabb73 closed 4 months ago

dtabb73 commented 6 months ago

Hello, I was really glad to see the DIA features incorporated in FragPipe.

Our laboratory uses MMTS rather than iodoacetamide to alkylate Cys side chains. (https://www.unimod.org/modifications_view.php?editid1=39) I was able to build the library in FragPipe just fine, but it appears from the log that its DIA-NN cannot handle this mass shift:

55 files will be processed
[0:00] Loading spectral library D:\lysoIP\20240201-ABCDGH-SpecLib\library.tsv
[0:04] Finding proteotypic peptides (assuming that the list of UniProt ids provided for each peptide is complete)
[0:04] Spectral library loaded: 5218 protein isoforms, 5218 protein groups and 80406 precursors in 66286 elution groups.
[0:04] Encoding peptides for spectra and RTs prediction
[0:04] Predicting spectra and IMs
[1:15] Decoding predicted spectra and IMs
ERROR: C:\diann\src\diann.cpp: 2774: unknown modification: UniMod:39
Process 'DIA-NN' finished, exit code: -1
Process returned non-zero exit code, stopping

Is there a way for DIA-NN to handle this static mass change correctly? Is there a hard-coded list of UniMod PTMs that it can handle?

Thanks,

Dave

log_2024-02-06_15-27-23.txt
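Before rerunning, it can help to know which modification IDs the library actually contains, since any ID that DIA-NN does not recognise triggers the "unknown modification" error shown in the log above. The following is a minimal sketch, not part of FragPipe or DIA-NN: it assumes the library is a text file in which modifications appear as `UniMod:<number>` tokens (as the error message suggests), and the function name `count_unimod_ids` is hypothetical.

```python
import re
from collections import Counter

# Matches tokens such as "UniMod:39" and captures the numeric accession.
UNIMOD_TOKEN = re.compile(r"UniMod:(\d+)")


def count_unimod_ids(lines):
    """Count how often each UniMod accession occurs in an iterable of text lines."""
    counts = Counter()
    for line in lines:
        counts.update(int(accession) for accession in UNIMOD_TOKEN.findall(line))
    return counts


# Example with two made-up library rows (not real data):
example_rows = [
    "AC(UniMod:39)DEK\t2",
    "M(UniMod:35)LC(UniMod:39)K\t3",
]
for accession, n in sorted(count_unimod_ids(example_rows).items()):
    print(f"UniMod:{accession}\t{n}")
```

To scan a real library, pass an open file handle instead of the example list, for instance `count_unimod_ids(open("library.tsv", encoding="utf-8"))`.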

fcyu commented 6 months ago

Hi Dave,

It works the same way as in standalone DIA-NN: you can use the --mod flag to tell DIA-NN about the modification.

Best,

Fengchao

dtabb73 commented 6 months ago

(Attached image: 20240207-FragPipe-DIA-NN-PTM)

Hi, Fengchao. At first I was unsure how to pass that flag via the FragPipe interface. I used the DIA-NN documentation to work out how to format a "--mod" flag (https://github.com/vdemichev/DiaNN?tab=readme-ov-file#ptms). Then, for some reason, FragPipe didn't like my version of the Thermo RAW reader, so I fed it mzMLs instead. Now the DIA-NN run seems to be working smoothly!

Thank you for the timely help!
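For readers who hit the same error, here is a minimal sketch of how such a flag could be assembled. It assumes the `--mod [name],[mass]` syntax described in the DIA-NN README linked above (please verify against the version you run), and it assumes UniMod:39 is the methylthio group with composition H2 C S; the exact flag Dave used is not shown in this thread. The helper names `delta_mass` and `diann_mod_flag` are hypothetical.

```python
# Monoisotopic masses (Da) of the elements needed for the methylthio group.
MONOISOTOPIC = {"H": 1.00782503, "C": 12.0, "S": 31.97207117}


def delta_mass(composition):
    """Sum the monoisotopic masses for an element -> count mapping."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


def diann_mod_flag(name, mass):
    """Format a modification declaration as '--mod <name>,<mass>' (assumed syntax)."""
    return f"--mod {name},{mass:.6f}"


# MMTS adds a methylthio group (H2 C S) to Cys; assumed to correspond to UniMod:39.
methylthio = delta_mass({"H": 2, "C": 1, "S": 1})
flag = diann_mod_flag("UniMod:39", methylthio)
print(flag)  # --mod UniMod:39,45.987721
```

The resulting string would then go wherever FragPipe lets you supply extra DIA-NN command-line options.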

fcyu commented 6 months ago

Hi Dave,

Thanks for the feedback. I also noticed that the DIA-NN version in FragPipe couldn't read raw files. I am not sure of the exact reason; I will need to investigate.

Best,

Fengchao

fcyu commented 4 months ago

I tested with one raw file, and it actually works:

DIA-NN [Work dir: G:\ttt]
G:\Dropbox\code\FragPipe\MSFragger-GUI\tools\diann\1.8.2_beta_8\win\DiaNN.exe --lib G:\dev\test_msstatsptm_dia\library.tsv --threads 11 --verbose 1 --out diann-output\report.tsv --qvalue 0.01 --matrices --no-prot-inf --smart-profiling --no-quant-files --peak-center --no-ifs-removal --report-lib-info --cfg G:\ttt\filelist_diann.txt
DIA-NN 1.8.2 beta 8 (Data-Independent Acquisition by Neural Networks)
Compiled on Sep 15 2022 18:28:57
Current date and time: Sun Mar 31 09:43:25 2024
CPU: GenuineIntel Intel(R) Xeon(R) W-2235 CPU @ 3.80GHz
SIMD instructions: AVX AVX2 AVX512CD AVX512F FMA SSE4.1 SSE4.2 
Logical CPU cores: 12
Thread number set to 11
Output will be filtered at 0.01 FDR
Precursor/protein x samples expression level matrices will be saved along with the main report
Protein inference will not be performed
When generating a spectral library, in silico predicted spectra will be retained if deemed more reliable than experimental ones
.quant files will not be saved to the disk
Fixed-width center of each elution peak will be used for quantification
Interference removal from fragment elution curves disabled
DIA-NN will optimise the mass accuracy automatically using the first run in the experiment. This is useful primarily for quick initial analyses, when it is not yet known which mass accuracy setting works best for a particular acquisition scheme.

1 files will be processed
[0:00] Loading spectral library G:\dev\test_msstatsptm_dia\library.tsv
[0:01] Finding proteotypic peptides (assuming that the list of UniProt ids provided for each peptide is complete)
[0:01] Spectral library loaded: 5757 protein isoforms, 5757 protein groups and 35098 precursors in 33768 elution groups.
[0:01] Initialising library
[0:01] Saving the library to G:\dev\test_msstatsptm_dia\library.tsv.speclib

[0:01] File #1/1
[0:01] Loading run F:\data\ccrcc_dia_discovery_proteome_20\raw\CPTAC_CCRCC_W_JHU_20190112_LUMOS_C3L-00010_NAT.raw
[0:27] 35098 library precursors are potentially detectable
[0:27] Processing...
[1:10] RT window set to 8.19863
[1:10] Peak width: 4.324
[1:10] Scan window radius set to 9
[1:10] Recommended MS1 mass accuracy setting: 9.36573 ppm
[1:56] Optimised mass accuracy: 6.97601 ppm
[2:08] Removing low confidence identifications
[2:08] Removing interfering precursors
[2:08] Training neural networks: 27306 targets, 30498 decoys
[2:12] Number of IDs at 0.01 FDR: 391
[2:12] Calculating protein q-values
[2:12] Number of protein isoforms identified at 1% FDR: 208 (precursor-level), 207 (protein-level) (inference performed using proteotypic peptides only)
[2:12] Quantification

[2:12] Cross-run analysis
[2:12] Reading quantification information: 1 files
[2:12] Quantifying peptides
[2:12] Quantifying proteins
[2:12] Calculating q-values for protein and gene groups
[2:12] Calculating global q-values for protein and gene groups
[2:12] Writing report
[2:12] Report saved to diann-output\report.tsv.
[2:12] Saving precursor levels matrix
[2:12] Precursor levels matrix (1% precursor and protein group FDR) saved to diann-output\report.pr_matrix.tsv.
[2:12] Saving protein group levels matrix
[2:12] Protein group levels matrix (1% precursor FDR and protein group FDR) saved to diann-output\report.pg_matrix.tsv.
[2:12] Saving gene group levels matrix
[2:12] Gene groups levels matrix (1% precursor FDR and protein group FDR) saved to diann-output\report.gg_matrix.tsv.
[2:12] Saving unique genes levels matrix
[2:12] Unique genes levels matrix (1% precursor FDR and protein group FDR) saved to diann-output\report.unique_genes_matrix.tsv.
[2:12] Stats report saved to diann-output\report.stats.tsv
[2:12] Log saved to diann-output\report.log.txt
Finished

Process 'DIA-NN' finished, exit code: 0

Have you installed the Thermo MS File Reader?

Thanks,

Fengchao