vdemichev / DiaNN

DIA-NN - a universal automated software suite for DIA proteomics data analysis.

Proteins.Identified == 0 in stats.tsv and pdf (1.9.2) #1231

Closed Kokoushin closed 3 weeks ago

Kokoushin commented 3 weeks ago

Hi Vadim,

Proteins.Identified is 0 in stats.tsv and in the PDF report, but pg.matrix looks normal. Log attached: diann.log.txt

And I’m very surprised analysis speed is much faster than before.

Best, Ko

vdemichev commented 3 weeks ago

Hi Ko,

This happens because of the following warning in your log:

WARNING: protein inference is enabled but no FASTA provided - is this intended?

So the solution is to provide a FASTA database.
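A quick way to catch this kind of problem early is to scan the log for warning lines before trusting the summary statistics. The following is a minimal sketch, assuming the saved log has one message per line and that warnings begin with `WARNING:` as in the line quoted above. The function names are hypothetical and not part of DIA-NN. As far as the DIA-NN documentation describes, the FASTA itself is supplied with the `--fasta` command-line option or the FASTA field in the GUI.

```python
from typing import List

# Prefix used by the warning lines quoted in this thread.
WARNING_PREFIX = "WARNING:"


def find_warnings(log_text: str) -> List[str]:
    """Return every log line that starts with 'WARNING:' (hypothetical helper)."""
    warnings = []
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(WARNING_PREFIX):
            warnings.append(stripped)
    return warnings


def missing_fasta_warned(log_text: str) -> bool:
    """True if the log contains the 'no FASTA provided' warning seen in this issue."""
    return any("no FASTA provided" in w for w in find_warnings(log_text))
```

Running `find_warnings(open("diann.log.txt").read())` on a saved log would list the warnings to review; the file name here is just the attachment name from the first post.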

> And I’m very surprised analysis speed is much faster than before.

Lots of work towards this :)

Best, Vadim

Kokoushin commented 3 weeks ago

Thank you for the quick response.

I have a new issue when I analyze Thermo files: DIA-NN stops without any message.

Skyline not found
MSFileReader found: MSFileReader Core 31

diann.exe --f "Z:\AD_MS-DATA\Ko\6_DIA_bench\QE_HF-X_PXD028735\LFQ_Orbitrap_AIF_Condition_A_Sample_Alpha_01.raw " --f "Z:\AD_MS-DATA\Ko\6_DIA_bench\QE_HF-X_PXD028735\LFQ_Orbitrap_AIF_Condition_A_Sample_Alpha_02.raw " --f "Z:\AD_MS-DATA\Ko\6_DIA_bench\QE_HF-X_PXD028735\LFQ_Orbitrap_AIF_Condition_A_Sample_Alpha_03.raw " --f "Z:\AD_MS-DATA\Ko\6_DIA_bench\QE_HF-X_PXD028735\LFQ_Orbitrap_AIF_Condition_B_Sample_Alpha_01.raw " --f "Z:\AD_MS-DATA\Ko\6_DIA_bench\QE_HF-X_PXD028735\LFQ_Orbitrap_AIF_Condition_B_Sample_Alpha_02.raw " --f "Z:\AD_MS-DATA\Ko\6_DIA_bench\QE_HF-X_PXD028735\LFQ_Orbitrap_AIF_Condition_B_Sample_Alpha_03.raw " --lib "Z:\AD_MS-DATA\Ko\MS_FASTA\241025_human_ecoli_yeast_diann.speclib" --threads 16 --verbose 1 --out "Z:\AD_MS-DATA\Ko\6_DIA_bench\QE_HF-X_PXD028735\DIA-NN\diann.tsv" --qvalue 0.01 --matrices --out-lib "Z:\AD_MS-DATA\Ko\6_DIA_bench\QE_HF-X_PXD028735\DIA-NN\diann_lib.parquet" --gen-spec-lib --unimod4 --reanalyse --relaxed-prot-inf --rt-profiling
DIA-NN 1.9.2 (Data-Independent Acquisition by Neural Networks)
Compiled on Oct 17 2024 21:58:43
Current date and time: Tue Oct 29 17:33:15 2024
CPU: AuthenticAMD AMD Ryzen 9 5950X 16-Core Processor
SIMD instructions: AVX AVX2 FMA SSE4.1 SSE4.2 SSE4a
Logical CPU cores: 32
Thread number set to 16
Output will be filtered at 0.01 FDR
Precursor/protein x samples expression level matrices will be saved along with the main report
A spectral library will be generated
Cysteine carbamidomethylation enabled as a fixed modification
A spectral library will be created from the DIA runs and used to reanalyse them; .quant files will only be saved to disk during the first step
Heuristic protein grouping will be used, to reduce the number of protein groups obtained; this mode is recommended for benchmarking protein ID numbers, GO/pathway and system-scale analyses
The spectral library (if generated) will retain the original spectra but will include empirically-aligned RTs
DIA-NN will optimise the mass accuracy automatically using the first run in the experiment. This is useful primarily for quick initial analyses, when it is not yet known which mass accuracy setting works best for a particular acquisition scheme.
WARNING: protein inference is enabled but no FASTA provided - is this intended?

6 files will be processed
[0:00] Loading spectral library Z:\AD_MS-DATA\Ko\MS_FASTA\241025_human_ecoli_yeast_diann.speclib
WARNING: only in silico predicted libraries should be loaded in the .speclib format; in all other cases use the original .parquet or .tsv library
[0:31] Library annotated with sequence database(s): Z:\AD_MS-DATA\Ko\MS_FASTA\241025_human_ecoli_yeast.fasta
[0:31] Gene names missing for some isoforms
[0:31] Library contains 31069 proteins, and 30100 genes
[0:31] Spectral library loaded: 31069 protein isoforms, 41511 protein groups and 5958904 precursors in 1855970 elution groups.
[0:32] Initialising library

First pass: generating a spectral library from DIA data

[0:42] File #1/6
[0:42] Loading run Z:\AD_MS-DATA\Ko\6_DIA_bench\QE_HF-X_PXD028735\LFQ_Orbitrap_AIF_Condition_A_Sample_Alpha_01.raw
[2:03] 3646314 library precursors are potentially detectable
[2:04] Calibrating with mass accuracies 30 (MS1), 20 (MS2)
[2:35] RT window set to 6.06197
[2:35] Peak width: 6.124
[2:35] Scan window radius set to 13
[2:35] Recommended MS1 mass accuracy setting: 9.23086 ppm
[3:37] Optimised mass accuracy: 17.3574 ppm
[5:42] Removing low confidence identifications
[5:43] Removing interfering precursors
[5:48] Training neural networks on 196445 PSMs
[5:52] Number of IDs at 0.01 FDR: 74530
[5:53] Calculating protein q-values
[5:53] Number of genes identified at 1% FDR: 8846 (precursor-level), 7927 (protein-level) (inference performed using proteotypic peptides only)
[5:53] Quantification
[6:04] Quantification information saved to Z:\AD_MS-DATA\Ko\6_DIA_bench\QE_HF-X_PXD028735\LFQ_Orbitrap_AIF_Condition_A_Sample_Alpha_01.raw.quant

[6:04] File #2/6
[6:04] Loading run Z:\AD_MS-DATA\Ko\6_DIA_bench\QE_HF-X_PXD028735\LFQ_Orbitrap_AIF_Condition_A_Sample_Alpha_02.raw

DIA-NN exited
DIA-NN-plotter.exe "Z:\AD_MS-DATA\Ko\6_DIA_bench\QE_HF-X_PXD028735\DIA-NN\diann.stats.tsv" "Z:\AD_MS-DATA\Ko\6_DIA_bench\QE_HF-X_PXD028735\DIA-NN\diann.tsv" "Z:\AD_MS-DATA\Ko\6_DIA_bench\QE_HF-X_PXD028735\DIA-NN\diann.pdf"
PDF report will be generated in the background

diann.exe --f "Z:\AD_MS-DATA\Ko\6_DIA_bench\QE_HF-X_PXD028735\LFQ_Orbitrap_AIF_Condition_A_Sample_Alpha_01.raw " --f "Z:\AD_MS-DATA\Ko\6_DIA_bench\QE_HF-X_PXD028735\LFQ_Orbitrap_AIF_Condition_A_Sample_Alpha_02.raw " --f "Z:\AD_MS-DATA\Ko\6_DIA_bench\QE_HF-X_PXD028735\LFQ_Orbitrap_AIF_Condition_A_Sample_Alpha_03.raw " --f "Z:\AD_MS-DATA\Ko\6_DIA_bench\QE_HF-X_PXD028735\LFQ_Orbitrap_AIF_Condition_B_Sample_Alpha_01.raw " --f "Z:\AD_MS-DATA\Ko\6_DIA_bench\QE_HF-X_PXD028735\LFQ_Orbitrap_AIF_Condition_B_Sample_Alpha_02.raw " --f "Z:\AD_MS-DATA\Ko\6_DIA_bench\QE_HF-X_PXD028735\LFQ_Orbitrap_AIF_Condition_B_Sample_Alpha_03.raw " --convert --threads 16 --verbose 1
DIA-NN 1.9.2 (Data-Independent Acquisition by Neural Networks)
Compiled on Oct 17 2024 21:58:43
Current date and time: Tue Oct 29 17:45:17 2024
CPU: AuthenticAMD AMD Ryzen 9 5950X 16-Core Processor
SIMD instructions: AVX AVX2 FMA SSE4.1 SSE4.2 SSE4a
Logical CPU cores: 32
MS data files will be converted to .dia format
Thread number set to 16
WARNING: protein inference is enabled but no FASTA provided - is this intended?

6 files will be processed
[0:00] Loading run Z:\AD_MS-DATA\Ko\6_DIA_bench\QE_HF-X_PXD028735\LFQ_Orbitrap_AIF_Condition_A_Sample_Alpha_01.raw
[2:24] Loading run Z:\AD_MS-DATA\Ko\6_DIA_bench\QE_HF-X_PXD028735\LFQ_Orbitrap_AIF_Condition_A_Sample_Alpha_02.raw

DIA-NN exited
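Both logs above end right after a "Loading run" line for the same file, which points at that file rather than at the analysis settings. As a rough sketch, the check can be automated: find the last timestamped message in a log and report the run path if that message is a "Loading run" entry. This assumes the saved log has one message per line with a `[m:ss]` prefix, as in the output above. The function name is hypothetical and not part of DIA-NN.

```python
import re
from typing import Optional

# Matches timestamped log messages such as "[2:24] Loading run Z:\...".
TIMESTAMPED = re.compile(r"^\[\d+:\d+\] (.*)$")
LOADING_PREFIX = "Loading run "


def run_loading_at_exit(log_text: str) -> Optional[str]:
    """Return the path of the run being loaded when the log ends, if any.

    Returns None when the last timestamped message is not a 'Loading run'
    line, i.e. the log did not stop while a file was being opened.
    """
    last_message = None
    for line in log_text.splitlines():
        match = TIMESTAMPED.match(line.strip())
        if match:
            last_message = match.group(1)
    if last_message is not None and last_message.startswith(LOADING_PREFIX):
        return last_message[len(LOADING_PREFIX):].strip()
    return None
```

If the same path comes back for different command lines (here, both the analysis and the `--convert` run), the file itself is the first thing to check.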

vdemichev commented 3 weeks ago

Could you please upload LFQ_Orbitrap_AIF_Condition_A_Sample_Alpha_02.raw somewhere so that I can take a look?

Kokoushin commented 3 weeks ago

Re-copying the file solved the problem. Thank you so much.
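The thread does not say what was wrong with the first copy. One plausible explanation, and it is only an assumption, is that the file was truncated or corrupted during transfer to the network drive. A minimal sketch for checking a copy against its source by comparing SHA-256 digests follows; the function names are hypothetical.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large .raw files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original_path: str, copy_path: str) -> bool:
    """True if both files have identical content (same SHA-256 digest)."""
    return sha256_of(original_path) == sha256_of(copy_path)
```

If `copies_match` returns False for a raw file and its source, copying it again before rerunning the analysis is the obvious next step.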

Best, Ko