vdemichev / DiaNN

DIA-NN - a universal automated software suite for DIA proteomics data analysis.

Questions about parameter setting as well as quantification #853

Open hahahahhhhahaha opened 12 months ago

hahahahhhhahaha commented 12 months ago

Dear Vadim,

Thanks for such great software! I have run into a few questions while using it.

  1. I used the FASTA digest and deep-learning options to generate a .predicted.speclib file from a UniProt FASTA. When I load this .predicted.speclib file to analyze the raw data, do I still need to add the FASTA file, and do I still need to select "generate spectral library"?
  2. I have a FASTA file whose format differs from the UniProt format, which makes the protein and gene columns in the report look very strange. I looked through the other answers, and one suggestion is to select "Isoform IDs", but the columns are still abnormal. In addition, when do I need to use "reannotate"?
  3. For the FASTA files above, "smart profiling" is the recommended setting. The general human library from SWATH Atlas, however, is a high-quality, specific library; does it need "IDs profiling" instead?
  4. Why do report.tsv and pg_matrix.tsv contain different numbers of proteins? In R, I filtered report.tsv with Proteotypic == 1 & Q.Value < 0.01 and counted the entries in the Protein.Group column (I could not use the Gene column because it was blank or abnormal), but the count was only about the same as the number of proteins identified in stats.tsv. In what case should pg_matrix.tsv be used, and how can I make the protein counts in the two reports agree?

Here is my log: diann.exe --f E:\WXY\10ng SWATH FASTA change\91452_20201215_10ng_SWATH_HumanAltas_01_MSPLITfiltered.wiff --f E:\WXY\10ng SWATH FASTA change\91454_20201215_10ng_SWATH_HumanAltas_02_MSPLITfiltered.wiff --f E:\WXY\10ng SWATH FASTA change\91456_20201215_10ng_SWATH_HumanAltas_03_MSPLITfiltered.wiff --lib E:\WXY\10ng SWATH FASTA change\library\report-lib.predicted.speclib --threads 4 --verbose 3 --out E:\WXY\10ng SWATH FASTA change\report.tsv --qvalue 0.01 --matrices --out-lib E:\WXY\10ng SWATH FASTA change\report-lib.tsv --gen-spec-lib --fasta E:\WXY\10ng SWATCH\HEK293_RefV57_cRAPgeneA_20130129.fasta --met-excision --cut K,R,!*P --window 1 --mass-acc 10 --mass-acc-ms1 5 --reanalyse --relaxed-prot-inf --rt-profiling --pg-level 0 --peak-center --no-ifs-removal --peak-translation --original-mods --report-lib-info --ms1-isotope-quant --ms1-subtract 2

Best regards,

hah

vdemichev commented 11 months ago

Hi hah,

  1. Add the FASTA, select 'MBR', and do not uncheck any of the other tickboxes that get selected automatically.
  2. Yes, select Isoform IDs and rely on isoform IDs only for downstream analysis, or pull protein and gene names from the FASTA after the DIA-NN analysis using an R or Python package for reading FASTA files.
  3. IDs, RTs and IMs profiling is the recommended mode. MBR is recommended for anything except highly specific libraries; the SWATH Atlas library is not highly specific, so please do use MBR.
  4. Please see https://github.com/vdemichev/DiaNN#output
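
For point 2, pulling names from the FASTA after the analysis could look roughly like the sketch below. It uses only the Python standard library and makes several assumptions that are not confirmed anywhere in this thread: that the IDs reported with Isoform IDs match the first whitespace-delimited token of each FASTA header, that multiple IDs in a protein group are separated by `;`, and that the remainder of the header line is a usable description. The function names and the two-entry FASTA are hypothetical.

```python
import io


def read_fasta_headers(handle):
    """Map each sequence ID (first token after '>') to the rest of its header line."""
    id_to_description = {}
    for line in handle:
        if not line.startswith(">"):
            continue  # sequence lines are not needed for annotation
        parts = line[1:].strip().split(None, 1)
        if not parts:
            continue  # skip an empty header
        seq_id = parts[0]
        description = parts[1] if len(parts) > 1 else ""
        id_to_description[seq_id] = description
    return id_to_description


def annotate_protein_group(protein_group, id_to_description):
    """Look up every ID of a ';'-separated group; unknown IDs map to ''."""
    return [id_to_description.get(pid, "") for pid in protein_group.split(";")]


# Hypothetical FASTA content, standing in for a non-UniProt file.
example_fasta = io.StringIO(
    ">P1 Example kinase GN=EXK1\nMKT\nAAA\n"
    ">P2 Example phosphatase GN=EXP2\nMSS\n"
)
headers = read_fasta_headers(example_fasta)
names = annotate_protein_group("P1;P2", headers)
```

For a real file, `open("my.fasta")` would replace the in-memory `example_fasta`, and the header parsing would need adjusting to whatever layout that FASTA actually uses.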

Best, Vadim
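
On question 4, one way to see why the two counts can differ is to count protein groups in the main report under different q-value filters. The sketch below is only illustrative: it assumes the report has columns named `Protein.Group`, `Proteotypic`, `Q.Value`, `Global.Q.Value` and `Global.PG.Q.Value`, and it assumes that the matrices are filtered with global q-values in addition to the run-specific one. Both assumptions should be checked against the output documentation linked above. The in-memory table is made up.

```python
import csv
import io


def count_protein_groups(tsv_text, q_columns, threshold=0.01, proteotypic_only=False):
    """Count distinct Protein.Group values whose rows pass every q-value column."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    groups = set()
    for row in reader:
        if proteotypic_only and int(row["Proteotypic"]) != 1:
            continue
        if all(float(row[col]) <= threshold for col in q_columns):
            groups.add(row["Protein.Group"])
    return len(groups)


# Made-up report rows; the column names are assumptions, see the text above.
example_report = (
    "Protein.Group\tProteotypic\tQ.Value\tGlobal.Q.Value\tGlobal.PG.Q.Value\n"
    "A\t1\t0.001\t0.001\t0.001\n"
    "B\t1\t0.005\t0.02\t0.005\n"
    "C\t0\t0.002\t0.002\t0.003\n"
    "D\t1\t0.004\t0.004\t0.05\n"
)

# Filter similar to the one described in the question: proteotypic, run-specific q-value.
run_specific = count_protein_groups(example_report, ["Q.Value"], proteotypic_only=True)

# Stricter filter that also requires the assumed global q-value columns to pass.
with_global = count_protein_groups(
    example_report, ["Q.Value", "Global.Q.Value", "Global.PG.Q.Value"]
)
```

In this toy table the first filter keeps groups A, B and D, while the second keeps A and C, so the two counts disagree even though both use a 1% threshold.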