vdemichev / DiaNN

DIA-NN - a universal automated software suite for DIA proteomics data analysis.

IP experiment with DIA quantification #1136

Open wangrui85 opened 3 months ago

wangrui85 commented 3 months ago

Dear Vadim,

Thank you for the great help with our DIA analysis workflow! I have some questions about my AP-MS analysis:

1) If I need non-normalised quantification, could I replace "Precursor.Normalised" with "Precursor.Quantity"?

protein.groups_Nonormalised <- diann_maxlfq(df[df$Q.Value <= 0.01 & df$PG.Q.Value <= 0.01,], group.header="Protein.Group", id.header = "Precursor.Id", quantity.header = "Precursor.Quantity")

I found in issue #1056 that "diann_maxlfq implements a simple MaxLFQ algorithm different from what DIA-NN uses internally", and indeed I got a different "protein.groups" result compared with "report.pg_matrix". But how could I use it for further pg quantification without normalisation?

2) I found that in a previous issue you suggest MBR and normalisation in searching. Are they compatible with an enrichment experiment? Or do I need to turn one of them off for an enriched proteome?

3) The note about imputation says "when a protein is completely absent in some of the biological conditions), we prefer to perform it on the protein level". So how could I process the NA values? Calculate the average or median among valid values? Or, as with LFQ, imputation followed by a filter on valid values?

Sincerely, Zoe

vdemichev commented 3 months ago

Hi Zoe,

could I replace "Precursor.Normalised" with "Precursor.Quantity"

Yes.

But how could I use it for further pg quantification without normalisation?

As you've indicated above or by disabling normalisation in the DIA-NN GUI. For AP-MS disabling normalisation makes sense.

MBR and normalize in searching

Not sure what you are referring to. For AP-MS, it definitely makes sense to (i) use MBR, (ii) disable normalisation in DIA-NN GUI.

So how could I process the NA values?

In AP-MS you probably don't want to impute at all. But if you need to, because you'd like to use some downstream processing that requires complete profiles, minimal-value imputation on the protein level makes sense.

Best, Vadim
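
The minimal-value imputation on the protein level mentioned in the reply above could be sketched roughly as follows in base R. This is only an illustration under stated assumptions: the matrix `pg`, its values, and the helper `impute_min` are hypothetical, and the thread does not say whether the minimum should be taken over the whole matrix or per run. The sketch uses the global minimum.

```r
# Toy protein-group matrix: rows = protein groups, columns = runs.
# All names and values are made up for illustration.
pg <- matrix(
  c(1200, 1500,   NA,   NA,
     800,   NA,  900,  850,
      NA,   NA,   NA,  400),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("P1", "P2", "P3"),
                  c("bait_1", "bait_2", "ctrl_1", "ctrl_2"))
)

# Hypothetical helper: replace every NA with the smallest observed value.
# Assumption: one global minimum; a per-run minimum is another option.
impute_min <- function(m) {
  m[is.na(m)] <- min(m, na.rm = TRUE)
  m
}

pg_imputed <- impute_min(pg)
print(pg_imputed)
```

As the reply notes, for AP-MS it may be preferable not to impute at all, so a step like this would only be needed when downstream processing requires complete profiles.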

wangrui85 commented 3 months ago

Vadim, Thanks so much for your great support!

Does it mean that the MaxLFQ values from “pg_matrix.tsv” could be used directly when normalisation is disabled? The step from report.tsv to pg_matrix is a little confusing to me. If necessary, should I use “iq” in R to process these precursors for the pg quantification result? With your DIA-NN R package, is it really possible to get the same pg_matrix?

Thanks again

Zoe


vdemichev commented 3 months ago

Hi Zoe,

Does it mean that the MaxLFQ values from “pg_matrix.tsv” could be used directly when normalisation is disabled?

Yes, although you might want to add --matrix-spec-q 0.01 in this case to Additional options, if you want to use the pg_matrix.

To reproduce pg_matrix from the main report you need to apply filtering as described in https://github.com/vdemichev/DiaNN?tab=readme-ov-file#output and then transform the dataframe from long to wide format (e.g. using diann_matrix).

Best, Vadim
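
The "filter, then go from long to wide" step described in the reply above might look roughly like this in base R. It is a sketch on a made-up toy table, not DIA-NN's own procedure: the q-value columns and the 0.01 thresholds used here are assumptions (the README section linked above lists the filters that actually apply), and `diann_matrix` from the R package is the function the reply suggests for real data.

```r
# Toy long-format report; a real DIA-NN report has many more columns.
# All values are invented for illustration.
report <- data.frame(
  Run           = c("bait_1", "bait_1", "ctrl_1", "ctrl_1", "bait_1"),
  Protein.Group = c("P1", "P1", "P1", "P2", "P2"),
  Precursor.Id  = c("AAK2", "CCR2", "AAK2", "DDK2", "DDK2"),
  Q.Value       = c(0.001, 0.005, 0.002, 0.03, 0.004),
  PG.Q.Value    = c(0.001, 0.001, 0.002, 0.002, 0.003),
  PG.MaxLFQ     = c(1000, 1000, 200, 50, 700),
  stringsAsFactors = FALSE
)

# Assumed filter: 1% on precursor and protein-group q-values.
keep <- report$Q.Value <= 0.01 & report$PG.Q.Value <= 0.01
filtered <- report[keep, ]

# PG.MaxLFQ is repeated for every precursor of a protein group in a run,
# so keep one row per (protein group, run) pair.
pg_long <- unique(filtered[, c("Protein.Group", "Run", "PG.MaxLFQ")])

# Long to wide: protein groups in rows, runs in columns, NA where absent.
pg_wide <- tapply(pg_long$PG.MaxLFQ,
                  list(pg_long$Protein.Group, pg_long$Run),
                  function(x) x[1])
print(pg_wide)
```

In this toy example the only ctrl_1 precursor of P2 fails the precursor q-value filter, so P2 ends up as NA in ctrl_1.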

zoe1985 commented 2 months ago


Hi,Vadim,

I can read "report.parquet" in R (TAD can also open it), but process_long_format always fails:

df <- read_parquet("report.parquet")  # ok

process_long_format("report.parquet",
                    output_filename = "report-pg-global.tsv",
                    sample_id = "Run",
                    primary_id = "Protein.Group",
                    secondary_id = "Precursor.Id",
                    intensity_col = "Precursor.Quantity",
                    annotation_col = c("Protein.Names", "Genes"),
                    filter_double_less = c("Q.Value" = "0.01", "Lib.PG.Q.Value" = "0.01"))  # failed

Sincerely,

Rui