Open misak-acrivon opened 1 month ago
Hi Marc,
You can use the 0_9 and 0_99 phosphosite matrices, which summarise things in a convenient format.
Best, Vadim
Unfortunately, I would like to derive these tables for localisation scores >= 0.75. So I guess I need to start from the parquet DIA-NN report?
I would like to derive these tables for localisation scores >= 0.75
The scores for the tables produced by DIA-NN are 0.9 and 0.99 respectively
So I guess I need to start from the parquet DIA-NN report?
You can, if you would like to do some custom filtering. Could you please elaborate on what is unclear about the .parquet report contents? There is a column identifying the sites and a column with their localisation scores; Peptidoform.Q.Value and/or Lib.Peptidoform.Q.Value are also good to use for filtering. That's basically it, there is nothing sophisticated about the tables DIA-NN produces :)
These columns are only included in the report if phospho is declared via --var-mod and peptidoform scoring is enabled (the GUI does this by default when the phospho option is ticked).
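As a rough illustration of the filtering described above, here is a minimal sketch in base R. It runs on a small made-up table rather than a real report; the column names PTM.Site.Confidence, Peptidoform.Q.Value and Lib.Peptidoform.Q.Value are the ones mentioned in this thread, while the Run and Modified.Sequence values, the file name in the comment, and the 0.01 q-value cutoff are assumptions for illustration only. It is not a reproduction of what DIA-NN does internally.

```r
# Toy stand-in for the main report. In practice the table could be loaded
# with arrow::read_parquet("report.parquet") (placeholder file name).
report <- data.frame(
  Run = c("run1", "run1", "run2", "run2"),
  Modified.Sequence = c("AAS(UniMod:21)PEK", "GT(UniMod:21)LLR",
                        "AAS(UniMod:21)PEK", "GT(UniMod:21)LLR"),
  PTM.Site.Confidence = c(0.95, 0.60, 0.80, 0.99),
  Peptidoform.Q.Value = c(0.001, 0.002, 0.004, 0.050),
  Lib.Peptidoform.Q.Value = c(0.001, 0.001, 0.003, 0.002),
  stringsAsFactors = FALSE
)

min_confidence <- 0.75  # localisation cutoff asked about in this thread
max_q <- 0.01           # assumed q-value cutoff, adjust to your needs

# Keep rows that pass the localisation cutoff and both q-value cutoffs.
keep <- report$PTM.Site.Confidence >= min_confidence &
  report$Peptidoform.Q.Value <= max_q &
  report$Lib.Peptidoform.Q.Value <= max_q

filtered <- report[keep, ]
print(filtered)
```

Keeping the cutoffs in named variables makes it easy to rerun the same filter at 0.9 or 0.99 and compare the result against the matrices DIA-NN writes out.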
Thanks Vadim,
the problem is that even if I filter based on these columns, I cannot recreate the phospho site tables given by DIA-NN at localization scores of 0.9 and 0.99 respectively. It means I am doing something wrong here, or that I am not accounting for some information when filtering.
If not too inconvenient, could you perhaps take a look at my small R-script (attached as .txt since .R is not a supported file type) that runs the filtering and spot where things might go wrong? I suppose it happens under the section 'Extract phospho sites passing filtering criteria...'
Thank you in advance!
Marc
I cannot recreate the phospho site tables given by DIA-NN
The exact filters are:
But I suggest not trying to reproduce those matrices. How best to quantify phosphosites (and with what filtering) is an open question; there is no definitive answer in the literature. What you come up with by trying different filters on a specific experiment may well end up better than what DIA-NN does by default when generating those matrices.
Best, Vadim
Thanks Vadim!
I am definitely getting closer after you specified the new filters, but the matrices are still not the same. I agree with you that tweaking the filters could perhaps generate better reports than what DIA-NN does for specific experiments. However, it would be nice to have baseline settings that agree with the DIA-NN output as a sanity check before tweaking anything.
I know that I am being annoying right now, but would you be able to look at my updated R-script? Just to quickly check that I am not doing anything wrong now that the new filters have been added.
No worries :)
siteConfidence = diannReport$PTM.Site.Confidence >= 0.9
DIA-NN looks at the confidence of individual sites in the next column, whereas PTM.Site.Confidence is the 'worst site confidence'. So filtering like this will give you fewer hits.
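To make the difference concrete, here is a minimal sketch in base R on made-up data. The table layout (one row per precursor and site) and the column names Precursor, Site and Site.Conf are hypothetical and do not correspond to actual DIA-NN report columns; the point is only that filtering on the worst site confidence per precursor discards well-localised sites that share a precursor with a poorly localised one.

```r
# Hypothetical long table: one row per (precursor, site).
# Site.Conf is a made-up name for a per-site localisation confidence.
sites <- data.frame(
  Precursor = c("P1", "P1", "P2", "P3", "P3"),
  Site = c("S10", "T15", "S42", "S7", "Y9"),
  Site.Conf = c(0.95, 0.60, 0.92, 0.99, 0.91),
  stringsAsFactors = FALSE
)

cutoff <- 0.9

# Worst (minimum) site confidence within each precursor.
sites$Worst.Conf <- ave(sites$Site.Conf, sites$Precursor, FUN = min)

# Filtering on the worst confidence drops every site of precursor P1.
by_worst <- sites[sites$Worst.Conf >= cutoff, ]

# Filtering per site keeps S10 on P1 and drops only T15.
by_site <- sites[sites$Site.Conf >= cutoff, ]

print(by_worst$Site)
print(by_site$Site)
```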
Hi Vadim,
I have now tried the R-script that I shared with you earlier on another dataset, but I am getting very different results compared to the phosphosite tables that DIA-NN outputs automatically. So there must still be something that I am doing vastly differently from DIA-NN when creating this table. I understand that it could take you a lot of time to dig into what I might be doing wrong in the shared script.
It would be great if the 'diann-rpackage' that you developed some time ago could have a function to derive any p-site table from the parquet report similar to the ones automatically created by DIA-NN. Would this be an interesting feature to add?
Thanks for all the help so far.
Hi Marc,
In general we plan to overhaul the R package, but this will not happen within the next two months. As for the vastly different results: this means that either the DIA-NN code or the script you use does not correspond to the specification here https://github.com/vdemichev/DiaNN/issues/1174#issuecomment-2360303400. So if there is any discrepancy whatsoever, one or the other is not conforming to the specification. This can easily be checked manually (just for a single peptide ID in a single run). If it is DIA-NN that produces a different result, I can take a look at why exactly this happens.
Best, Vadim
Hi Vadim,
I am curious to know how to derive the phosphosite tables from the main DIA-NN report. Which columns would I need, and which filters would I need to apply to them, to derive the tables? I realize that I would also need to reshape the report from long to wide format after applying these filters.
Thank you very much in advance!
Marc
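For the long-to-wide step mentioned in the question, a minimal base R sketch on made-up data could look like the following. The Site identifiers, the Intensity column and the choice to take the maximum when several rows map to the same site in the same run are all assumptions for illustration; the thread does not state how DIA-NN aggregates such rows.

```r
# Hypothetical filtered long table: one row per run and site observation.
long <- data.frame(
  Run = c("run1", "run1", "run2", "run2", "run2"),
  Site = c("PROT1_S10", "PROT2_T15", "PROT1_S10", "PROT1_S10", "PROT3_Y9"),
  Intensity = c(1000, 500, 1200, 900, 300),
  stringsAsFactors = FALSE
)

# Sites become rows, runs become columns. When several rows fall into the
# same cell, the maximum is kept (an assumed aggregation rule).
# Site/run combinations without any row are filled with NA.
wide <- tapply(long$Intensity, list(long$Site, long$Run), max)

print(wide)
```

The result is a numeric matrix with one row per site and one column per run, which can be compared cell by cell against a site matrix for a single peptide in a single run.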