vdemichev / DiaNN

DIA-NN - a universal automated software suite for DIA proteomics data analysis.

Visualization of all peptides and peakwidth batches #442

Open klannieha opened 2 years ago

klannieha commented 2 years ago

Hi, Is there a way to output all peptides with the chromatogram dataset?

Separately, I was looking at the peak width (RT.Stop - RT.Start) in my data, using the Any LC (high accuracy) algorithm for quantification, and there seem to be some batch effects in the peak extraction (see the attached png). I also tried the high precision algorithms, but the batch phenomenon persists. Is there any further command-line setting that has to be specified?

Let me know! Thanks!

Annie

[Attached image: 20220718_fragpipe_diann_AnyLCHighAcc_Peakwidth]
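For readers who want to reproduce the peak width comparison described above, here is a minimal sketch in Python. It assumes the main report is a tab-separated file containing the columns Run, RT.Start and RT.Stop; the function name `median_peak_width_per_run` and the example rows are hypothetical and only for illustration.

```python
import csv
import io
from collections import defaultdict
from statistics import median


def median_peak_width_per_run(report_file):
    """Return {run: median(RT.Stop - RT.Start)} from a tab-separated report.

    report_file is any open text file object (or StringIO) with a header row.
    Assumes the columns 'Run', 'RT.Start' and 'RT.Stop' are present.
    """
    widths = defaultdict(list)
    reader = csv.DictReader(report_file, delimiter="\t")
    for row in reader:
        try:
            run = row["Run"]
            width = float(row["RT.Stop"]) - float(row["RT.Start"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with missing or non-numeric values
        widths[run].append(width)
    return {run: median(values) for run, values in widths.items()}


# Tiny in-memory stand-in for a report file (made-up values, illustration only).
example = io.StringIO(
    "Run\tPrecursor.Id\tRT.Start\tRT.Stop\n"
    "run_A\tPEPTIDEK2\t10.00\t10.20\n"
    "run_A\tSAMPLER2\t15.10\t15.40\n"
    "run_B\tPEPTIDEK2\t10.05\t10.45\n"
)
summary = median_peak_width_per_run(example)
for run, width in sorted(summary.items()):
    print(f"{run}\t{width:.3f}")
```

For a real report, pass an open file handle instead of the StringIO object, e.g. `open("report.tsv", newline="")` (the file name is an assumption). Grouping the per-run medians by acquisition batch then shows whether the peak widths shift between batches.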

singjc commented 2 years ago

Dear Vadim,

I have a very similar question, about returning the XICs for all the queried peptide ions across all runs. I have used the --vis [N],peptide1,peptide2,... flag to extract the XICs, but it does not seem to save XICs for all peptides across all runs. I supply all the peptides (~14,000) from the initial spectral library to --vis, but it only saves ~2,200 of these peptides to the *.XIC.tsv file. It also does not seem to save XICs for some runs, presumably those where it doesn't identify a suitable peak. Is there any way to extract the XICs for all queried peptides in all runs, even if it doesn't find any high-quality peaks?

Best,

Justin

PS. @klannieha, I have a conversion script to convert DIA-NN's report and XIC tsv files to OpenMS's osw and sqMass file formats, if you still use TAPIR for XIC visualization.

vdemichev commented 2 years ago

Hi Annie,

We just use precursor-level batch correction. If that is not sufficient, you can include batch info as covariates in downstream statistical tests. Peak width is a very minor contributor to batch effects; at least in our experience, it's mostly the MS sensitivity.

Best, Vadim

vdemichev commented 2 years ago

Hi Justin,

Yes, --vis will only report identified precursors. You can, however, use an R or Python package to extract chromatograms directly from the mzML files.

Best, Vadim
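A minimal sketch of the direct extraction suggested above, in Python. This is not how DIA-NN extracts chromatograms; it simply sums the intensity within a ppm window around a target m/z in every spectrum, so a point is returned for each scan even when no peak is present. The function names `extract_xic` and `read_ms2_spectra`, the 20 ppm tolerance and the toy spectra are assumptions for illustration. The reader assumes the third-party pyteomics package is installed, that the mzML file stores isolation windows, and that peaks within each spectrum are sorted by m/z.

```python
from bisect import bisect_left, bisect_right


def extract_xic(spectra, target_mz, ppm_tol=20.0):
    """Sum intensity within +/- ppm_tol of target_mz in each spectrum.

    spectra: iterable of (rt, mz_list, intensity_list), mz_list sorted ascending.
    Returns a list of (rt, summed_intensity), one point per spectrum,
    including zero-intensity points where nothing matched.
    """
    delta = target_mz * ppm_tol * 1e-6
    lo, hi = target_mz - delta, target_mz + delta
    xic = []
    for rt, mzs, intensities in spectra:
        i = bisect_left(mzs, lo)   # first peak with m/z >= lo
        j = bisect_right(mzs, hi)  # one past the last peak with m/z <= hi
        xic.append((rt, float(sum(intensities[i:j]))))
    return xic


def read_ms2_spectra(mzml_path, precursor_mz):
    """Yield (rt, mzs, intensities) for MS2 scans whose isolation window
    covers precursor_mz. Requires pyteomics (imported lazily here)."""
    from pyteomics import mzml

    with mzml.read(mzml_path) as reader:
        for spec in reader:
            if spec.get("ms level") != 2:
                continue
            window = spec["precursorList"]["precursor"][0]["isolationWindow"]
            center = window["isolation window target m/z"]
            low = center - window["isolation window lower offset"]
            high = center + window["isolation window upper offset"]
            if not (low <= precursor_mz <= high):
                continue
            # The RT unit (minutes or seconds) depends on the file.
            rt = float(spec["scanList"]["scan"][0]["scan start time"])
            yield rt, spec["m/z array"].tolist(), spec["intensity array"].tolist()


# Synthetic spectra (made-up values) to show the extraction step on its own.
toy_spectra = [
    (10.0, [300.10, 500.25, 700.40], [5.0, 100.0, 7.0]),
    (10.1, [300.10, 500.26, 700.40], [5.0, 250.0, 7.0]),
    (10.2, [300.10, 700.40], [5.0, 7.0]),
]
toy_xic = extract_xic(toy_spectra, target_mz=500.255, ppm_tol=20.0)
for rt, intensity in toy_xic:
    print(f"{rt:.1f}\t{intensity:.1f}")
```

With a real file, a fragment chromatogram would be obtained with something like `extract_xic(read_ms2_spectra("run1.mzML", 650.3), 500.255)`, where the file name and both m/z values are placeholders.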

klannieha commented 2 years ago

Thanks for the reply, Vadim! It doesn't seem to be much of a concern, since the effect doesn't persist in the peak intensities. I have one more question about --vis: is it possible to convert the data point units to retention time during export?

vdemichev commented 2 years ago

Yes, --vis does save the RTs.