cbielow / PTXQC

A Quality Control (QC) pipeline for Proteomics (PTX) results generated by MaxQuant

New metric implementation - visualizing iRTs peptides #83

Open MiguelCos opened 4 years ago

MiguelCos commented 4 years ago

Hello,

We use iRT peptides in almost all our sample preps as a quality control for both our chromatography and identifications.

It would be interesting for us to be able to include a couple of plots in the PTXQC report showing iRT peptide intensity over retention time (RT).

I would love to make this a contribution to the package, but I am not experienced in package development, and I wouldn't know where to start or how to add a modification like this to the existing pipeline. I would really appreciate any guidance on this matter if possible.

Best wishes, Miguel

cbielow commented 4 years ago

Hi Miguel,

this sounds like a very useful metric to have. I'd be happy to help you with integrating it.

The rough way to go about it would be to:

1) Implement the metric (i.e. the R5 Reference class which gets some data as input and produces a plot). There are 20-odd such metrics available. The data to do this with is probably in the evidence.txt (in MaxQuant terms; for mzTab it will be converted into the same data structure). Just copy-paste from the closest Evidence metric (in spirit), probably the Peptide Intensity Metric, see https://github.com/cbielow/PTXQC/blob/master/R/qcMetric_EVD.R#L186. You'll need to work with different columns, probably "fc.raw.file" (name of Raw file), "modified.sequence" (or just "sequence") and "intensity", but they are all present in the data.frame already.

2) Call the metric, see https://github.com/cbielow/PTXQC/blob/master/R/createReport.R#L428 for an example. The identifier which is used there ("qcMetric_EVD_PeptideInt") is identical to the one specified in step 1) above (just make something up for your new metric).
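As a rough illustration of the data handling in step 1), here is a minimal sketch of pulling the iRT peptide rows out of an evidence-like table. It is written in plain Python rather than R, and it is not PTXQC code: the function name `irt_intensities`, the placeholder sequences, and the choice of keeping the highest intensity per peptide and Raw file are all assumptions made for the example. Only the column names ("fc.raw.file", "sequence", "intensity") come from the discussion above.

```python
from collections import defaultdict

# Hypothetical placeholder sequences; replace with the real iRT peptide sequences.
IRT_SEQUENCES = {"PEPTIDEA", "PEPTIDEB", "PEPTIDEC"}


def irt_intensities(evidence_rows, irt_sequences=IRT_SEQUENCES):
    """Return {raw_file: {sequence: highest intensity}} for iRT peptides only.

    Rows for other peptides, and rows without an intensity, are skipped.
    """
    result = defaultdict(dict)
    for row in evidence_rows:
        seq = row["sequence"]
        intensity = row["intensity"]
        if seq not in irt_sequences or intensity is None:
            continue
        per_file = result[row["fc.raw.file"]]
        # Assumption: if a peptide has several evidence rows, keep the maximum.
        per_file[seq] = max(per_file.get(seq, 0.0), intensity)
    return dict(result)


# Made-up rows standing in for a few lines of evidence.txt.
rows = [
    {"fc.raw.file": "file 01", "sequence": "PEPTIDEA", "intensity": 2.0e6},
    {"fc.raw.file": "file 01", "sequence": "PEPTIDEA", "intensity": 5.0e6},
    {"fc.raw.file": "file 01", "sequence": "OTHERSEQ", "intensity": 9.0e7},
    {"fc.raw.file": "file 02", "sequence": "PEPTIDEB", "intensity": None},
    {"fc.raw.file": "file 02", "sequence": "PEPTIDEC", "intensity": 1.0e5},
]
table = irt_intensities(rows)
```

In the actual package the same filtering would presumably be done on the evidence data.frame inside the metric class; the sketch only shows the shape of the result a plot could be built from.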

Now, the more interesting question is what exactly you want to plot, and (if applicable) use as scoring function for the heatmap. You could plot the raw intensities of your 11(?) iRT peptides for each Raw file, or use ratios (using the most abundant Raw file as reference), or scale the most abundant iRT peptide to 1 and give the intensities of the remaining ones relative to it, etc. I have not seen enough data to make a good decision here.

The sequences of the iRT peptides probably need to be provided either via the YAML config file (by putting the sequences in there directly) or via an "iRT_peptides.list" file (or something along those lines) in the input directory. For the moment, you could also hardcode them within the metric.

Usually, it's preferable to avoid user parameters whenever possible for scoring. So for example, a parameter-free solution would be to score the number of observed iRT peptides per Raw file. Using a single parameter, one could require a minimum intensity (which will need to be adapted by the user depending on instrument and setup). If you use a default of 0, that would be equivalent to the first solution, but adds a bit of flexibility... But maybe you want to score something completely different??
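To make the scoring idea concrete, here is a minimal sketch of counting observed iRT peptides per Raw file, with an optional minimum intensity. It is plain Python, not PTXQC code. The function name `score_irt_coverage`, the placeholder sequences, the example intensities, and the normalization to a score between 0 and 1 are assumptions made for illustration.

```python
def score_irt_coverage(intensities, irt_sequences, min_intensity=0.0):
    """Score each Raw file by the fraction of iRT peptides observed.

    intensities: {raw_file: {sequence: intensity}}
    A peptide counts as observed if it is present for that Raw file and its
    intensity is at least min_intensity. With the default of 0, every
    detected iRT peptide counts, i.e. the parameter-free variant.
    """
    n_expected = len(irt_sequences)
    if n_expected == 0:
        raise ValueError("need at least one iRT sequence")
    scores = {}
    for raw_file, per_peptide in intensities.items():
        n_observed = sum(
            1
            for seq in irt_sequences
            if seq in per_peptide and per_peptide[seq] >= min_intensity
        )
        scores[raw_file] = n_observed / n_expected
    return scores


# Hypothetical placeholder sequences and made-up intensities.
irt = ["PEPTIDEA", "PEPTIDEB", "PEPTIDEC", "PEPTIDED"]
observed = {
    "file 01": {"PEPTIDEA": 5.0e6, "PEPTIDEB": 2.0e4, "PEPTIDEC": 1.0e6, "PEPTIDED": 3.0e5},
    "file 02": {"PEPTIDEA": 1.0e5, "PEPTIDEC": 4.0e3},
}

scores_any = score_irt_coverage(observed, irt)
# scores_any == {"file 01": 1.0, "file 02": 0.5}

scores_strict = score_irt_coverage(observed, irt, min_intensity=1.0e5)
# scores_strict == {"file 01": 0.75, "file 02": 0.25}
```

The threshold is the only tunable here, which keeps the default behavior independent of instrument and setup while still allowing a stricter check.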

my 50cents...

cheers Chris

MiguelCos commented 4 years ago

Hello Chris,

Many thanks for your answer and for your guidance. It will definitely make it easier for me to navigate the package structure and to know where to begin.

The main idea that we have is to just plot the measured RT against the standard iRT for every peptide standard in the sample and maybe include a fitted line with an R^2 measure. The values of the peptide standards would always be the same, so as you suggest, I would just hardcode them in the plotting function initially. That should be easy to implement with ggplot2. The tricky part for me would be to make this configurable through the YAML file, but I guess I could think about that later on.
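The fitted line and R^2 can be sketched independently of the plotting. Below is a minimal ordinary least-squares fit of measured RT against reference iRT, in plain Python rather than R, using $R^2 = 1 - SS_{res}/SS_{tot}$. The function name `linear_fit` and all numbers are made up for illustration; they are not real iRT values or real measurements.

```python
def linear_fit(x, y):
    """Least-squares line y = intercept + slope * x, plus R^2.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y need the same length, with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1.0 - ss_res / syy
    return slope, intercept, r_squared


# Illustrative (made-up) reference iRT values and measured RTs in minutes.
reference_irt = [0.0, 25.0, 50.0, 100.0]
measured_rt_exact = [10.0, 22.5, 35.0, 60.0]  # exactly 10 + 0.5 * iRT
measured_rt_noisy = [10.0, 24.0, 34.0, 60.0]  # small deviations from the line

slope, intercept, r2 = linear_fit(reference_irt, measured_rt_exact)
slope_n, intercept_n, r2_noisy = linear_fit(reference_irt, measured_rt_noisy)
```

For the exact data the fit recovers a slope of 0.5 and an intercept of 10 with R^2 of 1 (up to floating-point rounding); for the noisy data R^2 drops slightly below 1. In R, the same quantities would typically come from `lm()` and `summary()`.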

I already forked the project and started looking at it. I'll be back here asking questions as I make some progress.

Thanks again and best wishes, Miguel

svalvaro commented 3 years ago

Hello @MiguelCos,

I have developed a package that complements PTXQC (there are so many possible metrics that even more packages are needed to cover the whole of quality control in proteomics). It is called MQmetrics. Prompted by this issue, I decided to implement the two plots that you mentioned, since we also use iRT peptides in our lab.

You can find the package at MQmetrics.

I hope it helps!

Cheers,

Alvaro

MiguelCos commented 3 years ago

Hello @svalvaro

This looks amazing. Many thanks for sharing. We will definitely give it a try at some point.

Best wishes, Miguel

svalvaro commented 3 years ago

Hello @MiguelCos

Thanks, and you're welcome. Any feedback is highly appreciated :)