Closed tobiasko closed 4 years ago
Tobi, Fengchao and I are looking into this. We will let you know when we have an answer for what is happening. Thanks, Alexey
On Apr 22, 2020, at 5:08 AM, Tobias Kockmann wrote:
The raw files we analysed can be found at PXD010012 (https://www.ebi.ac.uk/pride/archive/projects/PXD010012).
Hi @prvst,
OK. Just let me know if you need anything (log files, code, ...), but you should be able to reproduce this without any special parameter choices.
Best, Tobi
Hi @anesvi , Hi @fcyu,
I got some feedback from Florian Meier. He pointed out that the preprint version of the paper contains a typo. The true ratio for E. coli should be 1:5, not 1:4 as I stated above. The typo in the legend of Fig. 5 was corrected in the final version. Here is the results section:
"To further benchmark the quantitative accuracy of our setup, we mixed tryptic digests from HeLa and Escherichia coli in 1:1 and 1:5 ratios and measured each sample in quintuplicate 120 min single runs. Overall, we quantified 6135 protein groups (5407 HeLa; 728 E. coli) with at least one valid value for both mixing ratios. Plotting the median fold-changes yielded two distinct clouds for HeLa and E. coli proteins, which were 4.3-fold separated in abundance, slightly less than the intended 5-fold mixing ratio (Fig. 5D). Both populations were narrow (σ(HeLa) = 0.44; σ(E. coli) = 0.77) relative to the expected fold-change and they had minimal overlap. Considering only the 5686 proteins with at least two valid values for each mixing ratio (5052 HeLa, 634 E. coli), a one-sided Student's t test returned 602 significantly changing E. coli proteins at a permutation-based FDR below 0.05. This represents an excellent sensitivity of ∼95% and only 64 human proteins (1.3%) were falsely classified as changing. From these results, we conclude that the combination of TIMS and PASEF provides precise and accurate label-free protein quantification at a high level of data completeness."
But they used the distance between the median value for Hs and the median value for E. coli instead of comparing to absolute expectations (analysing residuals). The reason is the nature of the two-species hybrid proteome, which most likely affects the assumptions used during normalisation. He pointed out that this is also visible in the MaxLFQ paper.
I saw similar effects for the FragPipe->MSstats dataset. The Hs peptides are also not exactly centered on zero. So maybe one needs to use only Hs peptides during normalization in MSstats to get more accurate FC estimates.
Cheers, Tobi
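To make the idea of normalising on Hs peptides only a bit more concrete, here is a minimal Python sketch. It is not how MSstats normalises internally; the function name `center_on_reference`, the species labels and the toy values are all assumptions made up for illustration. It simply shifts every log2 fold change by the median of the reference species, so that the Hs population is centred on zero.

```python
from statistics import median


def center_on_reference(log2fc, species, reference="Hs"):
    """Shift all log2 fold changes so the reference species' median is zero.

    Hypothetical helper for illustration; not part of MSstats or IonQuant.
    """
    ref_values = [fc for fc, sp in zip(log2fc, species) if sp == reference]
    if not ref_values:
        raise ValueError(f"no entries for reference species {reference!r}")
    offset = median(ref_values)
    return [fc - offset for fc in log2fc]


# Toy values, invented for illustration only (not taken from PXD010012).
log2fc = [-0.4, -0.5, -0.6, 1.5, 1.6, 1.7]
species = ["Hs", "Hs", "Hs", "Ecoli", "Ecoli", "Ecoli"]

centered = center_on_reference(log2fc, species)
hs_median = median(fc for fc, sp in zip(centered, species) if sp == "Hs")
ecoli_median = median(fc for fc, sp in zip(centered, species) if sp == "Ecoli")
print(f"Hs median: {hs_median:.2f}, E. coli median: {ecoli_median:.2f}")
```

Note that the distance between the two medians is unchanged by this shift. Only the absolute position moves, which is why a median-to-median comparison can look fine while the residuals against the expected values are offset.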
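An alternative dataset to assess accuracy might be PXD014777 (https://www.ebi.ac.uk/pride/archive/projects/PXD014777/private). It is an LFQbench-style triple hybrid proteome. Have you tried this one? It was used to benchmark MQ 1.6.6 (https://www.mcponline.org/content/early/2020/03/10/mcp.TIR119.001720).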
Yes, we looked at that dataset. We see some strange things too. We are currently trying to understand some weird behavior of intensities in these data. Thanks, Alexey
Hi @anesvi, Hi @fcyu ,
Meanwhile, I managed to download and analyse the triple hybrid data from PXD014777. This is what I see using a FragPipe->MSstats workflow followed by some R code for plotting the MSstats estimates (I haven't used anything else):
log2FC distribution by kernel density estimator
MA plot incl. LOESS fit
Volcano plot grouped by species
Best, Tobi
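For readers unfamiliar with MA plots, the coordinates can be sketched as follows. This is a Python illustration, not the R code used for the plots above; the function name `ma_values` and the example intensities are assumptions. The LOESS fit is left out, since it would need a dedicated smoothing routine.

```python
import math


def ma_values(intensity_a, intensity_b):
    """Return (A, M) for one feature measured in two conditions.

    A is the mean log2 intensity, M is the log2 ratio of b over a.
    Hypothetical helper for illustration only.
    """
    log_a = math.log2(intensity_a)
    log_b = math.log2(intensity_b)
    return (log_a + log_b) / 2, log_b - log_a


# Invented intensities: a 4-fold increase from condition a to condition b.
a, m = ma_values(1024.0, 4096.0)
print(f"A = {a}, M = {m}")
```

A systematic trend of M against A, which is what the LOESS fit is meant to reveal, would point to an intensity-dependent bias.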
Thanks Tobi,
We are also looking at this data, and might find the reasons and solutions. Will let you know when we have significant progress.
Best,
Fengchao
Tobi, we think there are issues with timsTOF intensities that are hard to normalize. Do you know of similar Thermo data? Alexey
Hi @anesvi,
Do you mean a triple hybrid proteome analyzed by LC-MS on an Orbitrap in DDA mode?
Best, Tobi
Yes
We (FGCZ) did this to cross compare the HF-X with the timsTOF Pro. Unfortunately, we never got the raw data from the Bruker demo lab. But the HF-X data should be in our LIMS system.
If you can pass the HF-X data to Fengchao, it would be helpful.
Hi @anesvi and @tobiasko, here is a recent publication that uses a dataset similar to the triple proteome, acquired on a Fusion instrument: https://www.mcponline.org/content/mcprot/early/2020/04/22/mcp.RA119.001624.full.pdf
The PRIDE link is here: https://www.ebi.ac.uk/pride/archive/projects/PXD003881
Best regards, Jonas
Hi @tobiasko ,
Thanks for your help in advance. You may directly contact Alexey (nesvi@med.umich.edu) and me (yufe@umich.edu) if you want to share the data with us.
Best,
Fengchao
Sure! Let me check if I can find it.
Hi @jjGG,
Thanks for your information. I am looking at it now.
Best,
Fengchao
Hi @tobiasko ,
Thank you very much for your checking and testing. We thoroughly investigated it and found some bugs and issues in our program. After updating it to 1.1.0, it shows good accuracy and better precision (a lower median coefficient of variation, CV). The following is the result from the three-species data (PXD014777):
Best,
Fengchao
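As a reference for the precision metric mentioned above, here is a minimal Python sketch of a median CV across proteins. How the CV was actually computed for the IonQuant results is not stated in this thread, so the use of linear-scale intensities, the sample standard deviation, the function names and the toy data are all assumptions.

```python
from statistics import mean, median, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumes a non-zero mean)."""
    return stdev(values) / mean(values)


def median_cv(intensities_by_protein):
    """Median CV over all proteins with at least two replicate values."""
    cvs = [
        coefficient_of_variation(values)
        for values in intensities_by_protein.values()
        if len(values) >= 2
    ]
    return median(cvs)


# Invented replicate intensities, for illustration only.
replicates = {
    "P1": [100.0, 110.0, 90.0],
    "P2": [50.0, 50.0, 50.0],
    "P3": [10.0, 20.0, 30.0],
}
print(f"median CV: {median_cv(replicates):.3f}")
```

A lower median CV indicates tighter agreement between replicates. It says nothing about accuracy, which is why the residual analysis against known ratios is a separate check.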
Nice work! You should include that in your IonQuant manuscript! We have meanwhile finished our backend integration and IonQuant is really stable. So far no issues at all! Can't say this for MQ and PASEF data! Looks like you are in the lead.
Best, Tobi
Thanks Tobi.
We have included these experiments in the manuscript. They will be available once it is published.
Best,
Fengchao
Dear IonQuant developers,
I cross compared the quantification results obtained by running
a) FragPipeGUI -> MSstats b) MQ -> MSstats
on a published PASEF dataset with sample ratios known a priori (Meier et al. 2018, see Fig. 5d). In general, the results look really nice, but my data suggest that IonQuant tends to systematically underestimate the expected fold changes (E. coli 1:4, Hs 1:1):
This becomes even clearer when analysing the residuals (estimated log2FC vs. expected). MQ residuals are pretty much centered on zero (as one would expect).
FragPipe residuals are shifted by half a log2 unit, i.e. too low on average.
I am now wondering if this effect could be explained by specific properties of the Meier et al. dataset, maybe in combination with parameter choices in FragPipe->MSstats? Your manuscript doesn't really touch on the topic of quantification accuracy and instead focuses on precision. Have you observed similar things when using ground truth datasets?
Greetings, Tobi
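The residual analysis described in this post can be sketched in Python as follows. The expected ratios are the ones stated in the post (the thread elsewhere corrects the E. coli ratio to 1:5, so they are kept as a parameter); the sign convention (a 1:4 ratio giving an expected log2FC of +2), the function name and the toy estimates are assumptions for illustration.

```python
import math
from statistics import median

# Expected mixing ratios as stated in the post; adjust if the true ratio differs.
EXPECTED_RATIO = {"Hs": 1.0, "Ecoli": 4.0}


def residuals(estimates, expected_ratio=EXPECTED_RATIO):
    """Return (species, estimated log2FC minus expected log2FC) pairs."""
    return [
        (species, log2fc - math.log2(expected_ratio[species]))
        for species, log2fc in estimates
    ]


# Invented estimates that are half a log2 unit too low for both species.
estimates = [
    ("Hs", -0.5), ("Hs", -0.4), ("Hs", -0.6),
    ("Ecoli", 1.5), ("Ecoli", 1.4), ("Ecoli", 1.6),
]

res = residuals(estimates)
median_residual = median(r for _, r in res)
print(f"median residual: {median_residual:.2f} log2 units")
```

Residuals centred on zero indicate accurate fold-change estimates; a common offset for all species, as in this toy example, would rather suggest a normalisation effect than a species-specific quantification error.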