About IonQuant normalization

juseonmin commented 1 year ago

Hi, I'm using FragPipe's IonQuant option to compare timsTOF data to Orbitrap data, and I have a question.

First of all, thank you so much for developing such a useful tool!

To summarize my data analysis: first, I loaded about 1 ng, 25 ng, and 100 ng of HeLa cell QC and got data files from timsTOF and Orbitrap.

I wanted to use the IonQuant option to search the data for a quantitative comparison, but strangely, in the combined_protein.tsv obtained from the Orbitrap data, the trend in the intensities is the reverse of the trend shown by the spectral counts. The intensity and MaxLFQ intensity values behave the same way; the data obtained with timsTOF does not show this reversal.

Furthermore, the protein.tsv generated for each raw file shows the same trend as the chromatograms and spectral counts, namely that the measured amount increases with the amount of sample loaded, but in combined_protein.tsv the results are reversed. After playing around, I noticed that if I uncheck IonQuant > Common > Normalize intensity across runs, the results in protein.tsv are mirrored. I thought that option was meant to normalize the quantitative changes, but now that I have unchecked it, the results reflect the expected trend, which is confusing.

Here is the numerical data

The timsTOF shows instrument saturation, meaning that the 25 ng and 100 ng intensity results are similar.

I suspect it's because I loaded very small amounts of sample and the quantitative differences are so large that normalization was difficult, but I'm not sure why specifically, so I'm asking.

Thank you very much.

fcyu commented 1 year ago

Let me try to address your points one by one.

> First, I loaded about 1 ng, 25 ng, and 100 ng of HeLa cell QC and got data files from timsTOF and Orbitrap.

MaxLFQ will bring the intensities to the same level for those three experiments. So, maybe you should run them separately.

> I wanted to use the IonQuant option to search the data for a quantitative comparison, but strangely, in the combined_protein.tsv obtained from the Orbitrap data, the trend in the intensities is the reverse of the trend shown by the spectral counts. The intensity and MaxLFQ intensity values behave the same way; the data obtained with timsTOF does not show this reversal.

I am not sure of the reason. In my opinion, spectral counts are normally not as accurate as MS1-based intensities. I personally have never checked the relationship between spectral count and intensity. Furthermore, if you enable MBR, the relationship between spectral count and protein intensity becomes more "complicated", I think.

> Furthermore, the protein.tsv generated for each raw file shows the same trend as the chromatograms and spectral counts, namely that the measured amount increases with the amount of sample loaded, but in combined_protein.tsv the results are reversed. After playing around, I noticed that if I uncheck IonQuant > Common > Normalize intensity across runs, the results in protein.tsv are mirrored. I thought that option was meant to normalize the quantitative changes, but now that I have unchecked it, the results reflect the expected trend, which is confusing.

The normalization does not affect protein.tsv. It only affects the combined_*.tsv files. So, I guess you messed something up.

> I suspect it's because I loaded very small amounts of sample and the quantitative differences are so large that normalization was difficult, but I'm not sure why specifically, so I'm asking.

This may be one of the explanations. The assumption behind normalization and MaxLFQ is that most of the peptides/proteins have similar intensities across injections. Although we have a robust normalization algorithm, it still needs a reasonable number of peptides with similar intensities. Maybe you can run your three experiments separately and see what happens.
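To make the assumption concrete, here is a minimal sketch of plain median normalization on simulated data. It is not IonQuant's piecewise algorithm; the function name `median_normalize`, the simulated abundances, and the run names are all hypothetical. It only shows that when every peptide scales with the injected amount, a normalization that aligns run medians removes the load difference.

```python
import math
import random
import statistics


def median_normalize(runs):
    """Shift each run in log2 space so that all run medians coincide.

    `runs` maps a run name to a list of raw intensities. This is plain
    median normalization for illustration only, not IonQuant's algorithm.
    """
    log_runs = {name: [math.log2(v) for v in vals] for name, vals in runs.items()}
    run_medians = {name: statistics.median(vals) for name, vals in log_runs.items()}
    target = statistics.mean(list(run_medians.values()))
    return {
        name: [2 ** (v - run_medians[name] + target) for v in vals]
        for name, vals in log_runs.items()
    }


random.seed(0)
# Hypothetical per-ng peptide abundances, shared by all simulated runs.
base = [random.lognormvariate(10, 1) for _ in range(500)]
# Simulated 1, 25 and 100 ng injections: every peptide scales with the load.
raw = {f"{load}ng": [b * load for b in base] for load in (1, 25, 100)}

normalized = median_normalize(raw)

for name in raw:
    print(name, f"raw sum = {sum(raw[name]):.3e}",
          f"normalized sum = {sum(normalized[name]):.3e}")
```

In this simulation the raw sums differ by the loading factor, while the normalized sums come out the same for all three runs, which is why injections with very different loads may be better analyzed separately or with normalization disabled.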

Best,

Fengchao

juseonmin commented 1 year ago

Thank you very much for your kind response.

I will try to do what you said.

Again, thank you so much for developing such a handy tool.

Keep up the good day!!

dong-gi-jang commented 1 year ago

Hi there. I am also using FragPipe-IonQuant to analyze samples with different peptide amounts together. I want to analyze the DEPs between samples using the combined_*.tsv file. However, like the questioner above, I encounter dramatically different results depending on whether the 'normalize intensity across runs' option is enabled or disabled.

Can you tell me what normalization is done with 'normalize intensity across runs'?

Thank you.

fcyu commented 1 year ago

Hi @dong-gi-jang,

Here (https://www.mcponline.org/article/S1535-9476(20)35104-5/fulltext#:~:text=we%20developed%20a%20piecewise%20normalization%20algorithm%20in%20IonQuant.) is a brief description of how normalization is performed.

Could you elaborate a little more on "I encounter dramatically different results depending on whether the 'normalize intensity across runs' option is enabled or disabled."? Maybe you can send us some data. I will give you a link to upload if you need one.

Best,

Fengchao

dong-gi-jang commented 1 year ago

Hi Fengchao,

Thank you for your fast response. Now I have a clearer understanding of how my data is being processed.

Actually, I performed the MS analysis without a prior quantitative assay, such as BCA, because of the low amount of peptides remaining after the peptide enrichment experiment. So I analyzed the data knowing that the total amount may vary depending on the sample conditions and that the samples are heterogeneous.

When I used the built-in LFQ workflows, the "normalize intensity across runs" option was enabled. However, I noticed that the trend in the sum of PSM counts, which I consider semi-quantitative, differs from that of the intensities in my combined_protein.tsv. The samples expected to be the least abundant based on biological conditions were instead assigned the greatest intensity (similar to @juseonmin's situation). I think it may have been my fault for not taking into account the assumption that most of the peptides should have similar intensities.

Here is the sum of the intensities and the sum of the MaxLFQ intensities with and without the "normalize intensity across runs" option.
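For anyone who wants to reproduce this kind of per-sample summary, the following is a rough sketch using only the Python standard library. It assumes that the per-sample columns in combined_protein.tsv end in " Spectral Count", " Intensity", or " MaxLFQ Intensity"; please check the header of your own file. The helper `column_sums` and the toy table are hypothetical, and the numbers are made up, not taken from my data.

```python
import csv
import io
from collections import defaultdict

# Assumed column-name endings; verify them against your own file header.
SUFFIXES = (" Spectral Count", " Intensity", " MaxLFQ Intensity")


def column_sums(tsv_file, suffixes=SUFFIXES):
    """Sum every column whose name ends with one of `suffixes`."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    totals = defaultdict(float)
    for row in reader:
        for column, value in row.items():
            # Skip unnamed overflow fields and empty cells.
            if column and column.endswith(suffixes) and value:
                totals[column] += float(value)
    return dict(totals)


# Toy table with made-up values, standing in for combined_protein.tsv.
demo = io.StringIO(
    "Protein\tA Spectral Count\tA Intensity\tB Spectral Count\tB Intensity\n"
    "P1\t2\t100.0\t5\t300.0\n"
    "P2\t1\t50.0\t4\t250.0\n"
)
totals = column_sums(demo)
for column, total in totals.items():
    print(column, total)

# For a real file (the path is an assumption):
# with open("combined_protein.tsv", newline="") as handle:
#     totals = column_sums(handle)
```

Running it on the tables produced with and without "normalize intensity across runs" gives the two sets of sums to compare.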

I can also upload my .tsv files if needed.

Thank you, Donggi Jang

fcyu commented 1 year ago

Hi Donggi,

I personally have never compared spectral counts with MS1-based intensities. When MBR is enabled, some of the MS1-based intensities come from transferred ions, which makes such a comparison less meaningful. I don't know what conclusion I can draw from your figures.

Best,

Fengchao
