Closed yangxinzhi closed 8 months ago
I wanted to ask about the difference between the sample_Intensity and sample_MaxLFQ Intensity columns in the result file. Which one should I use for statistical analysis (that is, which one is the normalized result)?
The `sample_Intensity` is from the top-N peptides, and the `sample_MaxLFQ` is from the MaxLFQ algorithm. Both are normalized. In most cases, you should use `sample_MaxLFQ`.
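As a rough illustration of picking out the MaxLFQ columns for downstream statistics, here is a minimal sketch using only the Python standard library. The column suffix ` MaxLFQ Intensity`, the `Protein` ID column, and the tiny table are assumptions made for the example; check the header of your own file before relying on them.

```python
import csv
import io

# Tiny made-up table standing in for a combined protein file (values are illustrative only).
EXAMPLE_TSV = (
    "Protein\tA Intensity\tA MaxLFQ Intensity\tB Intensity\tB MaxLFQ Intensity\n"
    "P1\t1000\t900\t1200\t1100\n"
    "P2\t500\t0\t700\t650\n"
)

MAXLFQ_SUFFIX = " MaxLFQ Intensity"  # assumed column suffix, verify against your file


def read_maxlfq(handle, id_column="Protein"):
    """Return (sample_names, {protein: [MaxLFQ values]}) from a tab-separated table."""
    reader = csv.DictReader(handle, delimiter="\t")
    # Keep only the columns that end with the MaxLFQ suffix; plain "Intensity" columns are skipped.
    columns = [c for c in reader.fieldnames if c.endswith(MAXLFQ_SUFFIX)]
    samples = [c[: -len(MAXLFQ_SUFFIX)] for c in columns]
    table = {row[id_column]: [float(row[c]) for c in columns] for row in reader}
    return samples, table


samples, maxlfq = read_maxlfq(io.StringIO(EXAMPLE_TSV))
```

For a real file, pass an open file handle instead of the `io.StringIO` wrapper.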
Then, once I have selected the right data, I keep only the proteins that are expressed in more than half of the samples. After filtering, I take the log and fill in missing values with random forest imputation. These are the steps of my data analysis. Is this analysis OK?
We actually have FragPipe-Analyst for the downstream analysis: http://fragpipe-analyst.nesvilab.org/ It can take the `combined_protein.tsv` from the LFQ-MBR workflow and perform routine analysis.
Best,
Fengchao
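For readers who still want to run the filtering and log steps from the question by hand, the following is a minimal sketch, not a replacement for FragPipe-Analyst. It assumes that an intensity of 0 means "not quantified" and that the data are already loaded as a mapping from protein to a list of per-sample intensities; the function name and example values are hypothetical. The random forest imputation step is left out, so missing values are simply marked as `None`.

```python
import math


def filter_and_log2(table, min_fraction=0.5):
    """Keep proteins quantified in more than `min_fraction` of samples, then log2-transform.

    `table` maps protein -> list of intensities; 0 is treated as missing (assumption).
    Missing values become None so that a later imputation step can fill them.
    """
    result = {}
    for protein, values in table.items():
        observed = sum(1 for v in values if v > 0)
        if observed > min_fraction * len(values):
            result[protein] = [math.log2(v) if v > 0 else None for v in values]
    return result


example = {
    "P1": [1024.0, 2048.0, 0.0, 4096.0],  # 3 of 4 quantified, kept
    "P2": [0.0, 0.0, 512.0, 256.0],       # 2 of 4 quantified, dropped (not more than half)
    "P3": [0.0, 0.0, 0.0, 8.0],           # 1 of 4 quantified, dropped
}
filtered = filter_and_log2(example)
```

Filtering before the log transform avoids taking the log of zero for proteins that would be discarded anyway.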
Thank you very much for that nice answer. I'll try it right away!
By the way, I have another small problem. I also searched this data with PD (a 480 was used to acquire 90 min of LFQ data), and for the same data I get about 4000 proteins with PD. However, after FragPipe filtering (using MaxLFQ), only about 2000 proteins are obtained. [log_2023-11-18_15-35-45.txt](https://github.com/Nesvilab/FragPipe/files/13405603/log_2023-11-18_15-35-45.txt)
There are 3360 proteins identified: `INFO[15:15:24] Converged to 0.98 % FDR with 3360 Proteins decoy=33 threshold=0.9787 total=3393`. The proteins were filtered with global 1% PSM-level and protein-level FDR. When performing MaxLFQ, the filter is quite stringent. Not all proteins have a quant value.
As to PD, the default FDR is 5%, not 1% unless you changed the settings.
Best,
Fengchao
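The numbers in the quoted log line can be checked with a few lines of arithmetic. The sketch below assumes the reported FDR is the decoy count divided by the target count; that reading is inferred from the numbers agreeing, not taken from the tool's documentation.

```python
# Numbers copied from the log line quoted above.
total = 3393   # entries passing the threshold (targets plus decoys)
decoys = 33

targets = total - decoys                 # 3360, matching "3360 Proteins"
fdr_percent = 100 * decoys / targets     # assumed definition: decoys / targets

print(f"{targets} proteins at {fdr_percent:.2f} % FDR")
```

Under that assumption the result rounds to the 0.98 % shown in the log, just under the 1% threshold.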
You mean it is because FragPipe's LFQ template uses 0.01 for FDR correction? Is that the --sequential --prot 0.01 in my screenshot here, or the MBR 0.01 in Quant?
Yes, also the MaxLFQ `min ions` (it affects the number of non-zero protein intensities a lot), `min scans`, and `min isotopes`.
Best,
Fengchao
Sorry, what I said above may have been unclear. If I want to change the FDR to 0.05, do I need to change the MBR ion FDR under the Quant module to 0.05, or do I just need to change --sequential --prot 0.01 to 0.05?
I don't think you should change the FDR threshold. What PD uses is too liberal. But you can change the MaxLFQ `min ions` to 1.
Best,
Fengchao
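To see why lowering `min ions` can raise the number of proteins with a non-zero intensity, consider the toy model below. It assumes, as a simplification, that a protein receives a quant value only when it has at least `min_ions` quantified ions; the ion counts are made up and the function is hypothetical, not part of FragPipe or IonQuant.

```python
def count_quantified(ions_per_protein, min_ions):
    """Count proteins that have at least `min_ions` quantified ions (toy model)."""
    return sum(1 for n in ions_per_protein.values() if n >= min_ions)


# Made-up ion counts for five hypothetical proteins in one sample.
ions = {"P1": 5, "P2": 1, "P3": 2, "P4": 1, "P5": 3}

with_two = count_quantified(ions, min_ions=2)  # P1, P3, P5 pass
with_one = count_quantified(ions, min_ions=1)  # all five pass
```

In this toy data, proteins supported by a single ion are the ones recovered when the threshold drops to 1, which also means their quant values rest on less evidence.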
Hello, I searched the database using the default LFQ template in FragPipe and obtained the combined protein result. I wanted to ask about the difference between the sample_Intensity and sample_MaxLFQ Intensity columns in the result file. Which one should I use for statistical analysis (that is, which one is the normalized result)? Then, once I have selected the right data, I keep only the proteins that are expressed in more than half of the samples. After filtering, I take the log and fill in missing values with random forest imputation. These are the steps of my data analysis. Is this analysis OK?