Hi Clover,
Yes, this is possible and quite easy. If "df" is the name of the data frame that contains the DIA-NN report in R, then:
# Keep one row per unique (protein, peptide) pair
data <- unique(df[, c('Genes', 'Stripped.Sequence')])
# Count the distinct peptides for each protein
t <- table(data$Genes)
And "t" now contains the number of peptides matched to each protein (in this case I used the "Genes" column for protein IDs). You can do this for the whole report, or for each run separately.
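To go from the counts to the actual filtering, something along these lines should work. This is only a minimal sketch on a toy data frame: the "Run" column name and the threshold of two peptides are assumptions for illustration, and the names "pairs", "counts", "keep", "df_filtered", "pairs_run" and "counts_run" are made up here, so adjust them to your report.

```r
# Toy stand-in for a DIA-NN report. A real report could be loaded with, e.g.:
#   df <- read.delim("report.tsv")
# The "Run" column name is an assumption; check it against your own file.
df <- data.frame(
  Run = c("run1", "run1", "run1", "run2", "run2", "run2"),
  Genes = c("A", "A", "B", "A", "B", "B"),
  Stripped.Sequence = c("PEPK", "TIDEK", "SEQR", "PEPK", "SEQR", "SEQR"),
  stringsAsFactors = FALSE
)

# Whole report: number of distinct peptides per protein
pairs <- unique(df[, c("Genes", "Stripped.Sequence")])
counts <- table(pairs$Genes)

# Keep only the rows of proteins with at least two distinct peptides
keep <- names(counts)[counts >= 2]
df_filtered <- df[df$Genes %in% keep, ]

# Per run: distinct peptides per protein, counted within each run separately
pairs_run <- unique(df[, c("Run", "Genes", "Stripped.Sequence")])
counts_run <- table(pairs_run$Run, pairs_run$Genes)
```

Here "df_filtered" keeps all report rows for proteins supported by two or more peptides across the whole report, while "counts_run" is a run-by-protein table that can be used if the filter should be applied to each run on its own.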
Best wishes,
Vadim
OK, thank you for your reply. I'll strengthen my R basics. By the way, is filtering out the proteins with only one peptide a requirement for DIA proteomics? I have noticed that many articles are not clear on this point.
No, in many cases it's perfectly fine to use proteins identified (and quantified) with a single peptide. But of course if you have a protein quantified with, say, 5 peptides, and all these show the same differential regulation pattern between conditions, this does give extra confidence.
OK, thank you! So is it controversial if I use proteins with only one peptide when I perform a differential analysis? I'm not sure about this.
I don't see any problem with using a single peptide. It all depends on how you interpret the results. Basically, if you report a list of proteins as differentially regulated at 5% FDR (i.e. <5% of these are not really differentially regulated), then it's fine. If you'd like to report at 0.1% FDR (i.e. claim that only 1 out of 1000 proteins reported is not really differentially regulated), then not really.
OK. I get it.
Hi, Vadim
I'd like to ask for your help. I want to filter out the proteins identified with only one peptide. How can I do this in the "report.tsv" file? I couldn't figure it out. Could some R code help, for example?