Minimum number of peptides for LFQ

Hi, I think the difference might be due to the modified peptides. For the data in #7, if I remove all the modifications, I get the same result as the previous script.

library(iq)

process_long_format("DIA-Report-long-format.txt",
                    output_filename = "iq-MaxLFQ.tsv",
                    sample_id  = "R.FileName",
                    primary_id = "PG.ProteinGroups",
                    secondary_id = c("EG.Library", "FG.Id", "FG.Charge", "F.FrgIon", "F.Charge", "F.FrgLossType"),
                    intensity_col = "F.PeakArea",
                    annotation_col = c("PG.Genes", "PG.ProteinNames", "PG.FastaFiles"),
                    filter_string_equal = c("F.ExcludedFromQuantification" = "False"),
                    filter_double_less = c("PG.Qvalue" = "0.01", "EG.Qvalue" = "0.01"),
                    # Innermost gsub drops the charge suffix (e.g. "_.2"); the outer
                    # ones strip the three modification tags present in this report.
                    peptide_extractor = function(x) gsub("\\[Oxidation \\(M\\)\\]", "",
                                                         gsub("\\[Carbamidomethyl \\(C\\)\\]", "",
                                                              gsub("\\[Acetyl \\(Protein N-term\\)\\]", "",
                                                                   gsub("_\\.[0-9].*$", "", x)))),
                    log2_intensity_cutoff = -1000)

The regex in peptide_extractor is a bit messy. Basically, we want entries such as "[Acetyl (Protein N-term)]M[Oxidation (M)]EDMNEYSNIEEFAEGSK_.2" to be reduced to "MEDMNEYSNIEEFAEGSK", so that modified and unmodified forms of the same sequence are counted as one unique peptide.
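If the nested gsub calls are hard to read, the same reduction can be sketched as a small standalone function. This is only an illustration, not part of iq: the name strip_modifications is made up, and it assumes that every modification is written in square brackets and that the charge suffix looks like "_.2", as in the example above. Unlike the call above, it removes any bracketed tag, not just the three listed.

```r
# Hypothetical helper (not part of the iq package).
# Assumes modifications appear as "[...]" and the charge suffix as "_.<digit>".
strip_modifications <- function(x) {
  x <- gsub("_\\.[0-9].*$", "", x)  # drop the charge suffix, e.g. "_.2"
  x <- gsub("\\[[^]]*\\]", "", x)   # drop every bracketed modification tag
  x
}

strip_modifications("[Acetyl (Protein N-term)]M[Oxidation (M)]EDMNEYSNIEEFAEGSK_.2")
# "MEDMNEYSNIEEFAEGSK"
```

It could then be passed as peptide_extractor = strip_modifications. It is worth trying it on a few values from your own report first, since other export settings may write modifications differently.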

tvpham / iq

Minimum number of peptides for LFQ #10