rickhelmus / patRoon

Workflow solutions for mass-spectrometry based non-target analysis.
https://rickhelmus.github.io/patRoon/
GNU General Public License v3.0

Possibility for feature group filtering based on replicate groups and detection frequencies? #89

Closed bhc928 closed 11 months ago

bhc928 commented 11 months ago

Hi there,

I am relatively new to patRoon and am looking for additional ways to reduce the number of feature groups prior to annotation, besides, e.g., high intensity limits or analyzing only portions of the chromatogram or mass range at a time. Is it possible to use the "filter()" function with fGroups to remove feature groups belonging to specific replicate groups, e.g., remove all features present in the blanks that are not present in the samples (i.e., features unique to the blanks), without removing the blanks completely?

I have attached a plotUpSet visualization, which easily shows the unique (or not) combinations of feature groups for different replicate groups. You can see that my blanks, GFblank (1049 unique feature groups), SFblank (396), and RFblank (315), have a number of unique feature groups not present in the other replicate groups. Removing those could reduce my feature groups by close to 50%.

20230801_PatRoon_POS_Feature group_Replicate treatments.pdf

Any ideas/suggestions? I think I could manually extract/rearrange/filter the replicates from fGroups using traditional R packages (e.g., tidyverse), but I am not confident that I could then get it back into an fGroups format that would function correctly in the subsequent patRoon annotation steps, if that makes sense... But before I try, I just wanted to check I am not missing something obvious.

Many thanks, Cleo

rickhelmus commented 11 months ago

Hello,

Certainly! There are quite a few approaches to delete or select feature groups of interest. The first step is usually to get the names of the feature groups you want to keep or delete.

For instance, in your example where you want to remove all feature groups unique to blanks:

# get a subset of the feature groups that are unique to the blank replicates
fGroupsBlanksUn <- unique(fGroups, which = "blank")
# delete the 'blank features' from the original feature groups
fGroups <- delete(fGroups, j = names(fGroupsBlanksUn))
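To make the "unique to the blanks" logic concrete, here is a minimal base R sketch on a toy table. It does not use patRoon itself: the intensity values, the feature group names and the sample column names are all made up for illustration, and a zero is assumed to mean "not detected". With real data, the `which` argument above would presumably be given the actual blank replicate group names from the question (GFblank, SFblank, RFblank).

```r
# Toy intensity table (hypothetical values): one row per feature group,
# one column per replicate group; 0 is assumed to mean "not detected".
ints <- data.frame(
    group   = c("M100_R50", "M200_R120", "M300_R210", "M400_R300"),
    GFblank = c(1e4, 0,   5e3, 0),
    SFblank = c(0,   0,   2e3, 0),
    sampleA = c(0,   3e5, 1e4, 0),
    sampleB = c(0,   2e5, 0,   8e4)
)

blankCols  <- c("GFblank", "SFblank")
sampleCols <- setdiff(names(ints), c("group", blankCols))

# Detected in at least one blank / in at least one sample replicate group
inBlanks  <- rowSums(ints[, blankCols, drop = FALSE] > 0) > 0
inSamples <- rowSums(ints[, sampleCols, drop = FALSE] > 0) > 0

# Names of feature groups that occur in blanks only
blankOnly <- ints$group[inBlanks & !inSamples]
print(blankOnly)

# With a real featureGroups object these names would be the ones to remove:
# fGroups <- delete(fGroups, j = blankOnly)
```

Only the feature groups that are detected in a blank and in no sample end up in `blankOnly`, so feature groups shared between blanks and samples are left alone and the blank replicates themselves stay in the object.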

For more advanced data processing you could first convert the feature group data to a table with as.data.frame()/as.data.table(), filter the table and use the remaining feature group names to subset/delete, e.g.

tab <- as.data.table(fGroups)
# only rows with higher m/z and retention times
tab <- tab[mz > 300 & ret > 200]
fGroupsSub <- fGroups[, tab$group] # subset (i.e. keep only these)
fGroupsDel <- delete(fGroups, j = tab$group) # delete (i.e. remove just these)
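The difference between subsetting and deleting with the same set of names can be sketched without patRoon. The example below is only an illustration under assumptions: it uses a plain data.frame instead of a data.table, and the group names, m/z values and retention times are invented.

```r
# Toy feature group table (hypothetical values), mimicking the group/mz/ret
# columns used in the filter above.
tab <- data.frame(
    group = c("M150_R90", "M320_R250", "M410_R180", "M505_R400"),
    mz    = c(150.1, 320.2, 410.3, 505.4),
    ret   = c(90, 250, 180, 400)
)

# Names of the feature groups that pass the filter
sel <- tab$group[tab$mz > 300 & tab$ret > 200]

# Subsetting keeps exactly the selected names ...
keptBySubset <- sel
# ... whereas deleting keeps everything except the selected names.
keptByDelete <- setdiff(tab$group, sel)

print(keptBySubset)
print(keptByDelete)
```

In other words, the two calls are complementary: which one to use depends on whether the filter describes what you want to keep or what you want to get rid of.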

More examples are in the handbook ;-)

HTH! Rick

bhc928 commented 11 months ago

Hi Rick,

Thanks so much for the quick reply. This makes perfect sense and my apologies for not catching the right section in the manual (I have been scouring the filtering steps/functions...) - unique() and delete() are exactly what I am looking for!

Thanks for the help and have a great day, Cleo