darcyj / specificity

R package for calculating specificity in ecological data

Source-specific taxa identification #1

Closed ivelsko closed 1 year ago

ivelsko commented 2 years ago

Hi John,

thanks for developing this tool, it's been a neat new way of exploring my data and the vignette was easy to follow. I was wondering if specificity could be used in a source-tracking capacity? My samples were processed in different labs, and they cluster by processing lab in beta-diversity analyses despite removing potential contaminants identified with decontam, even after a lot of threshold testing.

So I was wondering if I could use specificity to identify taxa specific to particular labs by converting each lab to a number to make it a "continuous" variable. It worked when I ran it this way and did identify a number of lab-specific taxa. I could also use the geographic location of the labs as input. Is this a valid application of specificity?

Thanks, Irina

darcyj commented 1 year ago

Spec isn't really useful for source tracking; it's more useful for identifying features that may be good candidates for source tracking!

Spec isn't really useful for categorical data either, unless you have a lot of categories and well-characterized differences between them (then it's super useful). I'd use regular old differential-abundance calculations with your data. Perhaps use the gamlss package with a zero-inflated negative binomial distribution and model feature ~ labsite or something. Good luck!
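
For anyone landing here later, a minimal sketch of what that suggestion could look like for a single taxon. Everything in it is an assumption for illustration: the data are simulated, the lab names, means, and zero-inflation rate are made up, and the column names feature and labsite simply mirror the formula mentioned above. ZINBI is the zero-inflated negative binomial family from gamlss.dist, and the lab effect is checked with a plain likelihood-ratio comparison against an intercept-only model, which is one reasonable choice rather than the only one.

```r
# Sketch only: simulated counts for one taxon across three hypothetical labs.
library(gamlss)  # also attaches gamlss.dist, which provides the ZINBI family

set.seed(1)
n_per_lab <- 40
df <- data.frame(
  labsite = factor(rep(c("labA", "labB", "labC"), each = n_per_lab))
)

# Hypothetical mean abundance per lab; labC is deliberately set higher.
mu_by_lab <- c(labA = 5, labB = 5, labC = 20)
counts <- rnbinom(nrow(df), mu = mu_by_lab[as.character(df$labsite)], size = 2)

# Force roughly 30% extra zeros to mimic zero inflation (assumed rate).
df$feature <- ifelse(runif(nrow(df)) < 0.3, 0L, counts)

# Model with a lab effect on the mean, and an intercept-only model to compare.
quiet <- gamlss.control(trace = FALSE)
fit_lab  <- gamlss(feature ~ labsite, family = ZINBI, data = df, control = quiet)
fit_null <- gamlss(feature ~ 1,       family = ZINBI, data = df, control = quiet)

# Likelihood-ratio test: drop in global deviance vs. extra fitted parameters.
lr_stat <- deviance(fit_null) - deviance(fit_lab)
lr_df   <- fit_lab$df.fit - fit_null$df.fit
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

summary(fit_lab)  # per-lab coefficients on the log scale for mu
p_value           # small values suggest abundance differs between labs
```

With real data you would repeat this per taxon and correct the resulting p-values for multiple testing, for example with p.adjust(p_values, method = "fdr").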