joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/

Tip_glom Complete Linkage Clustering #1017

Open ch16S opened 5 years ago

ch16S commented 5 years ago

Hi Joey,

Firstly, fantastic job on phyloseq.

The documentation for tip_glom states that future releases will extend the clustering capabilities of hcfun. I find tip_glom very useful, so I was curious whether that functionality will arrive anytime soon.

Regards, Chris

joey711 commented 5 years ago

Thanks @ch16S. I wasn't planning any, as you're the first user to ask for this in a long time. I suspect it would not be a difficult feature extension to add to the current function, but I haven't looked in a while. This heads into "OTU clustering" re-implementation territory, for which I think the utility is limited/dangerous (frankly). Nevertheless, you might have a good reason to want to simplify the data in this way, so I'm happy to entertain the notion a little longer. I'll mark this as a feature request and consider it when I'm deciding which features to include in the next release.

ch16S commented 5 years ago

Hi Joey,

Thanks for considering it; I know it's not ideal. I found tip_glom useful for environmental samples where taxonomic identification isn't great and the samples have a high number of sequence variants (e.g. soil). The agglomeration helps when running certain time-consuming analyses.

Cheers, Chris
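
For readers unfamiliar with the clustering step this thread is about, here is a minimal sketch of what complete-linkage agglomeration at a cut height `h` does to a set of tree tips. It is plain Python, not phyloseq code, and it makes no claim about how tip_glom is implemented. The function name `complete_linkage_clusters`, the tip labels, and the distances are all hypothetical, chosen only for illustration.

```python
from itertools import combinations

# Hypothetical pairwise distances between four tree tips (illustration only).
pairs = {
    ("A", "B"): 0.10, ("A", "C"): 0.25, ("B", "C"): 0.15,
    ("A", "D"): 0.90, ("B", "D"): 0.90, ("C", "D"): 0.90,
}
dist = {tip: {} for tip in "ABCD"}
for (a, b), d in pairs.items():
    dist[a][b] = d
    dist[b][a] = d


def complete_linkage_clusters(dist, h):
    """Agglomerate items by complete linkage and stop at height h.

    dist is a symmetric dict of dicts: dist[a][b] is the distance a to b.
    The distance between two clusters is the MAXIMUM distance between any
    pair of their members, so every member of a merged group is within h
    of every other member.
    """
    clusters = [frozenset([name]) for name in sorted(dist)]

    def linkage(c1, c2):
        return max(dist[a][b] for a in c1 for b in c2)

    while len(clusters) > 1:
        # Find the closest pair of clusters under complete linkage.
        c1, c2 = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        if linkage(c1, c2) > h:
            break  # nothing left that can merge at or below the cut height
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return sorted(sorted(c) for c in clusters)


groups = complete_linkage_clusters(dist, h=0.2)
```

With these made-up distances, A and B merge at 0.10, but C stays separate at `h=0.2` because its largest distance to the merged pair (0.25, to A) exceeds the cut. A single-linkage rule would have merged it at 0.15, which is the practical difference between the linkage choices.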