jtamames / SqueezeMeta

A complete pipeline for metagenomic analysis
GNU General Public License v3.0

Inquiry about extracting all the taxa associated with a particular functional database #682

Closed afsanarupa closed 1 year ago

afsanarupa commented 1 year ago

Hello, I have run a SqueezeMeta project in coassembly mode, using the CAZy database as an external database named "CAZy_db". Now I want to extract the taxa that have been annotated with any CAZy function. For this I tried:

Cazy_annot = subsetFun(met_jute, fun = c("AA","CBM","GH", "GT","PL"), rescale_copy_number = F) # CAZy_db has only these 5 annotated functions
Cazy_annot_taxa = Cazy_annot$taxa$genus$percent
exportTable(Cazy_annot_taxa, "Cazy_annot_taxa.tsv")

However, I'm not sure whether I got all the taxa associated with the CAZy_db annotation. Is there a better way to extract the taxa associated with a particular fun_level (e.g. KEGG, or an external database like CAZy_db)?

fpusan commented 1 year ago

No, the fun argument in subsetFun should be a regex string, not a vector.

As a general approach you can use the following.

fun_level = "KEGG" # for example
fun_column = "KEGG ID" # check the colnames of SQM$orfs$table to get the right column name
all_annots = rownames(SQM$functions[[fun_level]]$abund)
pattern = paste(all_annots, collapse="|")
SQM.fun.annot = subsetFun(SQM, pattern, columns = fun_column)
SQM.fun.annot.genus.percent = SQM.fun.annot$genus$percent
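
The `paste(..., collapse="|")` step above is what turns a vector of annotation names into the single regex string that `fun` expects. A minimal base-R sketch of that conversion, using the five CAZy classes from the question; the strings in `example_annots` are hypothetical and only illustrate the matching:

```r
# Base R only; no SqueezeMeta objects are needed for this illustration.
cazy_classes <- c("AA", "CBM", "GH", "GT", "PL")

# Collapse the vector into one alternation regex.
pattern <- paste(cazy_classes, collapse = "|")
print(pattern)

# Hypothetical annotation strings, used only to show how the regex matches.
example_annots <- c("GH13", "CBM50", "K00001")
matches <- grepl(pattern, example_annots)
print(matches)
```

The pattern is unanchored, so it matches any string that contains one of the class prefixes anywhere; whether that is strict enough depends on how the annotations in a given database are named.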

afsanarupa commented 1 year ago

Thanks fpusan for the reply. However, when I adapt this code for my external database "CAZydb", I get nothing in genus$percent; there is no object named genus or taxa in SQM.fun.annot. Here is my code:

fun_level = "CAZydb" # for example
fun_column = "CAZydb" # check the colnames of SQM$orfs$table to get the right column name
all_annots = rownames(met_jute$functions[[fun_level]]$abund)
pattern = paste(all_annots, collapse="|")
SQM.fun.annot = subsetFun(met_jute, pattern, columns = fun_column)
SQM.fun.annot.genus.percent = SQM.fun.annot$genus$percent

Here is the output I get when I type:

SQM.fun.annot.genus.percent
NULL

fpusan commented 1 year ago

My bad! Try:

SQM.fun.annot.genus.percent = SQM.fun.annot$taxa$genus$percent
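
Putting the corrected accessor together with the earlier steps, the whole workflow can be sketched as one helper. This is only an illustration assembled from the calls that appear in this thread (`subsetFun`, `$taxa$genus$percent`, `exportTable`); the function name `genus_percent_for_annotated` and its argument names are hypothetical, not part of SQMtools:

```r
# Attach SQMtools if it is installed; subsetFun() and exportTable() come from it.
if (requireNamespace("SQMtools", quietly = TRUE)) library(SQMtools)

# Hypothetical helper: keep only ORFs annotated in `fun_level`, then return
# (and export) the genus-level percentages of the resulting subset.
genus_percent_for_annotated <- function(SQM, fun_level, fun_column, out_file) {
  # Every annotation name present in the chosen functional classification.
  all_annots <- rownames(SQM$functions[[fun_level]]$abund)
  # subsetFun expects one regex string, so join the names with "|".
  pattern <- paste(all_annots, collapse = "|")
  SQM_annot <- subsetFun(SQM, pattern, columns = fun_column)
  # Taxonomy tables live under $taxa, not directly under the SQM object.
  genus_percent <- SQM_annot$taxa$genus$percent
  exportTable(genus_percent, out_file)
  invisible(genus_percent)
}

# Usage, assuming a loaded project `met_jute` with an external database
# whose level and column are both named "CAZydb" (as in this thread):
# tab <- genus_percent_for_annotated(met_jute, "CAZydb", "CAZydb",
#                                    "Cazy_annot_taxa.tsv")
```

Because the annotation names are used as a regex, names that contain regex metacharacters might need escaping first; that is not handled in this sketch.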

afsanarupa commented 1 year ago

Thanks, it is solved.