MadsAlbertsen / mmgenome

Please use mmgenome2 instead. Tools for extracting individual genomes from metagenomes
https://kasperskytte.github.io/mmgenome2/

Not a problem, but a couple of feature requests #18

Closed nmb85 closed 9 years ago

nmb85 commented 9 years ago

Hi, Mads,

Great tool; many thanks! Could you add a function to subset the data by PPS taxon (or marker gene taxon), such as:

function(df, name, level, omit = FALSE) {
  if (!identical(omit, TRUE) && !identical(omit, FALSE)) {
    stop("The \"omit\" value should be either TRUE or FALSE")
  }
  # Normalise input: capitalise the taxon name, lower-case the rank
  name <- paste0(toupper(substring(name, 1, 1)), substring(name, 2))
  level <- tolower(level)
  # PPS assignment column for the requested rank, e.g. pps_phylum
  taxon <- df$scaffolds[[paste0("pps_", level)]]
  keep <- if (omit) taxon != name else taxon == name
  new_df <- list(scaffolds = data.frame(), essential = data.frame())
  # Scaffolds without an assignment compare as NA and are dropped
  new_df$scaffolds <- df$scaffolds[!is.na(keep) & keep, ]
  new_df$essential <- df$essential[df$essential$scaffold %in% new_df$scaffolds$scaffold, ]
  return(new_df)
}

Also, could you add a function to slice the list of data frames by coverage?

I find myself doing both things quite frequently...

MadsAlbertsen commented 9 years ago

Thanks for the good suggestions - I'll look into it.

MadsAlbertsen commented 9 years ago

Reopened to make sure I remember it.

MadsAlbertsen commented 9 years ago

I've added a generic mmsubset function (it simply wraps the default "subset" function in R). It should be included in version 0.5.0.

It should be able to subset based on the function you describe - if I understood it correctly.

data(rocco)
d_gc60 <- mmsubset(data = d, gc > 60)
d_Actinobacteria <- mmsubset(data = d, pps_phylum == "Actinobacteria")
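
For anyone curious how a subset-style wrapper over a list of data frames can work, here is a minimal sketch. It is not the mmgenome implementation: the helper name subset_genome_list, the toy data, and the coverage and gene columns are all made up for illustration. It captures the condition unevaluated and evaluates it inside the scaffolds table, which is roughly what base subset() does for data frames, then keeps the essential table in sync.

```r
# Hypothetical helper (NOT part of mmgenome): filter the scaffolds table by an
# arbitrary condition and keep only the essential-gene rows on kept scaffolds.
subset_genome_list <- function(data, condition) {
  cond_expr <- substitute(condition)               # capture e.g. gc > 60 unevaluated
  rows <- eval(cond_expr, data$scaffolds, parent.frame())
  rows <- rows & !is.na(rows)                      # treat NA comparisons as "drop"
  kept <- data$scaffolds[rows, , drop = FALSE]
  list(
    scaffolds = kept,
    essential = data$essential[data$essential$scaffold %in% kept$scaffold, , drop = FALSE]
  )
}

# Toy data with assumed column names, only to make the sketch runnable
toy <- list(
  scaffolds = data.frame(
    scaffold   = c("s1", "s2", "s3", "s4"),
    gc         = c(45, 62, 68, 71),
    coverage   = c(10, 150, 35, 80),
    pps_phylum = c("Firmicutes", "Actinobacteria", NA, "Actinobacteria"),
    stringsAsFactors = FALSE
  ),
  essential = data.frame(
    scaffold = c("s1", "s2", "s2", "s4"),
    gene     = c("g1", "g2", "g3", "g4"),
    stringsAsFactors = FALSE
  )
)

# Subset by taxon, and slice by a coverage window
actino  <- subset_genome_list(toy, pps_phylum == "Actinobacteria")
mid_cov <- subset_genome_list(toy, coverage >= 30 & coverage <= 100)
```

Because the condition is an ordinary R expression, the same helper covers both requests in this thread: selecting by taxon and slicing by coverage.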

nmb85 commented 9 years ago

Looks fantastic! Very useful; thank you!