Open wurth13 opened 7 years ago
Suppose we have a phyloseq object ps
.
Let's say that the unclassified
genus in the taxonomy table is replaced by NA
(missing value.)
We can rename the genus level with the highest known level of taxonomy using the following function
.
# taxonomy table
tax.tab <- data.frame(tax_table(ps))
ModifyTax <- function(x,ind){
# xth row in the dataframe
# ind taxonomy level to change
if(is.na(x[ind])){
nonNa <- which(!is.na(x[-ind]))
maxNonNa <- max(nonNa)
x[ind] <- paste(x[maxNonNa],".",x[ind])
}else{x[ind] <- x[ind]}
}
# replace the genus taxonomy with the highest known taxonomy
tax_table(ps)[,6] <- apply(tax.tab,1,ModifyTax,ind=6)
In the above ModifyTax
function, we add x[ind]
to show that the genus level is unclassified.
You can take a look at the way I construct taxonomy tables in RDPutils (GitHub, jfq3/RDPutils). For every rank, not just genus, the taxon name begins with "uncl" or "unclassified" what ever rank the OTU is classified to. This is based on confidence provided by the RDP classifier, utax, or sintax. JQ
I have figured out how to export relative abundance tables according to their taxonomic rank, but am struggling with several "unclassified" which I would prefer to be able to classify at the next highest level. So for genus, I would like to find that same OTU group so that I can report it's family level assignment. Is this possible?
Thank you!!