ChiLiubio / microeco

An R package for data analysis in microbial community ecology
GNU General Public License v3.0

taxa nested bug? #341

Closed jamorillo closed 5 months ago

jamorillo commented 6 months ago

Dear Chi, I think I found a bug when trying to produce taxa plots with a nested structure. The code below is from the microeco tutorial (R version 4.3.3):

packageVersion("microeco")
[1] ‘1.5.0’

test1 <- trans_abund$new(dataset = dataset, taxrank = "Class", ntaxa = 10, high_level = "Phylum", prefix = "|")
Add higher taxonomic level into the table ...
The transformed abundance data is stored in object$data_abund ...

test1$plot_bar(ggnested = TRUE, facet = c("Group", "Type"), xtext_angle = 30)
Error in data.frame(group = groups, group_colour = clr_pal[1:n_clrs]) : arguments imply differing number of rows: 0, 1

I get the same error with my data. Thanks!

ChiLiubio commented 6 months ago

Hi @jamorillo. Thanks. I have the same configuration and the steps run fine for me. I guess the issue comes from the compatibility between ggplot2 and other dependent packages. The last time I updated only the ggplot2 package, I got the same error. After I updated all the packages, everything worked. If the error is still there, you can reinstall R and all the packages.
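
Since the suspected cause is a version mismatch between ggplot2 and the packages that depend on it, it can help to record which versions are actually installed before and after updating. The following is a minimal sketch using only base R utilities. The package list is an assumption based on the packages named in this thread, so adjust it to your setup.

```r
# Packages mentioned in this thread; extend the vector as needed (assumption).
pkgs <- c("microeco", "ggplot2", "ggnested")

# Look up installed versions without failing on packages that are missing.
installed <- rownames(installed.packages())
versions <- vapply(
  pkgs,
  function(p) if (p %in% installed) as.character(packageVersion(p)) else NA_character_,
  character(1)
)
print(versions)

# To update every package available from the configured repositories, uncomment:
# update.packages(ask = FALSE)
```

Packages installed from sources other than the configured repositories (for example GitHub) are not touched by update.packages() and have to be reinstalled separately.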

jamorillo commented 6 months ago

My R (4.3.3) and microeco (1.5.0) versions are the same as yours, and I installed them on a new computer only a few days ago. However, I installed more libraries afterwards, so perhaps one of the other packages altered something. I will have a more detailed look and reinstall the packages. Thank you!

ChiLiubio commented 6 months ago

Ok. Please feel free to tell me if it is still there. I was thinking that if R had a dedicated environment like conda on Linux, we wouldn't have such an issue.

srisvs33 commented 6 months ago

@jamorillo Did you manage to solve the issue? I am having a similar issue. I even removed all the packages and reinstalled everything, but I am still unable to solve the problem. Any help would be much appreciated.

Of the three examples in the tutorial, I was able to reproduce only the figure for # fixed number in each phylum. The other two code examples did not work.

# require ggnested package; see https://chiliubio.github.io/microeco_tutorial/intro.html#dependence

test1 <- trans_abund$new(dataset = dataset, taxrank = "Class", ntaxa = 10, high_level = "Phylum", prefix = "\|")
test1$plot_bar(ggnested = TRUE, facet = c("Group", "Type"), xtext_angle = 30)

# fixed number in each phylum

test1 <- trans_abund$new(dataset = dataset, taxrank = "Class", ntaxa = 30, show = 0, high_level = "Phylum", high_level_fix_nsub = 4)
test1$plot_bar(ggnested = TRUE, xtext_angle = 30, facet = c("Group", "Type"))

# sum others in each phylum

test1 <- trans_abund$new(dataset = dataset, taxrank = "Class", ntaxa = 20, show = 0, high_level = "Phylum", high_level_fix_nsub = 3, prefix = "\|")
test1$plot_bar(ggnested = TRUE, high_level_add_other = TRUE, xtext_angle = 30, facet = c("Group", "Type"))

I am also keeping @ChiLiubio in the loop.

Any help would be very much appreciated.

Thanks and regards, Venkat

ChiLiubio commented 6 months ago

Hi Venkat. Are you using the example dataset? If so, prefix should be "|". I reinstalled R and all the packages, and ran those steps successfully.

library(microeco)
library(ggplot2)

test1 <- trans_abund$new(dataset = dataset, taxrank = "Class", ntaxa = 10, high_level = "Phylum", prefix = "|")
test1$plot_bar(ggnested = TRUE, facet = c("Group", "Type"), xtext_angle = 30)

# fixed number in each phylum

test1 <- trans_abund$new(dataset = dataset, taxrank = "Class", ntaxa = 30, show = 0, high_level = "Phylum", high_level_fix_nsub = 4)
test1$plot_bar(ggnested = TRUE, xtext_angle = 30, facet = c("Group", "Type"))

# sum others in each phylum

test1 <- trans_abund$new(dataset = dataset, taxrank = "Class", ntaxa = 20, show = 0, high_level = "Phylum", high_level_fix_nsub = 3, prefix = "|")
test1$plot_bar(ggnested = TRUE, high_level_add_other = TRUE, xtext_angle = 30, facet = c("Group", "Type"))
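
The two spellings of the prefix that appear in this thread are different strings. In R source code, "\|" with a single backslash is not a valid string literal, "\\|" is a two-character string (a backslash followed by a pipe, which is a regex-escaped pipe), and "|" is the single pipe character. The sketch below uses only base R on a made-up lineage string to show the difference. How microeco handles prefix internally is not shown in this thread, so treat this as an illustration of string escaping rather than of the package's behaviour.

```r
# Hypothetical lineage string in the "rank__name|rank__name" style.
lineage <- "k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria"

# "|" is one character; "\\|" is two (backslash + pipe).
nchar("|")
nchar("\\|")

# Splitting on a literal pipe: either treat the pattern as fixed text ...
parts_fixed <- strsplit(lineage, "|", fixed = TRUE)[[1]]
# ... or escape the pipe so the regex engine reads it literally.
parts_regex <- strsplit(lineage, "\\|")[[1]]

print(parts_fixed)
identical(parts_fixed, parts_regex)
```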

srisvs33 commented 6 months ago

Dear @ChiLiubio

I can now confirm that your fix (prefix should be "|") worked. Thank you very much for your assistance and for developing this wonderful package.

Regards Venkat

jamorillo commented 6 months ago

Sorry for the delay in replying. Excellent, with that prefix it also works for me. Thank you! Regards, Jose

milyzhou commented 5 months ago

Hi Chi, I wanted to create hierarchical abundance data for two levels ("Superclass1" and "Phylum"), and it worked, but the results confused me: the relative abundance (%) exceeds 100%.


Here is my code:

test$tidy_dataset()
print(test)
test$cal_abund(select_cols = c("Superclass1", "Phylum", "Genus"), rel = TRUE)
test1 <- trans_abund$new(test, taxrank = "Phylum", ntaxa = 10, delete_taxonomy_lineage = T)
test1$plot_bar(facet = "group")
test1 <- trans_abund$new(dataset = test, taxrank = "Phylum", high_level = "Superclass1", prefix = "|")
Add higher taxonomic level into the table ...
The transformed abundance data is stored in object$data_abund ...
test1$plot_bar(ggnested = TRUE, facet = c("group", "time"))

ChiLiubio commented 5 months ago

Hi. Please attach your test object so that I can reproduce the problem and see how it happened. To save it, please follow the tutorial (https://chiliubio.github.io/microeco_tutorial/notes.html#save-function) and attach the compressed file.

milyzhou commented 5 months ago

test_data.zip Here is my test object. Thanks again.

ChiLiubio commented 5 months ago

Hi. I've found the issue. The nested bar method was originally intended for taxonomic data, such as 16S, with strict hierarchical relationships. In this mixed case, however, there is no strict correspondence between the superclasses and the phyla: different superclasses contain the same phyla, which leads to unexpected and erroneous results. For instance, many bars in the figure are duplicated, and if you view the intermediate table with View(test1$data_abund), you'll see that the data is duplicated as well. Although the method is not designed for such data, we can work around it with a few tricks. First, we need to make the superclasses and phyla correspond strictly. One way is to append something unique to each Phylum name:

library(magrittr)
library(microeco)
load("dataset.RData")
d1 <- clone(test)
d1$tax_table$Phylum <- paste0(d1$tax_table$Phylum, ": ", d1$tax_table$Superclass1)
d1$cal_abund(select_cols = c("Superclass1", "Phylum", "Genus"), rel = TRUE)

Plot the results. Seems to be fine.

test1 <- trans_abund$new(dataset = d1, taxrank = "Phylum", high_level = "Superclass1", prefix = "|")
test1$plot_bar(ggnested = TRUE, facet = c("group", "time"))

Then we need to delete the suffix that we added, which means manipulating the intermediate data.

# delete added things
test1$data_abund$Taxonomy %<>% gsub(": .*", "", .)
View(test1$data_abund)
# data_taxanames should also be modified
test1$data_taxanames <- c("p__Firmicutes", "p__Proteobacteria", "p__Actinobacteria")
test1$plot_bar(ggnested = TRUE, facet = c("group", "time"))
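
The workaround above rests on one condition: every Phylum label must belong to exactly one Superclass1 label. The sketch below checks that condition on a small made-up taxonomy table (the data frame and its values are hypothetical, not taken from the attached dataset), applies the same paste0 disambiguation, and then strips the suffix again with the same gsub pattern.

```r
# Toy taxonomy table: p__Firmicutes appears under two superclasses.
tax <- data.frame(
  Superclass1 = c("A", "A", "B", "B"),
  Phylum = c("p__Firmicutes", "p__Proteobacteria", "p__Firmicutes", "p__Actinobacteria"),
  stringsAsFactors = FALSE
)

# Count distinct parents per phylum; any value above 1 breaks strict nesting.
parents_per_phylum <- tapply(tax$Superclass1, tax$Phylum, function(x) length(unique(x)))
ambiguous <- names(parents_per_phylum)[parents_per_phylum > 1]
print(ambiguous)

# Disambiguate by appending the parent label, as in the workaround above.
tax$Phylum_unique <- paste0(tax$Phylum, ": ", tax$Superclass1)
parents_after <- tapply(tax$Superclass1, tax$Phylum_unique, function(x) length(unique(x)))
all(parents_after == 1)

# Remove the suffix again for display.
display_names <- gsub(": .*", "", tax$Phylum_unique)
unique(display_names)
```

If ambiguous comes back empty for your own tax_table, the two ranks are already strictly nested and the workaround should not be needed.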

milyzhou commented 5 months ago

Thanks again. It worked.