adw96 / breakaway

Species richness with high diversity

Can this package be used for accurate calculation of alpha diversity in meta-analysis? #189

Closed hlk-slowlearner closed 1 year ago

hlk-slowlearner commented 1 year ago

Hello, I would like to ask whether breakaway's method for estimating alpha diversity can be applied to microbial meta-analyses, such as a comparison of the endophytic microbial communities of multiple plants in different environments. Because meta-analyses tend to face a high degree of heterogeneity, I don't think it is a good idea to fix the sequencing depth at a single value in a meta-analysis, and it is hard to arrive at a more realistic alpha diversity estimate. breakaway seems to have the potential for this. At the same time, I note that its estimation method itself draws on meta-analysis. With my current level of statistics, I find it difficult to judge whether this package is really suitable for meta-analysis studies and how to use it appropriately. For example, after estimating the breakaway richness of each control and experimental sample in a study, how do I convert them into an overall effect size for the study when each richness value has an estimate and an error but a sample size of only 1?

Looking forward to your response!

adw96 commented 1 year ago

Hello and thank you for your interest in breakaway. Indeed, this package is suitable for a meta-analysis. My recommendation is to use the function `betta`, which is a regression model for alpha diversity. You would pass the alpha diversity estimates from each study as the `chats` argument and the standard errors of those estimates as the `ses` argument. You can then use `formula` or `X` to specify the regression model that you want to use in your meta-analysis. Good luck and I hope you find this helpful!
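A minimal sketch of that call, assuming the breakaway package is installed. The data frame, its column names, and all numbers below are made up for illustration; only `betta` and its `chats`, `ses`, and `X` arguments come from the advice above.

```r
library(breakaway)

# Hypothetical per-sample richness estimates and standard errors,
# such as those produced by running breakaway() on each sample.
# All numbers are invented for illustration.
richness_df <- data.frame(
  estimate = c(120, 135, 150, 210, 190, 205),
  error    = c(10, 12, 9, 15, 14, 11),
  group    = factor(c("control", "control", "control",
                      "treated", "treated", "treated"))
)

# Design matrix: an intercept (control) plus a treated-vs-control column.
X <- model.matrix(~ group, data = richness_df)

# Regress the richness estimates on group, accounting for their standard errors.
fit <- betta(chats = richness_df$estimate,
             ses   = richness_df$error,
             X     = X)

# Coefficient table: one row per column of X.
fit$table
```

In this sketch the second row of `fit$table` corresponds to the difference in richness between the hypothetical treated and control groups.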

hlk-slowlearner commented 1 year ago

I have received your message and will reply as soon as possible.

hlk-slowlearner commented 1 year ago

Thank you very much for your affirmation and advice. It saved me a lot of headaches!

hlk-slowlearner commented 1 year ago

Dear Amy @adw96, I tried to use breakaway on my data after getting your confirmation. Unfortunately, a warning and an error were reported, and I do not know how to solve the problem. I have around 1300 samples from 28 different studies (their very different sequencing depths are one of the reasons I wanted to use breakaway). The main R code I ran is shown below:

```
> richness=ps %>%breakaway #ps is a phyloseq object
Warning:Warning message in poisson_model(input_data, cutoff = cutoff):
“Cut-off was too low: no data available for estimation”
Error:Warning message in poisson_model(input_data, cutoff = cutoff):
“Cut-off was too low: no data available for estimation”
Error:Warning message in poisson_model(input_data, cutoff = cutoff):
“Cut-off was too low: no data available for estimation”
.........

meta <- sample_data(ps) %>%
  as_tibble %>%
  mutate("sample_names" = ps %>% sample_names)
combined_richness <- meta %>%
  left_join(summary(richness), by = "sample_names")
bt_day_fixed <- betta(chats = combined_richness$estimate,
                      ses = combined_richness$error,
                      X = model.matrix(~study_group-1, data = combined_richness))
Error:Error in betta(chats = combined_richness$estimate, ses = combined_richness$error, : The starting value and 200 perturbations were not enough to find a maximum likelihood solution. Please try again with a new choice of initial_est.
Traceback:

  1. betta(chats = combined_richness$estimate, ses = combined_richness$error,
   .     X = model.matrix(~study_group - 1, data = combined_richness))
  2. stop(paste("The starting value and 200 perturbations were not",
   .     "enough to find a maximum likelihood solution.", "Please try again with a new choice of initial_est."))
```

I have looked at some other issues with similar warnings, and it seems that the richness estimates are still acceptable even when their standard deviation is low. What confuses me is the error raised by the `betta` function afterwards. I suspect it may be because several of the standard errors in my data are 0, but I am not sure. Do you have any solution to this error message?
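One way to check the suspicion about zero standard errors is to count them before calling `betta`. The sketch below uses a small made-up stand-in for `combined_richness`; the `estimate` and `error` column names follow the code above, and the values are invented. It only flags the affected rows; whether they should be dropped or handled differently is a separate question that this check does not settle.

```r
# Hypothetical stand-in for combined_richness; all values are invented.
combined_richness <- data.frame(
  sample_names = paste0("s", 1:6),
  estimate     = c(120, 135, NA, 210, 190, 205),
  error        = c(10, 0, NA, 15, 0, 11)
)

# Flag rows whose standard error is missing or exactly zero.
problem <- is.na(combined_richness$error) | combined_richness$error == 0
n_problem <- sum(problem)

# Inspect the flagged samples.
combined_richness[problem, ]

# Rows that have both a usable estimate and a positive standard error.
usable <- combined_richness[!problem & !is.na(combined_richness$estimate), ]
```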