kassambara / rstatix

Pipe-friendly Framework for Basic Statistical Tests in R
https://rpkgs.datanovia.com/rstatix/
440 stars, 50 forks

`anova_test()` Two-way repeated measures ANOVA results in NaN #138

Closed: kwabbsel closed this issue 2 years ago

kwabbsel commented 2 years ago

In my dataset, I have measured how an antibiotic treatment affects bacterial abundances at 7 different time points. My data looks like this:

[Screenshot 2022-01-21 at 22.55.14]

Now I want to test the treatment:time interaction using a two-way repeated measures ANOVA (within-subjects design) with anova_test(). I've been following the tutorial on Datanovia. I can reproduce every preliminary step described there (outlier detection, normality check, QQ plot), but when I run the actual anova_test() command, the output table shows NaN. I've tried subsetting my data into a much smaller dataset, but so far no luck. Can anyone help me with this issue?

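library(qiime2R)    # qza_to_phyloseq()
library(phyloseq)   # subset_samples(), prune_taxa(), taxa_sums()
library(microbiome) # alpha(), meta()
library(rstatix)    # anova_test(), get_anova_table()

##### Produce phyloseq object ######
ps <- qza_to_phyloseq(
  features = "filtered_table_no-mitochondria_no-chloroplast.qza",
  tree = "rooted-tree.qza",
  taxonomy = "taxonomy.qza",
  metadata = "metadata_combined.tsv")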

ps <- subset_samples(ps, sample.material=="Swab")

##### Alpha Diversity #####
ps1 <- prune_taxa(taxa_sums(ps) > 0, ps) #removes all taxa with 0 sequences
tab <- microbiome::alpha(ps1, index = "all") #calculates alpha diversity

ps1_meta <- meta(ps1) # Extract sample data from ps

ps1_meta$Shannon <- tab$diversity_shannon #add shannon index

# Compute two-way rANOVA
res.aov <- anova_test(
  data = ps1_meta, dv = Shannon, wid = fish.id,
  within = c(treatment, time))

get_anova_table(res.aov)

[Screenshot 2022-01-21 at 23.59.30]

kwabbsel commented 2 years ago

At first, I thought the issue was that some of my treatment and time measurements do not have an assigned fish.id. This is because I also took other samples (called water and dry_swab) from the same treatment at the same time, but here I only want to look at the effect on the fish.
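One check that may help here, sketched with made-up data: `within = c(treatment, time)` presupposes that every subject was measured under every treatment at every time point. If each fish received only one treatment, treatment would be a between-subjects factor instead. The data frame `toy` and its values are hypothetical; only the column names `fish.id`, `treatment`, and `time` come from the post above.

```r
# Hypothetical toy data: 4 fish, each under ONE treatment, measured at 3 time points
toy <- data.frame(
  fish.id   = rep(c("f1", "f2", "f3", "f4"), each = 3),
  treatment = rep(c("control", "control", "antibiotic", "antibiotic"), each = 3),
  time      = rep(c("t1", "t2", "t3"), times = 4)
)

# Rows with a missing subject ID cannot be used in a repeated measures design
n_missing_id <- sum(is.na(toy$fish.id))

# Number of distinct treatments each fish was observed under
treatments_per_fish <- tapply(toy$treatment, toy$fish.id,
                              function(x) length(unique(x)))

# TRUE only if every fish was measured under every treatment
treatment_is_within <- all(treatments_per_fish == length(unique(toy$treatment)))

# Observations per fish x treatment x time cell; a fully crossed
# within-subjects design has no empty cells
cell_counts <- table(toy$fish.id, toy$treatment, toy$time)
n_empty_cells <- sum(cell_counts == 0)

print(treatments_per_fish)
print(treatment_is_within)
print(n_empty_cells)
```

If `treatment_is_within` turns out to be `FALSE` for the real data, a mixed design with treatment as the between-subjects factor and time as the within-subjects factor may be the better fit. That is a guess about the experiment, not something confirmed in this thread.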

When I "clean" my dataset by removing these other samples

ps <- subset_samples(ps, sample.type !="Water")
ps <- subset_samples(ps, sample.type !="Dry_Swab")

and then try to re-run the ANOVA, it tells me that one of my data frame columns contains only NA values. How can that be? The only thing the two commands above did was remove some (20-30) rows...
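A possible way to narrow this down, assuming the message refers to a metadata column that was only ever filled in for the removed samples: list the columns that are entirely `NA` after subsetting, then drop them along with any factor levels that no longer occur. The data frame `meta_toy` and the column `water.volume` are hypothetical stand-ins for `ps1_meta`, not taken from the real dataset.

```r
# Hypothetical stand-in for ps1_meta after the water/dry swab rows were removed
meta_toy <- data.frame(
  fish.id      = c("f1", "f1", "f2", "f2"),
  treatment    = c("a", "b", "a", "b"),
  Shannon      = c(2.1, 1.8, 2.4, 1.9),
  water.volume = c(NA, NA, NA, NA),   # only ever recorded for water samples
  stringsAsFactors = TRUE
)

# Names of columns in which every value is NA
all_na_cols <- names(meta_toy)[colSums(!is.na(meta_toy)) == 0]
print(all_na_cols)

# Keep only informative columns and drop factor levels that no longer occur
keep <- !(names(meta_toy) %in% all_na_cols)
meta_clean <- droplevels(meta_toy[, keep, drop = FALSE])
str(meta_clean)
```

Removing rows can leave a column with nothing but `NA` when its values existed only in the removed rows, which would be consistent with the behavior described above.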

[Screenshot 2022-01-22 at 00.06.52]

mirh commented 1 year ago

Did you fix this? I'm kinda having the same issue with a mixed design.

ID  Happiness   Something   Score
1   13      1       20
1   13      2       7
2   10      1       11
2   10      2       18
3   23      1       20
3   23      2       4
4   15      1       14
4   15      2       20
5   25      1       17
5   25      2       27

test <- data.frame(
  "ID" = c(1,1,2,2,3,3,4,4,5,5),
  "Happiness" = c(13,13,10,10,23,23,15,15,25,25),
  "Something" = c(1,2,1,2,1,2,1,2,1,2),
  "Score" = c(20,7,11,18,20,4,14,20,17,27)
)
test$Something <- as.factor(test$Something)
anova_test(Score~Happiness*Something + Error(ID/Something), data=test, dv=Score, wid=ID, between=Happiness, within=Something)

And all I get is NaNs, like above.
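One possible explanation, offered as a guess rather than a confirmed diagnosis: in this example every subject has its own `Happiness` value, so treating `Happiness` as a between-subjects factor puts exactly one subject in each group. With a single subject per group there is no between-subjects residual variation left to estimate, which would leave the F statistics undefined. The check below uses only base R and the data from the comment above; the helper names `subjects`, `subjects_per_group`, and `df_between_resid` are introduced here for illustration.

```r
test <- data.frame(
  ID        = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
  Happiness = c(13, 13, 10, 10, 23, 23, 15, 15, 25, 25),
  Something = factor(c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2)),
  Score     = c(20, 7, 11, 18, 20, 4, 14, 20, 17, 27)
)

# One row per subject, then count subjects per level of the between factor
subjects <- unique(test[, c("ID", "Happiness")])
subjects_per_group <- table(subjects$Happiness)
print(subjects_per_group)

# Between-subjects residual degrees of freedom: subjects minus groups
df_between_resid <- nrow(subjects) - length(subjects_per_group)
print(df_between_resid)
```

If `Happiness` is meant as a continuous predictor, it may fit better as a covariate. Otherwise its values would need to be binned so that each group holds several subjects.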