cafferychen777 / ggpicrust2

Make Picrust2 Output Analysis and Visualization Easier
https://cafferychen777.github.io/ggpicrust2/
MIT License

pathway_errorbar(): "Error in guide_train.prism_offset_minor" when ggprism loaded #20

Closed by erikpark 1 year ago

erikpark commented 1 year ago

Hi again, different issue this time!

Because the KEGG database annotation was taking so long for me, I decided to run through the workflow for the EC annotation. Much faster! But I am still running into some new errors at the pathway_errorbar() stage. Below is my code, and data so you can hopefully reproduce the issue. But first, my environment is as follows:

package      loadedversion
ape          5.7-1
dplyr        1.1.2
forcats      1.0.0
genefilter   1.80.3
ggpicrust2   1.6.2
ggplot2      3.4.2
ggprism      1.0.4
ggpubr       0.6.0
HTSSIP       1.4.1
lattice      0.21-8
lubridate    1.9.2
patchwork    1.1.2
permute      0.9-7
phyloseq     1.42.0
purrr        1.0.1
readr        2.1.4
stringr      1.5.0
tibble       3.2.1
tidyr        1.3.0
tidyverse    2.0.0
vegan        2.6-4
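For anyone who wants to report their environment the same way, here is a minimal base R sketch that lists the attached packages with their versions. The object names `attached` and `versions` are made up for illustration, and the original listing may well have been produced with a different tool.

```r
# Names of all packages currently on the search path (strip the "package:" prefix).
attached <- sub("^package:", "", grep("^package:", search(), value = TRUE))

# Look up the installed version of each attached package.
versions <- data.frame(
  package = attached,
  loadedversion = vapply(
    attached,
    function(p) as.character(utils::packageVersion(p)),
    character(1)
  ),
  row.names = NULL
)

# Print alphabetically, without row numbers.
print(versions[order(versions$package), ], row.names = FALSE)
```

Running this after the library() calls gives a table that can be pasted straight into a bug report.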

Now my code:

library(phyloseq)
library(ggplot2)
library(ape)
library(vegan)
library(ggpubr)
library(tidyverse)
library(genefilter)
library(HTSSIP)
library(ggprism)
library(patchwork)
library(ggpicrust2)

metadata <- read_table("../Input files/ADWMBAT Combined Metadata.txt")

metadata <- metadata %>%
  filter(sex == "F" & all_data == "Y") %>%
  select(!X8) %>%
  mutate(genotype = as.factor(genotype))
# read in sample meta data as a tibble, and set genotype as a factor

ko_abundance <-
  read.delim(
    "../Data/For PICRUSt/picrust2_out_pipeline/EC_metagenome_out/pred_metagenome_unstrat.tsv"
  )
# load the EC count data in

rownames(ko_abundance) <- ko_abundance$function.
ko_abundance <- ko_abundance[, -1]
# remove the first column of function names by setting them as the row names. This is done because it's the format the next step, pathway_daa(), expects to see.

ko_abundance <- ko_abundance %>%
  rename('24' = X24,
         '26' = X26)
# rename some column names to match metadata file

daa_results_df <-
  pathway_daa(
    abundance = ko_abundance,
    metadata = metadata,
    group = "genotype",
    daa_method = "LinDA",
    select = NULL,
    reference = "WT",
    p.adjust = "none"
  )
# run the differential abundance calculation step using LinDA and no p-value adjustment

daa_annotated_sub_method_results_df <-
  pathway_annotation(pathway = "EC",
                     daa_results_df = daa_results_df,
                     ko_to_kegg = FALSE)

# select the top 15 differentially abundant features
# this is done because there were too many "significant" features without FDR correction
daa_annotated_sub_method_results_df_filtered <- daa_annotated_sub_method_results_df %>%
  arrange(p_adjust) %>%
  slice_head(n = 15)

ko_abundance_plot <- ko_abundance %>%
  filter(rownames(ko_abundance) %in% daa_annotated_sub_method_results_df_filtered$feature)
# above was attempted to see if the ko_abundance object needed to be the same dimensions as the daa_results. didn't help.

daa_results_list <-
  pathway_errorbar(
    abundance = ko_abundance,
    daa_results_df = daa_annotated_sub_method_results_df_filtered,
    Group = "genotype",
    p_values_threshold = 0.05,
    order = "group",
    select = NULL,
    ko_to_kegg = FALSE,
    p_value_bar = TRUE,
    colors = NULL,
    x_lab = "description"
  )

print(daa_results_list)

For me executing all this code produces this error:

"Error in guide_train.prism_offset_minor(guide, panel_params[[aesthetic]]) : No minor breaks exist, guide_prism_offset_minor needs minor breaks to work
In addition: Warning messages:
1: Removed 15 rows containing missing values (geom_bar()).
2: Removed 15 rows containing missing values (geom_stripped_cols())."

Trying to view() the daa_results_list object issues the following error: "Error: Index out of bounds"
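The warnings report 15 removed rows, the same number of features kept by slice_head(n = 15), so it may be worth checking that every feature in the results can actually be found in the abundance table before plotting. The sketch below uses small made-up stand-ins (`abundance`, `daa_results`, and the EC identifiers are hypothetical, not the attached data) and only illustrates the check. It does not claim to reproduce what pathway_errorbar() does internally.

```r
# Toy abundance table: features as row names, one column per sample (hypothetical values).
abundance <- data.frame(
  S1 = c(10, 0, 5),
  S2 = c(12, 1, 4),
  S3 = c(3, 7, 6),
  S4 = c(2, 9, 5),
  row.names = c("EC:1.1.1.1", "EC:2.7.7.7", "EC:3.1.3.1")
)

# Toy differential abundance results (hypothetical values).
daa_results <- data.frame(
  feature = c("EC:1.1.1.1", "EC:3.1.3.1", "EC:9.9.9.9"),
  p_adjust = c(0.01, 0.03, 0.04)
)

# Features listed in the results that have no matching row in the abundance table.
missing_features <- setdiff(daa_results$feature, rownames(abundance))

if (length(missing_features) > 0) {
  message("Features without abundance rows: ",
          paste(missing_features, collapse = ", "))
}
```

If `missing_features` is empty for the real objects, the feature names line up and the problem more likely lies elsewhere, for example in the grouping argument.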

Here are my files: pred_metagenome_unstrat.tsv.gz

ADWMBAT Combined Metadata.txt

Thanks for any help you can provide!

erikpark commented 1 year ago

Quick note that I added the following code above pathway_errorbar() just in case the group had to be hard-coded as an object:

Group <- metadata$genotype

pathway_errorbar(
    abundance = ko_abundance,
    daa_results_df = daa_annotated_sub_method_results_df_filtered,
    Group = Group,
    p_values_threshold = 0.05,
    order = "group",
    select = NULL,
    ko_to_kegg = FALSE,
    p_value_bar = TRUE,
    colors = NULL,
    x_lab = "description"
  )

And now I get the following error: "Error in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : polygon edge not found"

Progress?
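The change from Group = "genotype" to Group = metadata$genotype swaps a single column name for a vector with one label per sample. Here is a minimal sketch of how the two differ and how a label vector can be put in the same order as the abundance columns. The data are made up, and the `sample_name` column is a hypothetical name for whatever column holds the sample IDs in the real metadata.

```r
# Toy abundance table with one column per sample (hypothetical values).
abundance <- data.frame(
  S1 = c(10, 0), S2 = c(12, 1), S3 = c(3, 7), S4 = c(2, 9),
  row.names = c("EC:1.1.1.1", "EC:2.7.7.7")
)

# Toy metadata, deliberately in a different order than the abundance columns.
metadata <- data.frame(
  sample_name = c("S3", "S1", "S4", "S2"),
  genotype = factor(c("KO", "WT", "KO", "WT"))
)

group_string <- "genotype"         # a single column name, as in the first call
group_vector <- metadata$genotype  # one label per sample, as in the second call

# Only the vector has as many entries as there are sample columns.
string_matches <- length(group_string) == ncol(abundance)
vector_matches <- length(group_vector) == ncol(abundance)

# Reorder the labels so that entry i describes column i of the abundance table.
idx <- match(colnames(abundance), metadata$sample_name)
group_aligned <- metadata$genotype[idx]
names(group_aligned) <- colnames(abundance)
```

An NA in `group_aligned` would point to a sample column that has no metadata row, which is worth fixing before any plotting call.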

erikpark commented 1 year ago

Possibly found the issue, though I'll leave this open until you have a chance to verify.

I think it was related to me not having some required fonts installed. I followed the steps suggested on this Stack Overflow thread: https://stackoverflow.com/questions/71362738/r-error-in-grid-callc-textbounds-as-graphicsannotxlabel-xx-xy-polygo

And now I can generate the barplots! If you agree that missing fonts might have been the issue, then it might be good to require them when the package installs.
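One rough way to probe for a font problem without building the whole plot is to draw a short piece of text in a given font family and catch any error. The helper below, `font_renders()`, is a hypothetical name and not part of ggpicrust2 or ggprism. It draws on a temporary PDF device, which can behave differently from the screen device where the "polygon edge not found" error appeared, so a TRUE result here does not rule out a font issue on that device.

```r
# Hypothetical helper: TRUE if text in `family` can be drawn on a temporary PDF device.
font_renders <- function(family = "sans") {
  tmp <- tempfile(fileext = ".pdf")
  grDevices::pdf(tmp)
  dev_id <- grDevices::dev.cur()
  # Always close the temporary device and delete the file, even on error.
  on.exit({
    grDevices::dev.off(dev_id)
    unlink(tmp)
  }, add = TRUE)

  tryCatch({
    grid::grid.newpage()
    grid::grid.text("font test", gp = grid::gpar(fontfamily = family))
    TRUE
  }, error = function(e) {
    message("Could not draw text in family '", family, "': ", conditionMessage(e))
    FALSE
  })
}

sans_ok <- font_renders("sans")
```

The same tryCatch() pattern can be wrapped around print() of the real plot to capture the error message on the device that is actually in use.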

cafferychen777 commented 1 year ago

Hello,

Thank you for reaching out and sharing your findings regarding the issue you encountered with our package. As the developer, I appreciate your effort in troubleshooting and finding a possible solution.

I will make sure to add this information to our package's Q&A repository, so that other users who might run into similar issues in the future can find a solution more easily. Thank you for bringing this to our attention.

Once again, if you have any further questions or concerns, please don't hesitate to reach out. We are always looking for ways to improve our package and provide a better user experience.

Best regards,

Chen YANG