cafferychen777 / ggpicrust2

Make Picrust2 Output Analysis and Visualization Easier
https://cafferychen777.github.io/ggpicrust2/
MIT License

ggpicrust2(): Error in metadata_mat[, group] : subscript out of bounds #14

Closed yliu3089 closed 1 year ago

yliu3089 commented 1 year ago

Hi Caffery, this is Frank from Penn State University. Your development of ggpicrust2 is really impressive! I feel blessed to have such a package that can help visualize the PICRUSt2 output. I encountered a problem while running the package and hope you can help. I ran PICRUSt2 following the pipeline at https://github.com/picrust/picrust2/wiki/PICRUSt2-Tutorial-%28v2.5.0%29 and used its output as the input to ggpicrust2. Then I ran the ggpicrust2 script you wrote on the website:

metadata <-
  read_delim(
    "metadata.tsv",
    delim = "\t",
    escape_double = FALSE,
    trim_ws = TRUE
  )

group <- "Enviroment"

daa_results_list <-
  ggpicrust2(
    file = "pred_metagenome_unstrat.tsv",
    metadata = metadata,
    group = "Environment",
    pathway = "KO",
    daa_method = "LinDA",
    p_values_bar = TRUE,
    p.adjust = "BH",
    ko_to_kegg = TRUE,
    order = "pathway_class",
    select = NULL,
    reference = NULL # If your metadata[,group] has more than two levels, please specify a reference.
  )

It gives the following error message:

Error in metadata_mat[, group] : subscript out of bounds

Do you have any suggestions on how to fix this error? Maybe I need to change the group parameter? Thank you.

Best, Frank

cafferychen777 commented 1 year ago

Dear Frank,

Thank you for your kind words about ggpicrust2. I'm glad to hear that it has been helpful to you. Regarding the error message you received, it seems that there may be an issue with the "group" parameter in the ggpicrust2 function.

To fix this error, please make sure that the value passed to "group" is a valid column name in the metadata file, with exactly the same spelling and capitalization as in the file.

For example, if your metadata file has a column named "Environment", the "group" parameter in the ggpicrust2 function should be "Environment" (with a capital "E").
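
The check described above can be sketched in base R. This is only an illustration: the metadata values are made up, `check_group()` is a hypothetical helper that is not part of ggpicrust2, and `metadata_mat` is assumed to be a matrix copy of the metadata, named after the object in the error message.

```r
# Hypothetical metadata; only the column names matter for this check.
metadata <- data.frame(
  sample_name = c("S1", "S2", "S3", "S4"),
  Environment = c("soil", "soil", "water", "water")
)

# Assumption: the package indexes a matrix version of the metadata by column name.
metadata_mat <- as.matrix(metadata)

# Hypothetical helper: fail early with a readable message if the column is missing.
check_group <- function(group, metadata) {
  if (!group %in% colnames(metadata)) {
    stop(sprintf(
      "Column '%s' not found. Available columns: %s",
      group, paste(colnames(metadata), collapse = ", ")
    ))
  }
  invisible(TRUE)
}

misspelled <- "Enviroment"  # missing the second "n"

# Indexing a matrix by a column name that does not exist raises the reported error.
err_raw <- tryCatch(metadata_mat[, misspelled],
                    error = function(e) conditionMessage(e))

# The explicit check reports which columns are actually available.
err_checked <- tryCatch(check_group(misspelled, metadata),
                        error = function(e) conditionMessage(e))

# The correctly spelled name passes.
ok <- check_group("Environment", metadata)

print(err_raw)
print(err_checked)
```

Running `colnames(metadata)` right after reading the file is usually enough to spot a spelling or capitalization mismatch.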

Please let me know if this resolves the issue, or if you have any further questions or concerns.

Best regards, Caffery

Nathanielhubert commented 1 year ago

Firstly, I would like to second Frank's sentiments - this is an awesome package and thank you for your hard work in making it available.

I was having this same problem, Frank, but came across the section of the tutorial that explains if you are having errors, please try...

...and it worked up to the errorbar module.

This is what worked for me:

# metadata should be tibble.
library(readr)
library(ggpicrust2)
library(tibble)
library(tidyverse)
library(ggprism)
library(patchwork)

metadata <-
  read_delim(
    "map.txt",
    delim = "\t",
    escape_double = FALSE,
    trim_ws = TRUE
  )

kegg_abundance <- ko2kegg_abundance("pred_metagenome_unstrat.txt")

group <- "pre_post_intervention"

daa_results_df <-
  pathway_daa(
    abundance = kegg_abundance,
    metadata = metadata,
    group = group,
    daa_method = "ALDEx2",
    select = NULL,
    reference = NULL
  )

But now I am getting the following errors:

daa_results_list <-
  pathway_errorbar(
    abundance = kegg_abundance,
    daa_results_df = daa_annotated_sub_method_results_df,
    Group = metadata$pre_post_intervention,
    ko_to_kegg = TRUE,
    p_values_threshold = 0.0005,
    order = "pathway_class",
    select = NULL,
    p_value_bar = TRUE,
    colors = NULL,
    x_lab = NULL
  )

Error in pathway_errorbar(abundance = kegg_abundance, daa_results_df = daa_annotated_sub_method_results_df, : The feature with statistically significance are more than 30, the visualization will be terrible. Please use select to reduce the number. Now you have "ko05146", "ko00120", "ko05410", "ko00472", "ko00121", "ko05150", "ko04020", "ko01053", "ko00603", "ko00540", "ko00623", "ko04974", "ko00643", "ko00364", "ko00040", "ko03450", "ko00511", "ko00361", "ko04210", "ko04973", "ko03008", "ko03320", "ko00510", "ko00311", "ko00600", "ko03070", "ko00980", "ko05340", "ko01051", "ko00740", "ko05222", "ko05416", "ko05210", "ko04115", "ko00053", "ko05145", "ko00440", "ko00531", "ko04142", "ko00860", "ko00360", "ko00604", "ko00350", "ko00190", "ko00982", "ko00760", "ko00960", "ko00480", "ko00720", "ko00780"

daa_results_list <-
  pathway_errorbar(
    abundance = kegg_abundance,
    daa_results_df = daa_annotated_sub_method_results_df,
    Group = metadata$pre_post_intervention,
    ko_to_kegg = TRUE,
    p_values_threshold = 0.0001,
    order = "pathway_class",
    select = NULL,
    p_value_bar = TRUE,
    colors = NULL,
    x_lab = NULL
  )

Warning message: In cbind(nonsense = "nonsense", pathway_class_y = pathway_class_y, : number of rows of result is not a multiple of vector length (arg 2)

I understand the first error is because there are too many pathways for the output figure, but the second error message is confusing me.
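
One way to stay under the 30-feature limit from the first error is to build a `select` vector from the most significant features. The sketch below uses a made-up results table; the `p_adjust` column appears later in this thread, while the `feature` column name and the limit of exactly 30 are assumptions to verify against your own `daa_results_df`.

```r
# Hypothetical DAA results: 40 features with made-up adjusted p-values.
daa_results_df <- data.frame(
  feature = sprintf("ko%05d", 1:40),
  p_adjust = seq(0.00001, 0.0004, length.out = 40),
  stringsAsFactors = FALSE
)

max_features <- 30   # assumed limit, taken from the wording of the error message
threshold <- 0.0005  # same threshold as in the failing call

# Keep the significant features and rank them by adjusted p-value.
significant <- daa_results_df[daa_results_df$p_adjust < threshold, ]
ranked <- significant[order(significant$p_adjust), ]

# Take at most max_features of the most significant ones.
selected_features <- head(ranked$feature, max_features)

print(length(selected_features))
print(head(selected_features))
```

The resulting vector could then be passed as `select = selected_features` in the `pathway_errorbar()` call instead of `select = NULL`. Lowering `p_values_threshold`, as in the second call above, is the other way to reduce the count.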

I cannot seem to resolve these problems and would like to get the beautiful output associated with this package. I have attached the intermediate results here, if you have any recommendations, they would be greatly appreciated. Thank you, Nate

daa_annotated_sub_method_results_df.txt daa_results_df.txt

cafferychen777 commented 1 year ago

Hello @Nathanielhubert ,

Thank you for your kind words about ggpicrust2! Could you share your metadata so that I can reproduce the results and find out why the errors occur?

Best regards,

Nathanielhubert commented 1 year ago

Hello Caffery,

Thank you so much for getting back to me, I have been struggling to get this to work. Most recently, the error I am getting is:

pathway_errorbar <-
  pathway_errorbar(
    abundance = kegg_abundance,
    daa_results_df = daa_Welchs_results_0001_df_annotated,
    Group = Group,
    p_values_threshold = 0.0001,
    order = "pathway_class",
    select = NULL,
    ko_to_kegg = TRUE,
    p_value_bar = TRUE,
    colors = NULL,
    x_lab = "pathway_name"
  )

Warning message: In cbind(nonsense = "nonsense", pathway_class_y = pathway_class_y, : number of rows of result is not a multiple of vector length (arg 2)

I have attached "kegg_abundance.txt" and "daa_Welchs_results_0001_df_annotated.txt" here.

Any help would be greatly appreciated! Thank you, Nate

cafferychen777 commented 1 year ago

Hello @Nathanielhubert ,

Thank you for your response! However, .txt files sent by email do not show up properly on GitHub. Could you upload them directly on GitHub?

Best Regards,

Nathanielhubert commented 1 year ago

Yes, Thank you for your help and sorry for the confusion. Does this work?

kegg_abundance.txt daa_Welchs_results_0001_df_annotated.txt

cafferychen777 commented 1 year ago

OK, it works well.

Nathanielhubert commented 1 year ago

Could the row names be formatted differently?

cafferychen777 commented 1 year ago

@Nathanielhubert It works well. But I still need the metadata file. 😂

Nathanielhubert commented 1 year ago

QIIME_map_comp4_prePostOnly.txt

cafferychen777 commented 1 year ago

Hello @Nathanielhubert ,

It works well on my Mac. You can check the screenshot and the code.

As for the warning message "In cbind(nonsense = "nonsense", pathway_class_y = pathway_class_y, : number of rows of result is not a multiple of vector length (arg 2)", it is harmless and can be ignored.

[Screenshot 2023-05-04 14 50 28]

Best regards, Nathanielhubert.zip

cafferychen777 commented 1 year ago

Hello @Nathanielhubert, the warning comes from a shortcut in the code, where a string is used in place of a vector in one spot. It doesn't have any effect on the result.
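
For readers curious about where this kind of warning comes from, here is a minimal base R sketch. It does not reproduce the package's internal code; the vectors are hypothetical and only show that `cbind()` warns when a shorter vector cannot be recycled evenly to the number of rows.

```r
# Hypothetical inputs: the second argument has length 2, the third has length 3.
pathway_class_y <- c("Metabolism", "Cellular Processes")
pathway_name <- c("ko00010", "ko00020", "ko00030")

# Capture the warning text instead of printing it to the console.
warning_text <- NULL
result <- withCallingHandlers(
  cbind(
    nonsense = "nonsense",              # length 1, recycled without complaint
    pathway_class_y = pathway_class_y,  # length 2, does not divide 3 rows
    pathway_name = pathway_name
  ),
  warning = function(w) {
    warning_text <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  }
)

print(warning_text)
print(dim(result))
```

The matrix is still built, with the shorter vector recycled to fill the extra row, which is why a warning of this kind does not stop the plot from being drawn.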

cafferychen777 commented 1 year ago

The zip includes the resulting figure, the code, and the dataset.

Nathanielhubert commented 1 year ago

I think I have it working! Thank you! Is there a table that has the LFC data and error bar data?

cafferychen777 commented 1 year ago

I noticed a small display problem with the adjusted p-values. You can add one line of code (the round() call below) to avoid it.

library(readr)
library(ggpicrust2)
library(tidyverse)
library(patchwork)
library(ggprism)
daa_Welchs_results_0001_df_annotated <-
  read.delim(
    "~/Microbiome/ggpicrust2总/ggpicrust2测试/ggpicrust2_test/Nathanielhubert/daa_Welchs_results_0001_df_annotated.txt"
  )

kegg_abundance <-
  read.delim(
    "~/Microbiome/ggpicrust2总/ggpicrust2测试/ggpicrust2_test/Nathanielhubert/kegg_abundance.txt",
    row.names = 1
  )

metadata <-
  read_delim(
    "Nathanielhubert/QIIME_map_comp4_prePostOnly.txt",
    delim = "\t",
    escape_double = FALSE,
    trim_ws = TRUE
  )

daa_Welchs_results_0001_df_annotated$p_adjust <- round(daa_Welchs_results_0001_df_annotated$p_adjust,5)

p <- pathway_errorbar(
  abundance = kegg_abundance,
  daa_results_df = daa_Welchs_results_0001_df_annotated,
  Group = metadata$pre_post_intervention,
  p_values_threshold = 0.0001,
  order = "pathway_class",
  select = NULL,
  ko_to_kegg = TRUE,
  p_value_bar = TRUE,
  colors = NULL,
  x_lab = "pathway_name"
)

The package wasn't designed to output that table, but I can find the values in the source code. Give me some time.

cafferychen777 commented 1 year ago

errorbar.csv log2fold.csv @Nathanielhubert

Nathanielhubert commented 1 year ago

Thank you! Can you show me how to get those for myself for the future? And means as well? (sorry, if too much, I understand and will work to figure it out for the future) Thank you so much for your help!

cafferychen777 commented 1 year ago

Absolutely no problem. It's a very easy thing to do. You can use the newly added code. Nathanielhubert.R.zip
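
The code in the attached zip is not shown in this thread. As a rough, independent sketch of how per-group means, standard deviations, and a log2 fold change could be computed from an abundance table in base R, consider the following. The data are made up, and the use of relative abundance and the direction of the fold change (post over pre) are assumptions, not necessarily what `pathway_errorbar()` does internally.

```r
# Hypothetical abundance table: pathways in rows, samples in columns.
abundance <- matrix(
  c(10, 30, 40, 80,
    90, 70, 60, 20),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("ko00010", "ko00020"), c("S1", "S2", "S3", "S4"))
)

# Group label for each sample, in the same order as the columns.
group <- c("pre", "pre", "post", "post")

# Convert counts to relative abundance within each sample.
relative_abundance <- sweep(abundance, 2, colSums(abundance), "/")

# Hypothetical helper: apply a summary function per pathway and per group.
summarise_by_group <- function(mat, group, fun) {
  t(apply(mat, 1, function(x) tapply(x, group, fun)))
}

group_mean <- summarise_by_group(relative_abundance, group, mean)
group_sd <- summarise_by_group(relative_abundance, group, sd)

# Assumed direction: log2(mean in "post" / mean in "pre").
log2_fold_change <- log2(group_mean[, "post"] / group_mean[, "pre"])

print(group_mean)
print(group_sd)
print(log2_fold_change)
```

Each of these objects can be saved with `write.csv()`. If a group mean can be zero, a small pseudocount would be needed before taking the ratio.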

Nathanielhubert commented 1 year ago

If you have more time...

I got the PCA plot to work as well with:

pca_plot <- ggpicrust2::pathway_pca(kegg_abundance, metadata, "pre_post_intervention")

but the heatmap is not working for some reason with a very similar command:

heatmap_plot <- ggpicrust2::pathway_heatmap(kegg_abundance, metadata, "pre_post_intervention")

group <- "pre_post_intervention"
heatmap_plot <- ggpicrust2::pathway_heatmap(kegg_abundance, metadata, group)

Error in xtfrm.data.frame(x) : cannot xtfrm data frames

cafferychen777 commented 1 year ago

"xtfrm.data.frame(x) : cannot xtfrm data frames" is not a bug in the package. The problem is the data. You can refer to the following code to build your data.

library(ggpicrust2)

# Create example functional pathway abundance data
abundance_example <- matrix(rnorm(30), nrow = 10, ncol = 3)
rownames(abundance_example) <- paste0("Sample", 1:10)
colnames(abundance_example) <- c("PathwayA", "PathwayB", "PathwayC")

# Create example metadata
# Please ensure the sample IDs in the metadata have the column name "sample_name"
metadata_example <- data.frame(sample_name = rownames(abundance_example),
                               group = factor(rep(c("Control", "Treatment"), each = 5)))

# Create a heatmap (pathway_heatmap expects pathways in rows, hence t())
heatmap_plot <- pathway_heatmap(t(abundance_example), metadata_example, "group")
print(heatmap_plot)
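
Before calling the heatmap function, it may help to check the inputs against the layout of the example. The base R sketch below is a guess at what can go wrong, not a confirmed diagnosis from this thread: one plausible source of an xtfrm error is ordering by a one-column data frame (which is what single-bracket indexing returns on a tibble) instead of a plain vector. All data here are hypothetical.

```r
# Hypothetical abundance table: pathways in rows, samples in columns.
abundance <- matrix(
  1:12, nrow = 3,
  dimnames = list(c("PathwayA", "PathwayB", "PathwayC"), paste0("Sample", 1:4))
)

# Hypothetical metadata, deliberately in a different sample order.
metadata <- data.frame(
  sample_name = paste0("Sample", c(3, 1, 4, 2)),
  group = c("Treatment", "Control", "Treatment", "Control"),
  stringsAsFactors = FALSE
)
group <- "group"

# drop = FALSE mimics tibble behaviour: the result is still a data frame.
as_frame <- metadata[, group, drop = FALSE]
# Double brackets always return a plain vector, which order() can sort.
as_vector <- metadata[[group]]

# Every abundance column needs a matching sample_name in the metadata.
stopifnot(setequal(colnames(abundance), metadata$sample_name))

# Reorder the metadata rows to match the abundance columns.
metadata_aligned <- metadata[match(colnames(abundance), metadata$sample_name), ]

# Ordering samples by group works on the vector form.
sample_order <- order(metadata_aligned[[group]])
abundance_ordered <- abundance[, sample_order, drop = FALSE]

print(class(as_frame))
print(class(as_vector))
print(colnames(abundance_ordered))
```

If the metadata was read with `read_delim()`, converting it with `as.data.frame(metadata)` before the call is a cheap thing to try, and it matches the plain data frame used in the example.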

Nathanielhubert.R.zip

I also noticed a problem in the errorbar plot. Please use the latest version of ggpicrust2 (1.6.3) and the latest code.

Nathanielhubert commented 1 year ago

Thank you so much!