cafferychen777 / ggpicrust2_paper

https://cafferychen777.github.io/ggpicrust2_paper/

problem "Error in metadata_mat[, group] : subscript out of bounds" when running ggpicrust2 on the example dataset #3

Open jflot opened 3 months ago

jflot commented 3 months ago

Hello, I have just installed ggpicrust2 and am trying to test it using the provided example dataset, but I run into an error:


```
> library(readr)
> library(ggpicrust2)
> library(tibble)
> library(tidyverse)
> library(ggprism)
> library(patchwork)
> # Load necessary data: abundance data and metadata
> abundance_file <- "~/Downloads/pred_metagenome_unstrat.tsv"
> metadata <- read_delim(
+   "~/Downloads/metadata.txt",
+   delim = "\t",
+   escape_double = FALSE,
+   trim_ws = TRUE
+ )
Rows: 50 Columns: 31
── Column specification ────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Delimiter: "\t"
chr  (27): sample_name, Environment, Group, Assay Type, bacterial_metagenome_source, batch_id, BioProject, BioSample, Center Name, C...
dbl   (3): AvgSpotLen, Bases, Bytes
dttm  (1): ReleaseDate

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
> # Run ggpicrust2 with input file path
> results_file_input <- ggpicrust2(file = abundance_file,
+                                  metadata = metadata,
+                                  group = "your_group_column", # For example dataset, group = "Environment"
+                                  pathway = "KO",
+                                  daa_method = "LinDA",
+                                  ko_to_kegg = TRUE,
+                                  order = "pathway_class",
+                                  p_values_bar = TRUE,
+                                  x_lab = "pathway_name")
Starting the ggpicrust2 analysis...

Converting KO to KEGG...

Loading data from file...
Rows: 4952 Columns: 51
── Column specification ────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Delimiter: "\t"
chr  (1): #NAME
dbl (50): SRR11393747, SRR11393768, SRR11393775, SRR11393761, SRR11393755, SRR11393771, SRR11393751, SRR11393765, SRR11393778, SRR11...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Loading KEGG reference data. This might take a while...
Performing KO to KEGG conversion. Please be patient, this might take a while...
  |==============================================================================================================================| 100%
KO to KEGG conversion completed. Time elapsed: 1.48 seconds.
Removing KEGG pathways with zero abundance across all samples...
KEGG abundance calculation completed successfully.
Performing pathway differential abundance analysis...

Sample names extracted.
Identifying matching columns in metadata...
Matching columns identified: sample_name . This is important for ensuring data consistency.
Using all columns in abundance.
Converting abundance to a matrix...
Reordering metadata...
Converting metadata to a matrix and data frame...
Extracting group information...
Error in metadata_mat[, group] : subscript out of bounds
```

Any idea what could cause this problem?
Thanks a lot in advance!
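
A possible lead, judging only from the transcript above: the call passes the placeholder `group = "your_group_column"`, while the columns that `read_delim` reports include `sample_name`, `Environment` and `Group` but nothing by that name, and the inline comment itself suggests `group = "Environment"` for the example dataset. The sketch below uses a small made-up data frame (not the real metadata) to show how indexing a matrix by a missing column name produces this base R error. The helper `check_group_column()` is hypothetical and not part of ggpicrust2; it only illustrates a check that could be run before calling the package.

```r
# Toy stand-in for the metadata; only the column names matter here.
metadata <- data.frame(
  sample_name = c("S1", "S2", "S3", "S4"),
  Environment = c("A", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# Indexing a matrix by a column name it does not have raises a base R error.
metadata_mat <- as.matrix(metadata)
err <- tryCatch(
  metadata_mat[, "your_group_column"],
  error = function(e) conditionMessage(e)
)
print(err)

# Hypothetical helper (an assumption, not a ggpicrust2 function):
# stop early with a readable message if the grouping column is absent.
check_group_column <- function(metadata, group) {
  if (!group %in% colnames(metadata)) {
    stop(
      "Column '", group, "' not found in metadata. Available columns: ",
      paste(colnames(metadata), collapse = ", "),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# With a column that exists, the check passes and the groups can be inspected.
group <- "Environment"
check_group_column(metadata, group)
print(table(metadata[[group]]))
```

If this is indeed the cause, replacing the placeholder with a real column name such as `"Environment"` in the `ggpicrust2()` call should get past the "Extracting group information..." step; whether the rest of the pipeline then runs cleanly is not something the transcript can confirm.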