cafferychen777 / ggpicrust2

Make Picrust2 Output Analysis and Visualization Easier
https://cafferychen777.github.io/ggpicrust2/
MIT License

ko2kegg_abundance(): KO2Kegg abundance returning empty output. #33

Closed: lcamVz closed this issue 11 months ago

lcamVz commented 1 year ago

I am running the following:

```
kegg_abundance <- ko2kegg_abundance("Dropbox/CSIRO_Picrustrun/picrust2_out_pipeline_CSIRO2/KO_predicted.tsv")
Rows: 7130 Columns: 10544
── Column specification ───────────────────────────────────────────────────────────────────────────────────────
Delimiter: "\t"
chr (1): sequence
dbl (10543): K00001, K00002, K00003, K00004, K00005, K00006, K00007, K00008, K00009, K00010, K00011, K00012...

ℹ Use spec() to retrieve the full column specification for this data.
ℹ Specify the column types or set show_col_types = FALSE to quiet this message.
Calculation may take a long time, please be patient.
The kegg pathway with zero abundance in all the different samples has been removed.

# Perform pathway differential abundance analysis (DAA) using ALDEx2 method
# Please change group to "your_group_column" if you are not using example dataset
daa_results_df2 <- pathway_daa(abundance = kegg_abundance,
                               metadata = metdata.tab,
                               group = "Treatment",
                               daa_method = "ALDEx2",
                               select = NULL,
                               reference = NULL)
Error in metadata[, matching_columns]:
! Can't subset columns with matching_columns.
✖ Subscript matching_columns can't contain missing values.
✖ It has a missing value at location 1.
```

I have run metacyc analysis no problem but wanted to see why this could be?

cafferychen777 commented 1 year ago

Dear Camuy,

I apologize for the issue you encountered while running the code. According to the error message you provided, the problem lies in determining `matching_columns`, which is used to identify the sample name column in the metadata. I suggest comparing `colnames(kegg_abundance)` with the metadata column that holds the sample names to check that they match.

To resolve this issue, please follow these steps:

  1. First, ensure that the metadata contains the correct sample name column. You can use `colnames(kegg_abundance)` and `colnames(metadata)` to check their column names.

  2. Make sure that the column names in kegg_abundance exactly match the column in the metadata that holds the sample names. Check for any spelling errors or extra spaces.

  3. If the column names match correctly, also ensure that the sample name column in metadata does not contain any missing values. Missing values can cause issues when subsetting columns.
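The three checks above can be sketched in base R as follows. This is only an illustration under stated assumptions: the toy `kegg_abundance` and `metadata` objects, the sample IDs, and the column name `sample_name` are hypothetical stand-ins, not taken from the original data. Note also that the readr output in the original post lists a `sequence` column followed by thousands of KO columns, so it may be worth confirming first that the columns of `kegg_abundance` are sample IDs at all.

```r
# Toy stand-ins for the real objects (hypothetical sample IDs and values).
kegg_abundance <- data.frame(
  S1 = c(10, 0), S2 = c(3, 7), S3 = c(5, 1),
  row.names = c("ko00010", "ko00020")
)
metadata <- data.frame(
  sample_name = c("S1", "S2 ", NA),  # trailing space and NA added on purpose
  Treatment   = c("A", "B", "A"),
  stringsAsFactors = FALSE
)

abundance_ids <- colnames(kegg_abundance)

# Step 1: for each metadata column, count how many abundance IDs it contains.
# A column with zero hits cannot serve as the sample name column.
id_hits <- vapply(
  metadata,
  function(col) sum(abundance_ids %in% as.character(col)),
  integer(1)
)
print(id_hits)

# Assumption: "sample_name" is the column meant to hold the sample IDs.
sample_col <- "sample_name"

# Step 2: abundance IDs with no exact match in the metadata.
unmatched <- setdiff(abundance_ids, metadata[[sample_col]])
print(unmatched)

# IDs that match once surrounding whitespace is trimmed point to stray spaces.
whitespace_fixed <- intersect(abundance_ids, trimws(metadata[[sample_col]]))
print(whitespace_fixed)

# Step 3: count missing values in the sample name column.
missing_ids <- sum(is.na(metadata[[sample_col]]))
print(missing_ids)
```

With real data, replace the toy objects with your own and set `sample_col` to whichever metadata column actually stores the sample IDs.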

If you continue to experience problems, please provide more information about kegg_abundance and metadata so that I can assist you further.

Thank you!


lcamVz commented 1 year ago

Perfect! I'll check that out.

cafferychen777 commented 1 year ago

Hello @lcamVz ,

Thank you for your input. It's indeed possible that the issue stems from a lack of one-to-one correspondence between `colnames(kegg_abundance)` and the sample names in the metadata. Please double-check this to ensure proper matching.

I apologize for any confusion caused, but you're right that this issue is not directly related to the ggpicrust2 package itself. It's more likely a data mismatch or alignment problem between your abundance data and metadata.

Please review the column names in the kegg_abundance object and compare them with the sample names in the metadata. Ensure that they align correctly and correspond to each other accurately. It's crucial that the sample names in the metadata match the column names in the kegg_abundance object precisely.

If you find any discrepancies or misalignments, please make the necessary adjustments to ensure the proper alignment between the two. This should help resolve the error you encountered.
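One way to make such adjustments is sketched below in base R. It is a minimal example under assumptions, not the package's own procedure: the toy objects, the sample IDs, and the `sample_name` column are hypothetical. Whether ggpicrust2 requires the two objects to be in identical order is not stated in this thread; ordering them the same way is simply a conservative step.

```r
# Toy stand-ins (hypothetical): abundance columns in a different order than
# the metadata rows, one stray leading space, one sample with no abundance data.
kegg_abundance <- data.frame(
  S3 = c(5, 1), S1 = c(10, 0), S2 = c(3, 7),
  row.names = c("ko00010", "ko00020")
)
metadata <- data.frame(
  sample_name = c(" S1", "S2", "S3", "S4"),
  Treatment   = c("A", "B", "A", "B"),
  stringsAsFactors = FALSE
)

# Remove stray whitespace from the sample IDs.
metadata$sample_name <- trimws(metadata$sample_name)

# Keep only samples present in both objects, in abundance column order.
shared <- intersect(colnames(kegg_abundance), metadata$sample_name)
kegg_aligned     <- kegg_abundance[, shared, drop = FALSE]
metadata_aligned <- metadata[match(shared, metadata$sample_name), , drop = FALSE]

# Sanity check: both objects now list the same samples in the same order.
stopifnot(identical(colnames(kegg_aligned), metadata_aligned$sample_name))
```

The aligned objects can then be passed on in place of the originals; samples dropped by `intersect()` are worth inspecting by hand, since they usually reveal the naming discrepancy.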

If you have any further questions or need additional assistance, please let me know.