cafferychen777 / ggpicrust2

Make Picrust2 Output Analysis and Visualization Easier
https://cafferychen777.github.io/ggpicrust2/

'pathway_daa' function - Undefined columns selected #68

Open gonzalofe opened 7 months ago

gonzalofe commented 7 months ago

Hello, I'm new to picrust2 output analysis and I'm getting an error when running the pathway_daa function with the EC pathway workflow. This is the code I ran:

# Workflow for MetaCyc Pathway and EC

# Load MetaCyc pathway abundance and metadata
# Load metacyc or ec abundance as data frame
EC_abundance <- read.delim("C:/Users/Gonzalo/OneDrive/Documentos/Uni/PIC/pmipprueba 2/pmippruebaggpicrust2/pred_metagenome_unstrat.tsv")

# Load metadata as a tibble
data(metadata)
metadata <- read_delim("C:/Users/Gonzalo/OneDrive/Documentos/Uni/PIC/pmipprueba 2/pmippruebaggpicrust2/pmipprueba_metadata.tsv", delim = "\t", escape_double = FALSE, trim_ws = TRUE)

# Perform pathway DAA using LinDA method
# Please change column_to_rownames() to the feature column if you are not using example dataset
# Please change group to "your_group_column" if you are not using example dataset
metacyc_daa_results_df <- pathway_daa(abundance = EC_abundance %>% column_to_rownames("function."), metadata = metadata, group = "sample_id", daa_method = "LinDA")

And I'm getting this error:

Error in `[.data.frame`(LinDA_metadata_df, , matching_columns) : undefined columns selected

Also, when I select another metadata group, this shows:

metacyc_daa_results_df <- pathway_daa(abundance = EC_abundance %>% column_to_rownames("function."), metadata = metadata, group = "age", daa_method = "LinDA")
Sample names extracted.
Identifying matching columns in metadata...
Matching columns identified: sample_id . This is important for ensuring data consistency.
Using all columns in abundance.
Converting abundance to a matrix...
Reordering metadata...
Converting metadata to a matrix and data frame...
Extracting group information...
Running LinDA analysis...
Performing LinDA analysis...
0 features are filtered!
The filtered data has 2 samples and 1688 features will be tested!
Error in if (any(corr.pval <= corr.cut)) { : missing value where TRUE/FALSE needed
In addition: Warning message:
In MicrobiomeStat::linda(abundance, LinDA_metadata_df, formula = "~Group_groupnonsense", :
  Some features have less than 3 nonzero values! They have virtually no statistical power. You may consider filtering them in the analysis!

cafferychen777 commented 7 months ago

Hello @gonzalofe,

Thank you for reaching out with your query regarding picrust2 output analysis. From your message, it appears that you might be working with only two samples in your dataset. It is important to note that Differential Abundance (DA) analysis typically requires a larger number of samples to produce meaningful and statistically significant results.

With just two samples, the statistical power is greatly limited, and most DA analysis methods, including the LinDA method you are using, may not function as intended or produce reliable outcomes. This limitation is likely the reason behind the errors you are encountering.
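The warning in the log above suggests filtering features with fewer than 3 nonzero values. A minimal base R sketch of that filter follows. The toy table and its names are made up for illustration, and the cutoff of 3 is taken from the warning text rather than from any ggpicrust2 setting. Note that with only two samples no feature can reach 3 nonzero values, so this filter would leave nothing to test, which fits the point about sample size.

```r
# Hypothetical toy abundance table: rows are features, columns are samples.
abundance <- data.frame(
  s1 = c(5, 0, 2, 0),
  s2 = c(3, 0, 4, 1),
  s3 = c(8, 1, 6, 0),
  s4 = c(2, 0, 9, 0),
  row.names = c("feat_1", "feat_2", "feat_3", "feat_4")
)

# Count how many samples have a nonzero value for each feature.
nonzero_counts <- rowSums(abundance > 0)

# Keep features with at least 3 nonzero values (cutoff assumed from the warning).
filtered <- abundance[nonzero_counts >= 3, , drop = FALSE]

print(rownames(filtered))
```

The filtered table could then be passed on in place of the full one, assuming the remaining features are still of interest.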

To proceed effectively, you would need a larger dataset with more samples. If you are limited to only these two samples, you might need to reconsider the type of analysis you can perform, as DA analysis may not be feasible in this scenario.
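Before running a DA method, it may help to check how many samples fall into each group. The sketch below does this in base R on a made-up two-sample table that mimics the situation in this issue. The column names `sample_id` and `age` follow the thread, while the data values and the threshold of 3 samples per group are illustrative assumptions, not a rule from ggpicrust2 or LinDA.

```r
# Hypothetical toy data standing in for the real files.
abundance <- data.frame(
  sample_A = c(10, 0, 5),
  sample_B = c(3, 7, 0),
  row.names = c("EC:1.1.1.1", "EC:1.1.1.2", "EC:1.1.1.3")
)
metadata <- data.frame(
  sample_id = c("sample_A", "sample_B"),
  age = c("young", "old"),
  stringsAsFactors = FALSE
)

# Samples present in both the abundance columns and the metadata.
shared_samples <- intersect(colnames(abundance), metadata$sample_id)

# Number of shared samples in each level of the grouping column.
group_sizes <- table(metadata$age[metadata$sample_id %in% shared_samples])

# Assumed threshold, chosen only for illustration.
min_per_group <- 3
enough_samples <- length(group_sizes) >= 2 && all(group_sizes >= min_per_group)

print(group_sizes)
if (!enough_samples) {
  message("Too few samples per group for differential abundance testing.")
}
```

The `intersect()` step is also a quick way to spot name mismatches: `read.delim()` with its default `check.names = TRUE` rewrites column names that are not syntactically valid R names, so sample IDs in the abundance table may no longer match those in the metadata.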

Please let me know if you need further assistance or have any more questions.

Best regards,

Chen YANG

gonzalofe commented 7 months ago

Thank you for your feedback.
