cafferychen777 / ggpicrust2

Make Picrust2 Output Analysis and Visualization Easier
https://cafferychen777.github.io/ggpicrust2/
MIT License

Questions about KEGG pathways #28

Open andressamv opened 1 year ago

andressamv commented 1 year ago

Hi again!

I have two questions about converting KOs to KEGG pathways.

Regardless of whether I am analyzing KOs or KEGG pathways, I am using "pred_metagenome_unstrat.tsv" as my input (please let me know if this is correct). I now want to extract the description of each KO/KEGG pathway together with its abundance. This was easy to do for the KOs, but I am not sure how to do it for the pathways. Can you please help me with that?

I am also struggling with pathway_annotation in a different project (pathway_annotation(pathway = 'KO', daa_results_df = daa_results_kegg_DNA2_1, ko_to_kegg = TRUE)). Because of its size, I split my dataframe into three parts (df1, df2, and df3) before calling pathway_annotation. Somehow, df2 and df3 are annotated promptly but df1 is not. I even left my computer running overnight and the call never finished. I have no idea why this is happening, since the dataframes are quite similar. Do you have any thoughts on this?

Thanks, Andressa

cafferychen777 commented 1 year ago

Hi Andressa,

Regarding your questions:

  1. Using "pred_metagenome_unstrat.tsv" as your input is correct. For pathway-level analysis, however, first convert the KO abundance to KEGG pathway abundance with the ko2kegg_abundance function. Then, when calling functions such as pathway_annotation and pathway_errorbar, set ko_to_kegg = TRUE so that they work with the KO-to-KEGG converted data.

  2. Getting the description of every KEGG pathway in the kegg_abundance dataset alongside its abundance is difficult because of restrictions imposed by the KEGG database. To stay within those restrictions, the code retrieves annotations only for pathways with p-values below 0.05. As a result, it may not be feasible to obtain descriptions and abundances for the complete set of KEGG pathways in one step. If you need the full set, you could contact KEGG to ask about access to the complete pathway information.

  3. Regarding the issue with pathway_annotation, it's possible that requests from your computer or network are being restricted by the KEGG database. Try a different computer or network to see whether that resolves the problem. Alternatively, split df1 further to identify the specific pathway causing the issue. Any pathways that block the annotation can then be looked up and annotated manually in the KEGG database.
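
The "split df1 further" suggestion in point 3 could be sketched roughly as below. This is only an illustration in base R, not part of ggpicrust2: annotate_stub is a hypothetical stand-in for the real pathway_annotation call, the feature column name and the toy data are assumptions, and "ko00000" is a made-up ID used to simulate a failing row. The progress message printed before each chunk is what helps with a call that hangs rather than errors, since the last chunk announced is the one to split further.

```r
# Hypothetical stand-in for the real call. In practice one might pass
#   function(d) pathway_annotation(pathway = "KO", daa_results_df = d,
#                                  ko_to_kegg = TRUE)
# Here it simply fails on a made-up ID so the sketch runs offline.
annotate_stub <- function(chunk) {
  if (any(chunk$feature == "ko00000")) stop("simulated annotation failure")
  chunk$annotated <- TRUE
  chunk
}

# Split a data frame into consecutive chunks of at most `size` rows.
split_chunks <- function(df, size) {
  groups <- ceiling(seq_len(nrow(df)) / size)
  split(df, groups)
}

# Apply `fn` to each chunk, collecting failing chunks instead of stopping.
annotate_in_chunks <- function(df, size, fn) {
  chunks <- split_chunks(df, size)
  ok <- list()
  failed <- list()
  for (name in names(chunks)) {
    message("Annotating chunk ", name, " (", nrow(chunks[[name]]), " rows)")
    res <- tryCatch(fn(chunks[[name]]), error = function(e) NULL)
    if (is.null(res)) {
      failed[[name]] <- chunks[[name]]
    } else {
      ok[[name]] <- res
    }
  }
  list(annotated = do.call(rbind, ok), failed = do.call(rbind, failed))
}

# Toy differential-abundance results (assumed column names).
daa_toy <- data.frame(
  feature = c("ko00010", "ko00020", "ko00000", "ko00030", "ko00040"),
  p_adjust = c(0.01, 0.02, 0.03, 0.04, 0.01),
  stringsAsFactors = FALSE
)

out <- annotate_in_chunks(daa_toy, size = 2, fn = annotate_stub)
out$failed$feature  # rows from the chunk that could not be annotated
```

Re-running annotate_in_chunks on out$failed with size = 1 would narrow the problem down to individual rows.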

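For the original question about getting descriptions and abundances side by side, one possible approach is to join the annotated results back onto the abundance table by pathway ID. The sketch below uses base R and toy data only. It assumes the abundance table has pathway IDs as row names and that the annotated results carry feature and pathway_name columns, so check these against your own objects. Pathways that were not annotated simply keep NA as their description.

```r
# Toy pathway abundance table: pathway IDs as row names, one column per sample
# (assumed layout; values are invented for illustration).
kegg_abundance_toy <- data.frame(
  sample1 = c(120, 30, 55),
  sample2 = c(98, 41, 60),
  row.names = c("ko00010", "ko00020", "ko00030")
)

# Toy annotated results covering only some pathways
# (assumed column names `feature` and `pathway_name`).
annotated_toy <- data.frame(
  feature = c("ko00010", "ko00030"),
  pathway_name = c("Glycolysis / Gluconeogenesis", "Pentose phosphate pathway"),
  stringsAsFactors = FALSE
)

# Move the row names into a regular column so both tables share a key.
abundance_keyed <- data.frame(
  feature = rownames(kegg_abundance_toy),
  kegg_abundance_toy,
  row.names = NULL,
  stringsAsFactors = FALSE
)

# Keep every pathway from the abundance table; unannotated ones get NA names.
merged <- merge(annotated_toy, abundance_keyed, by = "feature", all.y = TRUE)
merged
```

Using all.y = TRUE keeps the full abundance table, while dropping it would leave only the pathways that received a description.
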
I hope this helps! Let me know if you have any further questions or concerns.

Best regards, Chen YANG