Hello again! Thanks for the quick response to my previous question. Now that I have the software working, I'm exploring how to interpret the MGS data coming out of our MicrobiomeHelper pipeline (https://github.com/LangilleLab/microbiome_helper/wiki), which we use for our own data and offer to clients of our core. If I can get KEGGCharter to work well, we might like to include it in our new MH ver2.0, which might be coming along in 2024. We are already developing a tool for visualizing the stratified output with interactive Sankey diagrams (JarrVis: https://github.com/dhwanidesai/JarrVis), and KEGGCharter could be a nice complement to that for the metabolic maps part, since we don't have a good visualizer for that now (plus we could write some scripts so that our pipeline data can "talk" between the two).
That being said, I have a few questions about how the quantifications are handled (PS: there also seem to be some legacy references to `--genomic-columns` where I think you mean `-qcol`). I checked the paper and your wiki here, but there are a few things I wanted to ask, and I thought it would be nice to have them here for other people to see. I initially started with my full data file as input (using only the first two samples to start), but I ran into an issue: the color scale in "MT" mode didn't seem to match the input RPKM values. So I made a small mock-up example file to test with and to ask the questions below:
...for this small test example, I've restricted the input to 3 of the EC numbers from the Nitrogen Metabolism pathway, which were in our original dataset and are of particular interest to us. After I run this through KC (`keggcharter -f TestInput-forKEGGCharter.txt -o KC_test_run -it 'MirallesMGS-CEMEX2018' --map-all -t 40 -ecc 'EC' -qcol 'N1,N2' -mm 00910`), I get the following output in the KEGGCharter_results.tsv:

| Function | N1 | N2 | Taxon (KEGGCharter) | KO (ec-column) | EC (ec-column) | KO (KEGGCharter) | EC number (KEGGCharter) |
| -- | -- | -- | -- | -- | -- | -- | -- |
| 1.7.1.4 | 4000 | 2000 | MirallesMGS-CEMEX2018 | K00361,K17877,K26138,K26139 | 1.7.1.4 | K00361,K17877,K26138,K26139 | 1.7.1.4 |
| 1.7.2.5 | 400 | 200 | MirallesMGS-CEMEX2018 | | | | 1.7.2.5 |
| 1.9.6.1 | 40 | 20 | MirallesMGS-CEMEX2018 | | | | 1.9.6.1 |
| | | | | K02567 | 1.9.6.1 | | |
| | | | | K04561 | 1.7.2.5 | | |
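
To make the pattern in this table easier to see, here is a minimal Python sketch that separates the rows carrying `N1`/`N2` values but no `EC (ec-column)` entry from the rows carrying an `EC (ec-column)` entry but no quantification. The column names and values are copied from the table above; the helper name `split_annotations` and the in-memory TSV string are my own assumptions for illustration (the real file would be the KEGGCharter_results.tsv written by the run), and the sketch only describes what the table shows, not how KEGGCharter builds it.

```python
import csv
import io

# Rows copied from the results table above and joined with tabs, as in a .tsv.
# Building the text in memory is an assumption for illustration; the real data
# would be read from the KEGGCharter_results.tsv produced by the run.
HEADER = ["Function", "N1", "N2", "Taxon (KEGGCharter)", "KO (ec-column)",
          "EC (ec-column)", "KO (KEGGCharter)", "EC number (KEGGCharter)"]
TAXON = "MirallesMGS-CEMEX2018"
KOS = "K00361,K17877,K26138,K26139"
ROWS = [
    ["1.7.1.4", "4000", "2000", TAXON, KOS, "1.7.1.4", KOS, "1.7.1.4"],
    ["1.7.2.5", "400", "200", TAXON, "", "", "", "1.7.2.5"],
    ["1.9.6.1", "40", "20", TAXON, "", "", "", "1.9.6.1"],
    ["", "", "", "", "K02567", "1.9.6.1", "", ""],
    ["", "", "", "", "K04561", "1.7.2.5", "", ""],
]
RESULTS_TSV = "\n".join("\t".join(cells) for cells in [HEADER] + ROWS)


def split_annotations(tsv_text, quant_columns):
    """Hypothetical helper: find rows where quantification and the
    ec-column annotation ended up on different rows.

    Returns (missing, orphans):
      missing - Function values of rows with quantification but no EC (ec-column)
      orphans - EC (ec-column) values of rows with no quantification
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    missing, orphans = [], []
    for row in reader:
        has_quant = any((row[col] or "").strip() for col in quant_columns)
        ec_from_column = (row["EC (ec-column)"] or "").strip()
        if has_quant and not ec_from_column:
            missing.append(row["Function"])
        elif ec_from_column and not has_quant:
            orphans.append(ec_from_column)
    return missing, orphans


missing, orphans = split_annotations(RESULTS_TSV, ["N1", "N2"])
print("Quantified rows without an ec-column annotation:", missing)
print("Annotation-only rows without quantification:", orphans)
```

On the table above, both lists contain the same two EC numbers (1.7.2.5 and 1.9.6.1), while 1.7.1.4 appears in neither because its quantification and annotation sit on the same row.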