Open LLansing opened 11 months ago
@LLansing Thanks for your interest in the program. To be honest, I don't know the exact answer, but my guess is that KEGG has updated its rules for naming pathways. In the past, there were prefixes such as "ko", "map" and "rn", which refer to almost the same thing. In some cases, though, a pathway has a "map" entry but no "ko" entry (for example). I suspect that this discrepancy caused the 16 pathways to miss their KO members. I don't have a robust solution for this. If you are able to manually obtain the membership information from the KEGG website, you may be able to update the database files to include them.
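To illustrate the prefix issue, here is a minimal sketch (not part of Woltka or `kegg_query.py`) that treats "map", "ko" and "rn" IDs as the same pathway by comparing only the five-digit number. The function names `pathway_number` and `merge_by_number` are hypothetical, and the sketch assumes the membership information has already been collected as (pathway ID, KO ID) pairs.

```python
import re
from collections import defaultdict

# Assumption: a KEGG pathway ID is a lowercase prefix ("map", "ko", "rn", ...)
# followed by five digits, optionally preceded by "path:".
_PATHWAY_ID = re.compile(r"^(?:path:)?([a-z]+)(\d{5})$")


def pathway_number(pathway_id):
    """Return the five-digit number of a pathway ID, ignoring its prefix."""
    match = _PATHWAY_ID.match(pathway_id.strip())
    if match is None:
        raise ValueError(f"not a pathway ID: {pathway_id!r}")
    return match.group(2)


def merge_by_number(pairs):
    """Group KO members by pathway number from (pathway_id, ko_id) pairs."""
    members = defaultdict(set)
    for pathway_id, ko_id in pairs:
        members[pathway_number(pathway_id)].add(ko_id)
    return dict(members)
```

With this, a pathway listed only as "map" and one listed only as "ko" end up under the same key, which may help when adding manually collected members to the database files.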
I have generated KEGG KO annotation results (woltka classify) and pathway results (woltka collapse on KO results), and finally pathway coverage results (woltka coverage on KO results with KEGG `pathway-to-ko.txt` mapping file built from the `kegg_query.py` helper script).
These steps seemed to have worked, but upon comparing KEGG pathways ("maps"), there are a small number that were present in the pathway count results, but were not represented in the coverage results (OR in the `pathway-to-ko.txt` file).
All 16 of these discrepant pathways are categorized within KEGG's database as Global and overview maps, Drug resistance: antimicrobial, or Drug resistance: antineoplastic.
Do you know why these categories have been excluded from the `kegg_query.py` output and therefore the coverage results? I am not claiming there is a bug or problem, but I would like to know why to make sure there isn't anything I'm missing.
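For reference, the comparison described above can be sketched roughly as follows. This is only an illustration under assumptions: both the pathway count table and `pathway-to-ko.txt` are taken to be tab-separated with the pathway ID in the first column, and header or comment lines are taken to start with "#". The helper names are hypothetical.

```python
def read_first_column(lines):
    """Collect the first tab-separated field of each non-empty, non-comment line."""
    ids = set()
    for line in lines:
        line = line.rstrip("\n")
        # Assumption: header/comment lines start with "#".
        if not line or line.startswith("#"):
            continue
        ids.add(line.split("\t")[0])
    return ids


def missing_from_mapping(table_lines, mapping_lines):
    """Return pathway IDs found in the count table but absent from the mapping."""
    in_table = read_first_column(table_lines)
    in_mapping = read_first_column(mapping_lines)
    return sorted(in_table - in_mapping)
```

Both arguments can be open file handles, for example `missing_from_mapping(open("pathway_counts.tsv"), open("pathway-to-ko.txt"))`, where `pathway_counts.tsv` is a placeholder name for the collapsed pathway table.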