Closed bradfordcondon closed 6 years ago
Hi Bradford,
Here is how I dealt with this issue. Not perfect, but I just wanted to get the data into my site.
It has been quite a while, but I put together a way to get a hierarchy-like file. The big difference is that there is no HTML in the files I download, so the pattern matching in the module's upload script also needs to be modified to make it work. Here is what I have.
Here is the shell script that I use to get the hierarchy-like file: https://github.com/srobb1/tripal_analysis_kegg/blob/parseNewResultFormat/includes/getData.KEGGKASS.sh
Here is my modified upload script with my changes highlighted: https://github.com/srobb1/tripal_analysis_kegg/commit/0989e67e0b0cb8eedce08c40222753d12519d27a#diff-bb2b21ef7774df8687ff02b0284505c6
I hope this helps, Sofia
Hi Sofia, thanks for the interesting fix. So your script takes the KAAS job ID and an email as input and generates a pseudo-hierarchy?
Presumably some tool does generate the hierarchy files, though. Which one? Maybe it isn't available anymore?
The KAAS web tool used to generate the hierarchy files, but after some update they stopped. Are there any standalone tools that give the same data? I know that IPRSCAN will include KEGG and Pathway IDs in their output, but these are not added into chado when the IPRSCAN results are parsed and loaded by the interproscan tripal module.
Oh, and yes to your question about job ID and email.
And I think you are being quite liberal in calling this a 'fix' :) It is more of a workaround.
Yes, @srobb1 is right. The KAAS server stopped providing the KAAS hierarchy file. I think this is because KEGG has transitioned to a primarily paid service. I suspect the hierarchy files gave information that they wanted to keep for their paid service... just a guess on my part. We definitely need to find a new workaround for the KEGG module. Enough time has passed that I don't think anyone has any more hierarchy files to upload, other than perhaps those created in the way @srobb1 mentions.
If someone has a hierarchy file lying around, it would be great if they could share it so we know what we're comparing to.
I imagine we could reverse engineer the diagram from the KEGG term mappings (which we have). I don't imagine that it would be easy.
Maybe the best solution would be to modify the loader to take a user email and job ID as input to load in the information via Sofia's script?
Oh, and the KEGG/KAAS submission ID too... Yeah, that sounds like a reasonable approach to me. Our Galaxy module uses the PHP curl library, so it wouldn't be unprecedented to include curl code in a Tripal module.
And to answer your question about the availability of a hierarchy file: we have one in our Tripal v2 User's Guide:
If you just want a mapping of KEGG terms to gene IDs, KEGG terms could be pulled from InterProScan output. They are now included in the report if --pathways is selected. This would be a nice addition to the terms that are currently pulled from the report. That said, I do like the tree that is produced by the KEGG module.
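As a rough illustration of pulling KEGG entries out of an InterProScan report, here is a minimal Python sketch. It assumes the tab-separated output format, that the pathways annotation sits in the 15th column, and that entries look like `KEGG: 00010+1.1.1.1` separated by `|`; the function name `kegg_pathways_from_ips_tsv` is hypothetical. Note that these would be pathway/EC style identifiers rather than KO terms, so check against your own report before relying on it.

```python
import csv
from collections import defaultdict


def kegg_pathways_from_ips_tsv(handle):
    """Map each protein accession to the set of KEGG entries found in
    the pathways column (ASSUMED to be column 15) of an InterProScan TSV."""
    mapping = defaultdict(set)
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 15:
            continue  # this line has no pathways column
        for entry in row[14].split("|"):
            # ASSUMED entry layout: "<database>: <identifier>"
            database, _, identifier = entry.partition(":")
            if database.strip().upper() == "KEGG" and identifier.strip():
                mapping[row[0]].add(identifier.strip())
    return dict(mapping)
```

Usage would be along the lines of `with open("proteins.tsv") as fh: hits = kegg_pathways_from_ips_tsv(fh)`, where the file name is a placeholder.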
Hi all,
So, working with @mestato, I'm going to propose a more streamlined way of doing this. Rather than download and scan the hierarchy every single time, why not store the hierarchy as a CV? If we do this, and we map our features to the hierarchy, then we can use the existing or upcoming cv_browser to display all this. Unfortunately the BRITE hierarchy isn't available as an OBO. BUT, it is available in full as JSON here. So, we'd have to import it as a CV somehow (should be fairly easy, right?)
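To make the "import it as a CV" idea a bit more concrete, here is a minimal Python sketch of flattening a nested JSON tree into (term, parent, depth) rows, which is roughly the shape a CV loader would need. It assumes the BRITE JSON is a tree of objects with a `name` string and an optional `children` list; the sample data and the function name `flatten_brite` are illustrative only, and no Chado or Tripal calls are shown.

```python
import json


def flatten_brite(node, parent=None, depth=0):
    """Yield (name, parent_name, depth) for every node of a nested tree.

    ASSUMES each node is a dict with a "name" key and an optional
    "children" list, which may not match the real BRITE JSON exactly.
    """
    name = node["name"]
    yield name, parent, depth
    for child in node.get("children", []):
        yield from flatten_brite(child, name, depth + 1)


# Tiny hand-written sample in the assumed layout (not downloaded data).
sample = json.loads("""
{"name": "ko00001", "children": [
  {"name": "09100 Metabolism", "children": [
    {"name": "00010 Glycolysis / Gluconeogenesis", "children": [
      {"name": "K00844 HK; hexokinase"}
    ]}
  ]}
]}
""")

rows = list(flatten_brite(sample))
for name, parent, depth in rows:
    print("  " * depth + name)
```

One caveat with this approach: the same KO term can sit under several pathways, so a real loader would probably need to key relationships on (term, parent) pairs rather than on the term name alone.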
We would then also need the KO to BRITE mappings. Those are also available, for example:
So the module would then look like this:
Now we can browse the KO or BRITE terms, and the features associated, using the CV browser.
What do you all think? Am I missing something important?
I've been using GhostKOALA instead of KAAS. I'm not sure what the relative merits are. But it sounds like this solution will work no matter what you use (KAAS/GhostKOALA/IPS), as long as you get sequence-to-KO term mappings. Which is nice.
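For reference, a tool-agnostic loader only needs the sequence-to-KO pairs. The Python sketch below assumes a plain two-column, tab-separated file (sequence ID, then KO term) in which unannotated sequences appear with an empty or missing second column; that layout and the function name `read_ko_assignments` are assumptions, so adjust to whatever your tool actually writes.

```python
from collections import defaultdict


def read_ko_assignments(lines):
    """Collect KO terms per sequence ID from tab-separated lines.

    ASSUMED layout: "<sequence_id>\\t<KO term>", with the KO column
    empty or absent for sequences that received no assignment.
    """
    mapping = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2 or not fields[1].strip():
            continue  # sequence without a KO assignment
        gene, ko = fields[0].strip(), fields[1].strip()
        if ko not in mapping[gene]:
            mapping[gene].append(ko)  # keep first-seen order, drop repeats
    return dict(mapping)
```

Any iterable of lines works, so an open file handle can be passed in directly.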
Basically, the Tripal 3 loader doesn't support the hierarchy files anymore: we instead load the KEGG ontology as an OBO, annotate features, and support mapping records across the ontology with TRIPAL_CV_XRAY. CV_xray is available here: https://github.com/statonlab/tripal_cv_xray
I think our new implementation is just stronger across the board; hopefully users will agree. Closing.
Hi, would it be possible to include an example upload file and/or some basic instructions for generating the hierarchy file? I've run KEGG's BlastKOALA and KAAS and I can't find a way to download something resembling the hierarchy file anywhere on the KEGG site.
Thanks!