dib-lab / dib_rotation

Metagenomics DIB-lab rotation project
https://dib-lab.github.io/dib_rotation/
BSD 3-Clause "New" or "Revised" License

10 Annotating amino acid sequences #49

Open ccbaumler opened 2 years ago

ccbaumler commented 2 years ago

Hello,

Just above the challenge portion of section 10, "Annotating amino acid sequences", the instructions say to install KEGGDecoder. When I tried this, I received the following error:

note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> numpy

After a few different attempts, I found that KEGGDecoder's installation section suggests using python=3.6, so I changed the Python version in my kofamscan environment:

#changed the python version in kofamscan environment to 3.6
(kofamscan) baumlerc@bm14:~/2020_rotation_project/kofamscan$ python --version
Python 3.10.4
(kofamscan) baumlerc@bm14:~/2020_rotation_project/kofamscan$ conda install python=3.6

Best, Colton

ctb commented 2 years ago

suggest creating a new conda environment containing just python=3.6, and then doing the install in there!
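
A minimal sketch of that suggestion, assuming conda is already set up and a configured channel still provides Python 3.6. The environment name `keggdecoder` is an arbitrary choice, and the last line assumes KEGGDecoder is installed from PyPI under that name:

```shell
# Create a fresh environment that contains only Python 3.6.
# "keggdecoder" is an arbitrary environment name chosen for illustration.
conda create -n keggdecoder python=3.6

# Switch to the new environment and confirm the interpreter version.
conda activate keggdecoder
python --version

# Install KEGGDecoder with pip inside the new environment
# (assumes the PyPI package name is KEGGDecoder).
python -m pip install KEGGDecoder
```

Keeping KEGGDecoder in its own environment leaves the kofamscan environment on its original Python version.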

taylorreiter commented 2 years ago

we should probably update this to use eggnog mapper in hmm mode, and then pick a new viz

taylorreiter commented 2 years ago

Actually I changed my mind -- eggnog doesn't assign KEGG orthologs directly. Instead it assigns a COG term, and then I think uses DB joins to assign possible KOs. This can result in multiple KOs per gene, with no scoring info to determine which one is the best match and no domain information to tell which part of the protein the KO corresponds to. kofamscan assigns a single KEGG ortholog by default (the one with the best match), and can provide more detailed info on request. Having a single best match dramatically simplifies downstream analysis for biological interpretation.
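
To illustrate why a single best match is easier to work with, here is a rough sketch that reduces several KO hits per gene to the highest-scoring one. The three-column table (gene, KO, score) and its values are made up for illustration. It is not the actual output format of kofamscan or eggnog, and it only works because a score column exists:

```shell
# Hypothetical gene-to-KO hits: gene, KO, score (tab-separated).
# These values are invented for illustration only.
hits=$(printf 'gene1\tK00001\t120.5\ngene1\tK00002\t310.2\ngene2\tK00003\t88.0\n')

tab=$(printf '\t')

# Sort by gene, then by score in descending numeric order, and keep only
# the first line seen for each gene, i.e. its best-scoring KO.
best=$(printf '%s\n' "$hits" \
  | LC_ALL=C sort -t "$tab" -k1,1 -k3,3nr \
  | awk -F '\t' '!seen[$1]++')

printf '%s\n' "$best"
```

Without a score per hit, as with the KOs that come out of the DB joins, there is nothing to sort on, so this reduction step is not possible.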

we should still probably pick a different viz though, something more stable..shrug.