lh3 / miniprot

Align proteins to genomes with splicing and frameshift
https://lh3.github.io/miniprot/
MIT License

Different translation tables #57

Closed giacomomutti closed 4 months ago

giacomomutti commented 4 months ago

Hello! I was wondering whether miniprot could support non-standard genetic codes when translating the reference genome. This would be extremely useful for annotating genes in certain lineages.

Thank you very much for the amazing tool!

lh3 commented 4 months ago

The GitHub HEAD now supports NCBI translation tables 1 through 5. If you need rarer tables, let me know.

giacomomutti commented 4 months ago

Thank you very much for the quick implementation!

I would need rarer tables, as I am studying ciliate genomes (for example, tables 27, 29 and 30). I converted all NCBI genetic code tables into the format used in nasw-tab.c (AAA, AAC, ... order) in case it is useful (genetic_tables.txt). However, some table numbers do not exist (such as 7, 8 and 17 to 20). I guess this would need to be handled in the code somehow, but I don't know C well enough to make a pull request. I hope this is useful, and thanks again!
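For readers who want to reproduce the conversion described above, here is a minimal sketch in C. It assumes the input is the 64-character amino-acid string as NCBI publishes it (codons enumerated with bases in T, C, A, G order, first base varying slowest) and that "AAA, AAC, ... order" means the same enumeration with bases in A, C, G, T order. The function name `ncbi_to_acgt_order` is hypothetical and is not part of miniprot.

```c
#include <stddef.h>

/*
 * Reorder a 64-character NCBI amino-acid string (codons in TCAG order)
 * into ACGT order (AAA, AAC, AAG, AAT, ACA, ...).
 * `out` must have room for 65 bytes (64 residues plus the terminator).
 * Hypothetical helper for illustration only.
 */
void ncbi_to_acgt_order(const char *ncbi, char *out)
{
    /* Position of A, C, G, T (in that order) within NCBI's T, C, A, G ordering. */
    static const int acgt_to_tcag[4] = { 2, 1, 3, 0 };
    int a, b, c;

    for (a = 0; a < 4; ++a)
        for (b = 0; b < 4; ++b)
            for (c = 0; c < 4; ++c)
                out[16 * a + 4 * b + c] =
                    ncbi[16 * acgt_to_tcag[a] + 4 * acgt_to_tcag[b] + acgt_to_tcag[c]];
    out[64] = '\0';
}
```

Applied to the standard code (NCBI table 1), the first four output characters are the translations of AAA, AAC, AAG and AAT, that is, K, N, K and N.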

lh3 commented 4 months ago

Added. Thank you so much!

giacomomutti commented 4 months ago

Wow, thank you so much again for the speed. However, there is one slight issue. Running with -T 6 prints "[ERROR] failed to find translation table 6", although that table should be there. The same happens with -T 33. I guess there is an indexing problem (it seems that to use table 6 you need to pass -T 5). Indeed, -T 8 (which should return an error) works, I guess using table 9.
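The symptom described above is what one would expect when a table number is used as an array position even though NCBI table numbers are not contiguous. The following is a minimal sketch of looking a table up by its NCBI number instead. It is not miniprot's actual code: the struct, the array and the function `find_codon_table` are hypothetical, and only three tables (1, 4 and 6, in AAA, AAC, ... order) are included for illustration.

```c
#include <stddef.h>

/* One entry per supported table; `id` is the NCBI table number. */
typedef struct {
    int id;
    const char *aa; /* 64 residues, codons in ACGT order */
} codon_table_t;

/* Illustrative subset only; the ids are deliberately non-contiguous. */
static const codon_table_t codon_tables[] = {
    { 1, "KNKNTTTTRSRSIIMIQHQHPPPPRRRRLLLLEDEDAAAAGGGGVVVV*Y*YSSSS*CWCLFLF" },
    { 4, "KNKNTTTTRSRSIIMIQHQHPPPPRRRRLLLLEDEDAAAAGGGGVVVV*Y*YSSSSWCWCLFLF" },
    { 6, "KNKNTTTTRSRSIIMIQHQHPPPPRRRRLLLLEDEDAAAAGGGGVVVVQYQYSSSS*CWCLFLF" },
};

/*
 * Return the amino-acid string for the given NCBI table number,
 * or NULL if that number is not in the list. Matching on `id`
 * rather than on array position keeps gaps in the numbering from
 * shifting every later table by one or more slots.
 */
const char *find_codon_table(int ncbi_id)
{
    size_t i, n = sizeof(codon_tables) / sizeof(codon_tables[0]);

    for (i = 0; i < n; ++i)
        if (codon_tables[i].id == ncbi_id)
            return codon_tables[i].aa;
    return NULL;
}
```

With this layout, asking for table 6 returns the ciliate nuclear code (TAA and TAG read as Q), while asking for a number that is absent from the list, such as 7, returns NULL so the caller can report an error.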

lh3 commented 4 months ago

That is a bug. Now fixed. Thanks!

george-coulouris commented 1 month ago

Hi @lh3!

Can you add support for translation table 25?

Thanks!

lh3 commented 1 month ago

It should be there.