samnooij closed this issue 2 years ago
Dear @samnooij,
thanks for the heads-up and the very detailed report. We recognized this bug just yesterday; it occurs on only a few genomes. We've pinpointed it to cases in which short, very proximate CDS are encoded in each other's elongated 5'/3' regions, and we have a patch for this: #130.
We're working on this and plan to release the fix in v1.5.1 soon.
I have been running Bakta on >200 genome sequences from bacterial isolates and metagenome-assembled genomes. On all but one this has worked flawlessly. On that one genome, I get an IndexError (see below). This appears to happen in the "detect pseudogenes" stage. My guess is that there may be an alignment of length 0, so that any index is out of range. Would that be possible? As a work-around I have now run the same command with the `--skip-pseudo` flag. That works without problems.
This error occurs when I run bakta as:
(Names in {curly braces} are Snakemake variables. As I said, this worked fine with all other genomes.)
I installed bakta with conda, using this YAML file:
(With the command: `mamba env create -f bakta.yaml`.)
I have also tested that one genome with the `--verbose` flag. Please find the resulting log file enclosed: test_bakta_error-logfile.txt
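For anyone running many genomes in a batch, the `--skip-pseudo` work-around mentioned above could be applied automatically only to the genomes that fail. The following is a minimal sketch, not part of Bakta: `build_cmd` and `run_with_fallback` are hypothetical helper names, and the base command is whatever list of arguments you already pass to bakta.

```python
import subprocess
from typing import Callable, List, Optional, Sequence


def build_cmd(base_cmd: Sequence[str], skip_pseudo: bool) -> List[str]:
    """Return a copy of the command, optionally with --skip-pseudo added."""
    cmd = list(base_cmd)
    if skip_pseudo and "--skip-pseudo" not in cmd:
        # Insert right after the executable so the positional genome
        # argument stays in its original place.
        cmd.insert(1, "--skip-pseudo")
    return cmd


def run_with_fallback(
    base_cmd: Sequence[str],
    runner: Optional[Callable[[List[str]], int]] = None,
) -> bool:
    """Run the command; on failure, retry once with --skip-pseudo.

    Returns True if the fallback run was needed, False otherwise.
    `runner` takes a command list and returns an exit code; it is
    injectable so the logic can be tested without calling bakta.
    """
    if runner is None:
        runner = lambda cmd: subprocess.run(cmd).returncode

    if runner(build_cmd(base_cmd, skip_pseudo=False)) == 0:
        return False

    if runner(build_cmd(base_cmd, skip_pseudo=True)) != 0:
        raise RuntimeError("command failed even with --skip-pseudo")
    return True
```

Note that this retries on any non-zero exit code, not only on the IndexError described here, so it is worth checking the log of every genome for which the function returns True.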