Closed Rridley7 closed 1 year ago
Hi @Rridley7, thanks for the report. The bug in bakta_plot was a simple unbound variable. It's fixed in https://github.com/oschwengers/bakta/commit/0ad59de1dbd4622179e51c0694d780a3324434b0 and will be available in the upcoming 1.6.1 patch release. Until then, note that it does not occur when running without --verbose or --debug.
Regarding the first bug: it seems to be related to Circos. To debug this further, could you provide either the Circos logs stored in /tmp/tmp2gi_5r6_ or the genome itself?
Genome file is attached, thanks! S10_4d3374_mtb_idb_t.15.fa.zip
Hi,
I am having a difficult time understanding the output of the circular genome. Is there a legend or manual I may read up on to understand what is being plotted?
Although it is not visualized here, what does the third circle mean when you run the COG command? I understand that the extra features are features not present on the forward or reverse strand. How is this possible?
@Rridley7 The cause is the default value (200) of the Circos max_ideograms setting, which prevents Circos from creating overcrowded figures. As a result, plotting fails on genomes with more than 200 contigs.
Unfortunately, I wasn't aware of this, since I only tested it on complete and "better" draft genomes. For now, please use the --skip-plot option to skip this step. I'll come up with a solution and a patch version soon.
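Until a fix lands, one way to decide up front whether a genome needs --skip-plot is to count its contigs. The following is only a minimal sketch, not part of Bakta: the helper names `count_contigs` and `needs_skip_plot` are hypothetical, and the limit of 200 is simply the default max_ideograms value mentioned above.

```python
import sys

# Default Circos max_ideograms value cited in this thread (assumption: unchanged locally).
CIRCOS_MAX_IDEOGRAMS = 200


def count_contigs(lines):
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


def needs_skip_plot(n_contigs, limit=CIRCOS_MAX_IDEOGRAMS):
    """Return True if the contig count exceeds the ideogram limit."""
    return n_contigs > limit


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python check_contigs.py genome.fa
    with open(sys.argv[1]) as handle:
        n = count_contigs(handle)
    if needs_skip_plot(n):
        print(f"{n} contigs: consider running bakta with --skip-plot")
    else:
        print(f"{n} contigs: plotting should stay within the limit")
```

The check reads only header lines, so it works on large draft assemblies without loading sequences into memory.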
@Proelmocan23 Fair point! I'll add a more detailed description to the readme soon. Currently, there are two types of genome plots, called feature and cog:
Feature: All features are plotted on the two outer rings, which represent the forward and reverse strands: coding genes in grey, non-coding features in color. The green/red circle represents the GC content per sliding window over the entire sequence(s), with green and red representing GC above and below average, respectively. The yellow/blue innermost circle represents the GC skew, a common plot that gives some hints about the replication bubble of the replicon and hence about the completeness and correctness of the assembly. On a complete bacterial genome, you normally see two inflection points: one at the origin of replication and one at the opposite point on the chromosome (see Wikipedia).
COG: All protein-coding genes (CDS) are colored according to their COG functional categories. To better distinguish the colored non-coding genes, these are plotted on an additional, separate inner ring. GC content and GC skew follow as described above.
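To make the GC content and GC skew rings described above more concrete, here is a minimal Python sketch of how such per-window values can be computed. It is not Bakta's actual implementation: the function names, the window size, and the non-overlapping step are assumptions chosen for illustration. GC skew uses the common definition (G - C) / (G + C).

```python
def gc_windows(seq, window=1000, step=None):
    """Yield (start, gc_content, gc_skew) for each full window of the sequence.

    gc_content = (G + C) / window length
    gc_skew    = (G - C) / (G + C), or 0.0 if the window has no G or C
    """
    seq = seq.upper()
    step = step or window  # assumption: non-overlapping windows by default
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        gc_content = (g + c) / len(chunk)
        gc_skew = (g - c) / (g + c) if (g + c) else 0.0
        yield start, gc_content, gc_skew


def deviation_from_mean(values):
    """Return each value minus the mean of all values.

    For GC content, positive deviations correspond to the 'above average'
    (green) windows and negative ones to the 'below average' (red) windows.
    """
    mean = sum(values) / len(values)
    return [v - mean for v in values]


if __name__ == "__main__":
    toy = "GGGG" + "CCCC" + "ATAT"  # toy sequence, not real data
    rows = list(gc_windows(toy, window=4))
    for (start, gc, skew), dev in zip(rows, deviation_from_mean([r[1] for r in rows])):
        print(start, gc, skew, round(dev, 3))
```

On a real complete chromosome, the sign of the skew values typically flips near the origin of replication and near the opposite point, which is what produces the two inflection points in the innermost ring.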
I've added a plot description to the readme. Since all requests/issues are handled, I'll close this issue.
Hi, thanks for the great work on this tool, I have already found it very useful! I have run into an error with the new bakta plot feature, after calling bakta on a genome using default settings, or calling bakta_plot on a previously made json file.
Bakta was installed via mamba (conda) with
mamba install bakta
For the case of calling bakta_plot: The input command:
bakta_plot S12_1a9859_mtb_spa_t.1.json
Returns:
When run with debug flag (not sure if this is the same error):
bakta_plot --debug S12_1a9859_mtb_spa_t.1.json
When run on a full genome: Command:
bakta --db /storage/home/hcoda1/6/rridley3/shared3/DB/bakta/db --debug S10_4d3374_mtb_idb_t.15.fa
The output of this is attached; however, the error is the same message as the first. bakta_debug.txt