Open samuell opened 3 weeks ago
Based on some testing with a separate workflow that tries to mimic the steps of microSALT for assembly and resistance blasting, it turns out the reason VIM is not reported is because of too small alignment lenghts in the blasting against the beta-lactamase file from ResFinder db. Because of the thresholds for the report (97% identity and 90% alignment length), the matches therefore don't end up in the report, although some matches are found in the blast results.
I did test a lot of combinations of assemblers (Spades vs Skesa) and options such as (Trimming vs No trimming) and the various assembly mode flags to Spades. I will exclude other flags than --isolate
here though, since we need to use that to fix the arcC
issue with NovaSeq X, but I found this:
--isolate
flag, the alignment length increased a bit for VIM genes when turning off trimming (from ca 433bp to ca 612bp), although it is not above 90% of the full VIM length at 801bp for one of the variants, which is our treshold.Thus, the results suggest that switching the assembler from Spades to Skesa and keeping the trimming active might potentially resolve this issue.
I actually tried replacing Spades with SKESA now (in the branch |/180-fix-missing-vim-genes and now the results look pretty promising.
These are the blast results (801 bp should be the full VIM gene, for most variants):
$ grep blaVIM [...path.hidden...]/results/ACC14648_2024.8.22_18.3.59/ACC14648A3/blast_search/resistance/beta-lactam.txt | awk '( $5 > 97.0 && $12 > 720 ) { print $1 "\t" $5 "\t" $12 }'
blaVIM-4_1_EU581706 99.875 801
blaVIM-54_1_KY508061 99.750 801
blaVIM-40_1_HG934765 99.750 801
blaVIM-43_1_KP096412 99.750 801
blaVIM-37_1_JX982636 99.750 801
blaVIM-28_1_JF900599 99.750 801
blaVIM-19_1_FJ499397 99.750 801
blaVIM-14_1_FJ445404 99.750 801
blaVIM-1_1_Y18050 99.750 801
blaVIM-57_1_MH450217 99.625 801
blaVIM-52_1_KX349731 99.625 801
blaVIM-42_1_KP071470 99.625 801
blaVIM-35_1_JX982634 99.625 801
blaVIM-34_1_JX013656 99.625 801
blaVIM-33_1_JX258134 99.625 801
blaVIM-32_1_JN676230 99.625 801
blaVIM-27_1_HQ858608 99.625 801
blaVIM-26_1_FR748153 99.625 801
blaVIM-39_1_KF131539 99.501 801
blaVIM-29_1_JX311308 99.501 801
blaVIM-5_1_DQ023222 98.002 801
blaVIM-49_1_KU663374 97.878 801
blaVIM-12_1_DQ143913 97.878 801
blaVIM-38_1_KC469971 97.503 801
The report does not include them though, since the scraping depends on the contig naming scheme used in Spades, so that logic needs to be updated.
But based on all I can see, they are now properly detected.
Describe the bug
For some datasets where ResFinder ran on the raw reads report blaVIM beta lactamase resistance genes, there is no report of VIM genes at all in microSALT.
KS reports:
To Reproduce Steps to reproduce the behavior:
24ET500149
with microSALT 3.3.5Expected behavior The report for "24ET500149" should include
blaVIM-4
.Software version (please complete the following information):
Additional context