sanger-pathogens / seroba

k-mer based Pipeline to identify the Serotype from Illumina NGS reads
https://sanger-pathogens.github.io/seroba/

11A/11C Misidentification #55

Closed garfinjm closed 3 years ago

garfinjm commented 4 years ago

Hello!

I work at the Minnesota Department of Health. Internally, we have been using Seroba as a replacement for our conventional/molecular serotyping of Strep pneumo for a few months now.

We recently sequenced a handful of 11As (previously serotyped by Quellung) that Seroba calls 11C. We've tried running Seroba in a number of different environments/containers, and it consistently predicts 11C. Manually mapping the reads to each cps locus also seems to give better results for 11A than for 11C.

Any idea what might be happening here? I'd be happy to share the sequence data privately.

Thanks!

eppinglen commented 4 years ago

Hello Jake,

Thank you for reporting this. I recently discovered some issues with the latest version of Ariba when building the database. Which version are you using? Could you try installing Ariba version 2.11.1 and rebuilding the database? (#53)

Best, Lennard

garfinjm commented 3 years ago

Hi Lennard,

Thanks for your response. Our validated Singularity container was pulled from Docker Hub on 2019-03-04. It has Seroba 1.0.0 and Ariba 2.13.1.

Singularity> seroba version
1.0.0
Singularity> ariba version
WARNING: spades not found in path. Looked for spades.py
ARIBA version: 2.13.1

External dependencies:
bowtie2 2.3.4.3 /usr/bin/bowtie2
cdhit   4.7     /usr/bin/cd-hit-est
nucmer  3.1     /usr/bin/nucmer
spades  NA      NOT_FOUND

External dependencies OK: True

Python version:
3.6.7 (default, Oct 21 2018, 08:08:16)
[GCC 8.2.0]

Python packages:
ariba   2.13.1  /usr/local/lib/python3.6/dist-packages/ariba-2.13.1-py3.6-linux-x86_64.egg/ariba/__init__.py
bs4     4.6.3   /usr/lib/python3/dist-packages/bs4/__init__.py
dendropy        4.4.0   /usr/lib/python3/dist-packages/dendropy/__init__.py
pyfastaq        3.17.0  /usr/lib/python3/dist-packages/pyfastaq/__init__.py
pymummer        0.10.3  /usr/lib/python3/dist-packages/pymummer/__init__.py
pysam   0.14    /usr/lib/python3/dist-packages/pysam/__init__.py

Python packages OK: True

Everything looks OK: True

I built a container with Seroba 1.0.1 and Ariba 2.11.1, and I'm still seeing the same issue. (Seroba 1.0.2 gave an error when I tried to build the database, because the --max_noncoding_length option is not available prior to Ariba 2.14.3.)
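The version constraint mentioned above can be checked before building a database. The following is a minimal sketch, not part of Seroba or Ariba: the helper names are hypothetical, and the 2.14.3 threshold is taken only from the statement in this comment.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.13.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Assumption (from the comment above): --max_noncoding_length is only
# available from Ariba 2.14.3 onwards.
MIN_ARIBA_FOR_MAX_NONCODING = parse_version("2.14.3")


def supports_max_noncoding_length(ariba_version):
    """Return True if this Ariba version should accept --max_noncoding_length."""
    return parse_version(ariba_version) >= MIN_ARIBA_FOR_MAX_NONCODING


# Versions mentioned in this thread.
for version in ("2.11.1", "2.13.1", "2.14.3"):
    print(version, supports_max_noncoding_length(version))
```

Comparing integer tuples rather than raw strings avoids lexical surprises such as "2.9.0" sorting after "2.14.3".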

The Dockerfile used to build the new container is attached: Dockerfile.txt

Here's an example of the terminal output for one of the samples with the new container:

Singularity>  seroba runSerotyping /seroba-1.0.1/database/ C19-1659-MN-M05144-200805_S11_L001_1.fastq.gz C19-1659-MN-M05144-200805_S11_L001_2.fastq.gz C19-1659
 -ci4  -m1 -t1
/seroba-1.0.1/build/kmc -k71  -ci4  -m1 -t1 C19-1659-MN-M05144-200805_S11_L001_1.fastq.gz /data/temp.kmcaxfnbfku/C19-1659 /data/temp.kmcaxfnbfku
**************************************************************************************************************************************************
Stage 1: 100%
Stage 2: 100%
1st stage: 6.41395s
2nd stage: 5.35194s
Total    : 11.7659s
Tmp size : 62MB

Stats:
   No. of k-mers below min. threshold :      2644458
   No. of k-mers above max. threshold :            0
   No. of unique k-mers               :      4483018
   No. of unique counted k-mers       :      1838560
   Total no. of k-mers                :     67983400
   Total no. of reads                 :       457326
   Total no. of super-k-mers          :      2431172
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/01/01 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.02843465220902065
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/02/02 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.02108450945660248
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/03/03 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.0
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/04/04 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.021444281524926685
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/05/05 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.01430755120758178
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/06A/06A intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.04149770208164369
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/06B/06B intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.025792281142139942
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/06C/06C intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.04667124227865477
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/06D/06D intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.04135853190906601
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/06E/06E intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.0271573939474032
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/06F/06F intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.08962465344423118
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/06G/06G intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.07182026300378001
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/07A/07A intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.020811959548585664
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/07B/07B intersect /data/temp.kmcaxfnbfku/inter
in1: 100% in2: 100%
in1: 100%
0.01289266346055459
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/07C/07C intersect /data/temp.kmcaxfnbfku/inter
in1: 100% in2: 100%
in1: 100%
0.029398978371916094
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/07F/07F intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.02083129584352078
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/08/08 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.029250911501491547
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/09A/09A intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.04616362375779986
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/09L/09L intersect /data/temp.kmcaxfnbfku/inter
in1: 100% in2: 100%
in1: 100%
0.0394415133187492
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/09N/09N intersect /data/temp.kmcaxfnbfku/inter
in1: 100% in2: 100%
in1: 100%
0.055909412597310686
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/09V/09V intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.04696978450517072
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/10A/10A intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.02432279109346192
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/10B/10B intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.017455474348967572
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/10C/10C intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.040331056227009984
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/10F/10F intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.0403125205173659
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/10X/10X intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.020232073787563226
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/11A/11A intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.4546560345992702
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/11B/11B intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.07335273487053205
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/11C/11C intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.08158995815899582
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/11D/11D intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.4760778483578862
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/11E/11E intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.04233870967741935
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/11F/11F intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.045836979201723096
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/12A/12A intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.06419765460559286
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/12B/12B intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.013055788590604026
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/12F/12F intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.015355589329699701
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/13/13 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.02206486691534425
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/14/14 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.024541783162472817
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/15A/15A intersect /data/temp.kmcaxfnbfku/inter
in1: 100% in2: 100%
in1: 100%
0.028593977659057794
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/15B/15B intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.023285274778307292
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/15C/15C intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.023282465769949936
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/15F/15F intersect /data/temp.kmcaxfnbfku/inter
in1: 100% in2: 100%
in1: 100%
0.034544573643410854
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/16A/16A intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.016145756184268997
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/16F/16F intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.048317046688382194
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/17A/17A intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.02088080732920988
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/17F/17F intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.020780946054941404
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/18A/18A intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.02301938474504846
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/18B/18B intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.011554414378826783
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/18C/18C intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.011554414378826783
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/18F/18F intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.15003093912132895
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/19A/19A intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.09559167837313533
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/19B/19B intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.021046493134708157
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/19C/19C intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.022117476432197244
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/19F/19F intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.02496626180836707
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/20/20 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.034039854114910936
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/21/21 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.019122209225448814
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/22A/22A intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.020024067388688328
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/22F/22F intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.043052270862493365
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/23A/23A intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.019714407502131288
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/23B/23B intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.04966668362933184
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/23B1/23B1 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.0717988714043217
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/23F/23F intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.015066216288670848
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/24A/24A intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.019219004893964112
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/24B/24B intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.026887926887926888
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/24F/24F intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.02308580223162755
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/25A/25A intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.015383722280274004
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/25F/25F intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.015383722280274004
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/27/27 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.03792094544202857
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/28A/28A intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.02451373571285342
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/28F/28F intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.030356158893953815
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/29/29 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.02342979283130549
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/31/31 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.030933492990071348
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/32A/32A intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.032581158310257465
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/32F/32F intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.03077693677649154
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/33A/33A intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.029295126007419726
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/33B/33B intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.02867337836115713
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/33C/33C intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.025859658932065977
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/33D/33D intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.03850083003447836
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/33F/33F intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.030284216335540837
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/34/34 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.03313296121368255
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/35A/35A intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.0375594833446635
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/35B/35B intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.05760759064723822
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/35C/35C intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.04039888945549323
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/35D/35D intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.056183488664154936
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/35F/35F intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.03508910891089109
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/36/36 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.07089415581018155
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/37/37 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.0
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/38/38 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.022662311147407105
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/39/39 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.02206531332744925
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/39X/39X intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.020657092268345466
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/40/40 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% in2: 100%
in1: 100%
0.012897282358360202
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/41A/41A intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.023675427617855654
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/41F/41F intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.02358467485443069
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/42/42 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.04068219162558785
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/43/43 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.01530841963079694
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/44/44 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.015828921851250066
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/45/45 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.051021255396878115
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/46/46 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.06063947078280044
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/47A/47A intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.04164466737064414
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/47F/47F intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.050920289336102556
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/48/48 intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.013836747827952375
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/Swiss_NT/Swiss_NT intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.0
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/alternative_aliB_NT/alternative_aliB_NT intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.0
11D
WARNING: spades not found in path. Looked for spades.py
cluster detected 1 threads available to it
cluster reported completion
Stopping! Signal received: 13
Stopping! Signal received: 13
WARNING: spades not found in path. Looked for spades.py
cluster detected 1 threads available to it
cluster reported completion
cluster_1 detected 1 threads available to it
cluster_1 reported completion
cluster_2 detected 1 threads available to it
cluster_2 reported completion
cluster_3 detected 1 threads available to it
cluster_3 reported completion
Stopping! Signal received: 13
Stopping! Signal received: 13
{'11A': 8, '11B': 4, '11C': 4, '11D': 8, '11E': 8, '11F': 8}
11A
{'genes': [], 'pseudo': [], 'allele': [], 'snps': []}
gct
{'genes': [], 'pseudo': ['gct'], 'allele': [], 'snps': []}
wcjE
{'genes': [], 'pseudo': ['gct', 'wcjE'], 'allele': [], 'snps': []}
11B
{'genes': [], 'pseudo': [], 'allele': [], 'snps': []}
11C
{'genes': [], 'pseudo': [], 'allele': [], 'snps': []}
gct
{'genes': [], 'pseudo': ['gct'], 'allele': [], 'snps': []}
11D
{'genes': [], 'pseudo': [], 'allele': [], 'snps': []}
gct
{'genes': [], 'pseudo': ['gct'], 'allele': [], 'snps': []}
11E
{'genes': [], 'pseudo': [], 'allele': [], 'snps': []}
gct
{'genes': [], 'pseudo': ['gct'], 'allele': [], 'snps': []}
11F
{'genes': [], 'pseudo': [], 'allele': [], 'snps': []}
{'11A': 3.0, '11B': 4, '11C': 1.5, '11D': 5.5, '11E': 5.5, '11F': 8}
{'11A': {'genes': [], 'pseudo': ['gct', 'wcjE'], 'allele': [], 'snps': []}, '11B': {'genes': [], 'pseudo': [], 'allele': [], 'snps': []}, '11C': {'genes': [], 'pseudo': ['gct'], 'allele': [], 'snps': []}, '11D': {'genes': [], 'pseudo': ['gct'], 'allele': [], 'snps': []}, '11E': {'genes': [], 'pseudo': ['gct'], 'allele': [], 'snps': []}, '11F': {'genes': [], 'pseudo': [], 'allele': [], 'snps': []}}
['11C']
{'11A': 3.0, '11B': 4, '11C': 1.5, '11D': 5.5, '11E': 5.5, '11F': 8}
{'11A': {'genes': [], 'pseudo': ['gct', 'wcjE'], 'allele': [], 'snps': []}, '11B': {'genes': [], 'pseudo': [], 'allele': [], 'snps': []}, '11C': {'genes': [], 'pseudo': ['gct'], 'allele': [], 'snps': []}, '11D': {'genes': [], 'pseudo': ['gct'], 'allele': [], 'snps': []}, '11E': {'genes': [], 'pseudo': ['gct'], 'allele': [], 'snps': []}, '11F': {'genes': [], 'pseudo': [], 'allele': [], 'snps': []}}
Singularity> cat C19-1659/pred.tsv
C19-1659        11C

Let me know if there is anything else you would like me to try! I'm happy to share the reads as well. I'm beginning to wonder whether this might be a wet-lab/sequencing issue rather than a bioinformatics issue, as these samples all came off the same run.

eppinglen commented 3 years ago

Here you can see that your sequencing data has a much higher overlap with serotype 11A (0.45) than with 11C (0.081):

in1: 100% n2: 100%
0.4546560345992702
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/11B/11B intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.07335273487053205
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/11C/11C intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.08158995815899582
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/11D/11D intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.4760778483578862
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/11E/11E intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.04233870967741935
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/11F/11F intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.045836979201723096
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/11A/11A intersect /data/temp.kmcaxfnbfku/inter

In general, this means serotype 11A is correct. However, it is also possible that your strains carry a hybrid genotypic variant that shows serotype 11A as its phenotype, or that they are 11A but with some mutations that are known for 11C. I think it would help if you dropped me an e-mail (eppingl[at]rki.de) with some of the data.
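To compare the overlap values without reading the log by eye, the terminal output can be parsed and ranked. This is a rough sketch that assumes the log layout shown in this thread (a kmc_tools command line naming the serotype's k-mer database, then progress lines, then a single number); `parse_overlaps` is a hypothetical helper, not a Seroba function.

```python
import re

# Matches the serotype directory in a kmc_tools command line as printed above,
# e.g. ".../database/kmer_db/11A/11A intersect ...".
KMER_DB = re.compile(r"/kmer_db/([^/\s]+)/[^/\s]+ intersect ")


def parse_overlaps(log_text):
    """Map each serotype to the overlap value printed after its command line."""
    overlaps = {}
    current = None
    for line in log_text.splitlines():
        line = line.strip()
        match = KMER_DB.search(line)
        if match:
            current = match.group(1)
            continue
        if current is None:
            continue
        try:
            value = float(line)
        except ValueError:
            continue  # progress lines such as "in1: 100% in2: 100%"
        overlaps[current] = value
        current = None
    return overlaps


# Three entries copied from the log in this thread.
sample = """\
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/11A/11A intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.4546560345992702
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/11C/11C intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.08158995815899582
/seroba-1.0.1/build/kmc_tools simple /data/temp.kmcaxfnbfku/C19-1659 /seroba-1.0.1/database/kmer_db/11D/11D intersect /data/temp.kmcaxfnbfku/inter
in1: 100% n2: 100%
0.4760778483578862
"""

ranked = sorted(parse_overlaps(sample).items(), key=lambda kv: kv[1], reverse=True)
for serotype, value in ranked:
    print(f"{serotype}\t{value:.3f}")
```

On the full log, this makes it easy to see that 11A and 11D both sit far above 11C for this sample.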

Best, Lennard

garfinjm commented 3 years ago

It looks like this was probably caused by low/spotty coverage of the second half of the gene cassette for these few samples.
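One way to spot this kind of problem is to compare per-base depth between the two halves of the cps reference after mapping. The sketch below assumes tab-separated reference/position/depth text such as `samtools depth -a` produces; the reference name, the toy depths, and the `min_depth=10` cutoff are all illustrative assumptions, not values from these samples.

```python
def low_coverage_fraction(depth_text, start, end, min_depth=10):
    """Fraction of positions in [start, end] (1-based) with depth below min_depth."""
    low = total = 0
    for line in depth_text.splitlines():
        if not line.strip():
            continue
        _ref, pos, depth = line.split("\t")[:3]
        pos, depth = int(pos), int(depth)
        if start <= pos <= end:
            total += 1
            if depth < min_depth:
                low += 1
    return low / total if total else float("nan")


# Toy data: a 10 bp hypothetical reference whose second half is poorly covered.
depths = [40, 40, 40, 40, 40, 3, 0, 2, 35, 1]
depth_text = "\n".join(
    f"11A_cps\t{pos}\t{depth}" for pos, depth in enumerate(depths, start=1)
)

length = len(depths)
first_half = low_coverage_fraction(depth_text, 1, length // 2)
second_half = low_coverage_fraction(depth_text, length // 2 + 1, length)
print(f"low-coverage fraction, first half:  {first_half:.2f}")
print(f"low-coverage fraction, second half: {second_half:.2f}")
```

A large gap between the two fractions would point to uneven coverage rather than a genuine genotype difference.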

Closing the issue. Thanks again @eppinglen