ksahlin / NGSpeciesID

Reference-free clustering and consensus forming of long-read amplicon sequencing
GNU General Public License v3.0

consensus.sh error #5

Closed rkimoakbioinformatics closed 3 years ago

rkimoakbioinformatics commented 3 years ago

OS: Ubuntu 20.04

Installation: conda list shows

Name Version Build Channel
_libgcc_mutex 0.1 main
_tflow_select 2.3.0 eigen
absl-py 0.11.0 py38h578d9bd_0 conda-forge
aiohttp 3.7.3 py38h25fe258_0 conda-forge
astor 0.8.1 pyh9f0ad1d_0 conda-forge
astunparse 1.6.3 pyhd8ed1ab_0 conda-forge
async-timeout 3.0.1 py_1000 conda-forge
attrs 20.3.0 pyhd3deb0d_0 conda-forge
bcftools 1.11 h7c999a4_0 bioconda
biopython 1.78 py38h25fe258_1 conda-forge
blinker 1.4 py_1 conda-forge
brotlipy 0.7.0 py38h8df0ef7_1001 conda-forge
bzip2 1.0.8 h516909a_3 conda-forge
c-ares 1.17.1 h36c2ea0_0 conda-forge
ca-certificates 2020.12.5 ha878542_0 conda-forge
cachetools 4.2.1 pyhd8ed1ab_0 conda-forge
certifi 2020.12.5 py38h578d9bd_1 conda-forge
cffi 1.14.5 py38h261ae71_0
chardet 3.0.4 py38h924ce5b_1008 conda-forge
click 7.1.2 pyh9f0ad1d_0 conda-forge
cryptography 3.2.1 py38h7699a38_0 conda-forge
decorator 4.4.2 py_0 conda-forge
gast 0.3.3 py_0 conda-forge
google-auth 1.24.0 pyhd3deb0d_0 conda-forge
google-auth-oauthlib 0.4.1 py_2 conda-forge
google-pasta 0.2.0 pyh8c360ce_0 conda-forge
grpcio 1.33.2 py38heead2fc_2 conda-forge
gsl 2.6 hf94e986_0 conda-forge
h5py 2.10.0 nompi_py38h7442b35_105 conda-forge
hdf5 1.10.6 nompi_h7c3c948_1111 conda-forge
htslib 1.11 hd3b49d5_1 bioconda
idna 2.10 pyh9f0ad1d_0 conda-forge
importlib-metadata 3.7.0 py38h578d9bd_0 conda-forge
intervaltree 3.0.2 py_0 conda-forge
k8 0.2.5 he513fc3_0 bioconda
keras-preprocessing 1.1.2 pyhd8ed1ab_0 conda-forge
krb5 1.17.2 h926e7f8_0 conda-forge
ld_impl_linux-64 2.33.1 h53a641e_7
libblas 3.9.0 8_openblas conda-forge
libcblas 3.9.0 8_openblas conda-forge
libcurl 7.71.1 hcdd3856_3 conda-forge
libdeflate 1.6 h516909a_0 conda-forge
libedit 3.1.20191231 h14c3975_1
libffi 3.3 he6710b0_2
libgcc-ng 9.1.0 hdf63c60_0
libgfortran-ng 7.5.0 h14aa051_18 conda-forge
libgfortran4 7.5.0 h14aa051_18 conda-forge
liblapack 3.9.0 8_openblas conda-forge
libopenblas 0.3.12 pthreads_hb3c22a3_1 conda-forge
libprotobuf 3.13.0.1 h8b12597_0 conda-forge
libssh2 1.9.0 hab1572f_5 conda-forge
libstdcxx-ng 9.1.0 hdf63c60_0
lz4-c 1.9.2 he1b5a44_3 conda-forge
mappy 2.17 py38h197edbe_2 bioconda
markdown 3.3.4 pyhd8ed1ab_0 conda-forge
medaka 1.2.2 py38h64b100c_0 bioconda
minimap2 2.17 hed695b0_3 bioconda
multidict 4.7.5 py38h1e0a361_2 conda-forge
ncurses 6.2 he6710b0_1
networkx 2.5 py_0 conda-forge
ngspeciesid 0.1.1.1 dev_0
numpy 1.19.4 py38hf0fd68c_1 conda-forge
oauthlib 3.0.1 py_0 conda-forge
ont-fast5-api 3.3.0 py_0 bioconda
openblas 0.3.12 pthreads_h43bd3aa_1 conda-forge
openssl 1.1.1j h27cfd23_0
opt_einsum 3.3.0 py_0 conda-forge
packaging 20.9 pyh44b312d_0 conda-forge
parasail 1.1.11 pypi_0 pypi
perl 5.32.0 h36c2ea0_0 conda-forge
pip 21.0.1 py38h06a4308_0
progressbar33 2.4 py_0 conda-forge
protobuf 3.13.0.1 py38hadf7658_1 conda-forge
pyasn1 0.4.8 py_0 conda-forge
pyasn1-modules 0.2.7 py_0 conda-forge
pycparser 2.20 pyh9f0ad1d_2 conda-forge
pyfaidx 0.5.9.5 pyh3252c3a_0 bioconda
pyjwt 2.0.1 pyhd8ed1ab_0 conda-forge
pyopenssl 20.0.1 pyhd8ed1ab_0 conda-forge
pyparsing 2.4.7 pyh9f0ad1d_0 conda-forge
pysam 0.16.0.1 py38hbdc2ae9_1 bioconda
pysocks 1.7.1 py38h578d9bd_3 conda-forge
pyspoa 0.0.3 py38h6ed170a_1 bioconda
python 3.8.8 hdb3f193_4
python-edlib 1.3.8.post2 py38hed8969a_0 bioconda
python_abi 3.8 1_cp38 conda-forge
racon 1.4.20 he513fc3_0 bioconda
readline 8.1 h27cfd23_0
requests 2.25.1 pyhd3deb0d_0 conda-forge
requests-oauthlib 1.3.0 pyh9f0ad1d_0 conda-forge
rsa 4.7.2 pyh44b312d_0 conda-forge
samtools 1.11 h6270b1f_0 bioconda
scipy 1.5.3 py38h828c644_0 conda-forge
setuptools 52.0.0 py38h06a4308_0
six 1.15.0 pyh9f0ad1d_0 conda-forge
sortedcontainers 2.3.0 pyhd8ed1ab_0 conda-forge
spoa 4.0.7 he513fc3_0 bioconda
sqlite 3.33.0 h62c20be_0
tar 1.32 hd4ba37b_0 conda-forge
tensorboard 2.4.1 pyhd8ed1ab_0 conda-forge
tensorboard-plugin-wit 1.8.0 pyh44b312d_0 conda-forge
tensorflow 2.3.0 eigen_py38h71ff20e_0
tensorflow-base 2.3.0 eigen_py38hb57a387_0
tensorflow-estimator 2.4.0 pyh9656e83_0 conda-forge
termcolor 1.1.0 py_2 conda-forge
tk 8.6.10 hbc83047_0
typing-extensions 3.7.4.3 0 conda-forge
typing_extensions 3.7.4.3 py_0 conda-forge
urllib3 1.26.3 pyhd8ed1ab_0 conda-forge
werkzeug 1.0.1 pyh9f0ad1d_0 conda-forge
whatshap 1.0 py38hed8969a_1 bioconda
wheel 0.36.2 pyhd3eb1b0_0
wrapt 1.12.1 py38h25fe258_2 conda-forge
xopen 1.0.1 py38h578d9bd_0 conda-forge
xz 5.2.5 h7b6447c_0
yarl 1.6.3 py38h25fe258_0 conda-forge
zipp 3.4.1 pyhd8ed1ab_0 conda-forge
zlib 1.2.11 h7b6447c_3
zstd 1.4.5 h6597ccf_2 conda-forge

Installing NGSpeciesID with pip install NGSpeciesID failed because of a dependency conflict (biopython version mismatch). I cloned this repo with git, removed the version requirements in setup.py, and added Python 3.8 to setup.py. Then I ran pip install -e . in the cloned repo folder.
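
To make the setup.py change concrete: removing a version requirement amounts to reducing each pinned requirement string to its bare package name. The following is only a minimal Python sketch. The requirement strings are hypothetical (the actual install_requires entries in setup.py are not reproduced in this report), and strip_version_pin is an illustrative helper, not part of NGSpeciesID.

```python
import re

# Hypothetical pinned requirements, for illustration only; the real
# install_requires list in NGSpeciesID's setup.py is not shown in this issue.
pinned = ["biopython==1.70", "numpy>=1.16", "edlib"]


def strip_version_pin(requirement: str) -> str:
    """Return the bare package name, dropping any version specifier."""
    # Cut at the first character that can start a version operator
    # (==, >=, <=, ~=, !=, <, >) and trim surrounding whitespace.
    return re.split(r"[<>=!~]", requirement, maxsplit=1)[0].strip()


unpinned = [strip_version_pin(r) for r in pinned]
print(unpinned)
```

Dropping the pins lets pip accept whatever versions are already in the environment (here biopython 1.78 from conda-forge), at the cost of no longer guaranteeing the versions the package was tested against.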

Then,

cd test ./consensus.sh started sorting seqs <multiprocessing.context.SpawnContext object at 0x7f8491520400> Environment set: <multiprocessing.context.SpawnContext object at 0x7f8491520400> Using 8 cores. [376, 376, 376, 376, 376, 376, 376, 368] 0 reads processed. 0 reads processed. 0 reads processed. 0 reads processed. 0 reads processed. 0 reads processed. 0 reads processed. 0 reads processed. Time elapesd multiprocessing: 0.7439327239990234 Batch index 0 Batch index 1 Batch index 2 Batch index 3 Batch index 4 Batch index 5 Batch index 6 Batch index 7 3000 reads passed quality critera (avg phred Q val over 7.0 and length > 2*k) and will be clustered. Sorted all reads in 0.8293123245239258 seconds. elapsed time sorting: 0.8303885459899902 Number of reads with read length in interval [700,900]: 2792 Started imported empirical error probabilities of minimizers shared: {(0.15, 0.15): 0.06900003127096205, (0.15, 0.14): 0.07436980320203763, (0.14, 0.15): 0.07436980320203763, (0.15, 0.13): 0.08011057845402268, (0.13, 0.15): 0.08011057845402268, (0.15, 0.12): 0.08992262772234068, (0.12, 0.15): 0.08992262772234068, (0.15, 0.11): 0.09562718186369078, (0.11, 0.15): 0.09562718186369078, (0.15, 0.1): 0.10501923949009649, (0.1, 0.15): 0.10501923949009649, (0.15, 0.09): 0.11603874299408275, (0.09, 0.15): 0.11603874299408275, (0.15, 0.08): 0.1230830045457836, (0.08, 0.15): 0.1230830045457836, (0.15, 0.07): 0.13547908906937794, (0.07, 0.15): 0.13547908906937794, (0.15, 0.06): 0.14787145435046953, (0.06, 0.15): 0.14787145435046953, (0.15, 0.05): 0.163060044027815, (0.05, 0.15): 0.163060044027815, (0.15, 0.04): 0.1760312862321838, (0.04, 0.15): 0.1760312862321838, (0.15, 0.03): 0.19394756744755604, (0.03, 0.15): 0.19394756744755604, (0.15, 0.02): 0.20835904420975648, (0.02, 0.15): 0.20835904420975648, (0.15, 0.01): 0.22725325724648648, (0.01, 0.15): 0.22725325724648648, (0.14, 0.14): 0.08067362420004327, (0.14, 0.13): 0.08986927880788105, (0.13, 0.14): 0.08986927880788105, (0.14, 
0.12): 0.0973736696813473, (0.12, 0.14): 0.0973736696813473, (0.14, 0.11): 0.10548059493310644, (0.11, 0.14): 0.10548059493310644, (0.14, 0.1): 0.11488544326464124, (0.1, 0.14): 0.11488544326464124, (0.14, 0.09): 0.12460021175378756, (0.09, 0.14): 0.12460021175378756, (0.14, 0.08): 0.13707157615578078, (0.08, 0.14): 0.13707157615578078, (0.14, 0.07): 0.14971072334871147, (0.07, 0.14): 0.14971072334871147, (0.14, 0.06): 0.16347954539532392, (0.06, 0.14): 0.16347954539532392, (0.14, 0.05): 0.17389091053027578, (0.05, 0.14): 0.17389091053027578, (0.14, 0.04): 0.1909341853770778, (0.04, 0.14): 0.1909341853770778, (0.14, 0.03): 0.21133768015027632, (0.03, 0.14): 0.21133768015027632, (0.14, 0.02): 0.2303850987811759, (0.02, 0.14): 0.2303850987811759, (0.14, 0.01): 0.2502883907521948, (0.01, 0.14): 0.2502883907521948, (0.13, 0.13): 0.09646620714484137, (0.13, 0.12): 0.10722165143577965, (0.12, 0.13): 0.10722165143577965, (0.13, 0.11): 0.11736487779693013, (0.11, 0.13): 0.11736487779693013, (0.13, 0.1): 0.12666028913447397, (0.1, 0.13): 0.12666028913447397, (0.13, 0.09): 0.13626169289563694, (0.09, 0.13): 0.13626169289563694, (0.13, 0.08): 0.1510146738499373, (0.08, 0.13): 0.1510146738499373, (0.13, 0.07): 0.16235455241273927, (0.07, 0.13): 0.16235455241273927, (0.13, 0.06): 0.17589252419922008, (0.06, 0.13): 0.17589252419922008, (0.13, 0.05): 0.19457636332259845, (0.05, 0.13): 0.19457636332259845, (0.13, 0.04): 0.21095155752823802, (0.04, 0.13): 0.21095155752823802, (0.13, 0.03): 0.2316654740693399, (0.03, 0.13): 0.2316654740693399, (0.13, 0.02): 0.25426430831438124, (0.02, 0.13): 0.25426430831438124, (0.13, 0.01): 0.27381362738713794, (0.01, 0.13): 0.27381362738713794, (0.12, 0.12): 0.11321350938312225, (0.12, 0.11): 0.12677126578827763, (0.11, 0.12): 0.12677126578827763, (0.12, 0.1): 0.13756815829666572, (0.1, 0.12): 0.13756815829666572, (0.12, 0.09): 0.14936769814389356, (0.09, 0.12): 0.14936769814389356, (0.12, 0.08): 0.16101062156100887, (0.08, 0.12): 
0.16101062156100887, (0.12, 0.07): 0.17922548457795331, (0.07, 0.12): 0.17922548457795331, (0.12, 0.06): 0.1973361631122501, (0.06, 0.12): 0.1973361631122501, (0.12, 0.05): 0.21445978169271368, (0.05, 0.12): 0.21445978169271368, (0.12, 0.04): 0.2314798956015971, (0.04, 0.12): 0.2314798956015971, (0.12, 0.03): 0.2558103659364762, (0.03, 0.12): 0.2558103659364762, (0.12, 0.02): 0.2750917254480078, (0.02, 0.12): 0.2750917254480078, (0.12, 0.01): 0.29869770016359765, (0.01, 0.12): 0.29869770016359765, (0.11, 0.11): 0.13473320347698553, (0.11, 0.1): 0.15170311024719316, (0.1, 0.11): 0.15170311024719316, (0.11, 0.09): 0.16281589668640437, (0.09, 0.11): 0.16281589668640437, (0.11, 0.08): 0.17727554363122605, (0.08, 0.11): 0.17727554363122605, (0.11, 0.07): 0.1962876618096546, (0.07, 0.11): 0.1962876618096546, (0.11, 0.06): 0.21513050481187435, (0.06, 0.11): 0.21513050481187435, (0.11, 0.05): 0.23147658571996169, (0.05, 0.11): 0.23147658571996169, (0.11, 0.04): 0.2550019326089846, (0.04, 0.11): 0.2550019326089846, (0.11, 0.03): 0.2770161161823053, (0.03, 0.11): 0.2770161161823053, (0.11, 0.02): 0.3014710692908317, (0.02, 0.11): 0.3014710692908317, (0.11, 0.01): 0.32954904143612673, (0.01, 0.11): 0.32954904143612673, (0.1, 0.1): 0.16493794780015514, (0.1, 0.09): 0.1793464485496695, (0.09, 0.1): 0.1793464485496695, (0.1, 0.08): 0.19412938710322855, (0.08, 0.1): 0.19412938710322855, (0.1, 0.07): 0.21220384752996435, (0.07, 0.1): 0.21220384752996435, (0.1, 0.06): 0.23013944735758568, (0.06, 0.1): 0.23013944735758568, (0.1, 0.05): 0.2539840547315898, (0.05, 0.1): 0.2539840547315898, (0.1, 0.04): 0.2793076214084547, (0.04, 0.1): 0.2793076214084547, (0.1, 0.03): 0.30398706286435234, (0.03, 0.1): 0.30398706286435234, (0.1, 0.02): 0.3285679984605592, (0.02, 0.1): 0.3285679984605592, (0.1, 0.01): 0.3645472951717581, (0.01, 0.1): 0.3645472951717581, (0.09, 0.09): 0.19490441408150871, (0.09, 0.08): 0.21343273999845766, (0.08, 0.09): 0.21343273999845766, (0.09, 0.07): 
0.23345613516198677, (0.07, 0.09): 0.23345613516198677, (0.09, 0.06): 0.2555949243833948, (0.06, 0.09): 0.2555949243833948, (0.09, 0.05): 0.2757708657549607, (0.05, 0.09): 0.2757708657549607, (0.09, 0.04): 0.30382753987921485, (0.04, 0.09): 0.30382753987921485, (0.09, 0.03): 0.3302083174800815, (0.03, 0.09): 0.3302083174800815, (0.09, 0.02): 0.36164790419017057, (0.02, 0.09): 0.36164790419017057, (0.09, 0.01): 0.39417639121331705, (0.01, 0.09): 0.39417639121331705, (0.08, 0.08): 0.23358117888006064, (0.08, 0.07): 0.25282322513846106, (0.07, 0.08): 0.25282322513846106, (0.08, 0.06): 0.2779486423047934, (0.06, 0.08): 0.2779486423047934, (0.08, 0.05): 0.3016589472142664, (0.05, 0.08): 0.3016589472142664, (0.08, 0.04): 0.33674868852327805, (0.04, 0.08): 0.33674868852327805, (0.08, 0.03): 0.3641643340415989, (0.03, 0.08): 0.3641643340415989, (0.08, 0.02): 0.3962294793776759, (0.02, 0.08): 0.3962294793776759, (0.08, 0.01): 0.43606624210629497, (0.01, 0.08): 0.43606624210629497, (0.07, 0.07): 0.27548885734726525, (0.07, 0.06): 0.3043683998532819, (0.06, 0.07): 0.3043683998532819, (0.07, 0.05): 0.3337688786325707, (0.05, 0.07): 0.3337688786325707, (0.07, 0.04): 0.36502415821023476, (0.04, 0.07): 0.36502415821023476, (0.07, 0.03): 0.39685094663181153, (0.03, 0.07): 0.39685094663181153, (0.07, 0.02): 0.4357609048359551, (0.02, 0.07): 0.4357609048359551, (0.07, 0.01): 0.4752146432759711, (0.01, 0.07): 0.4752146432759711, (0.06, 0.06): 0.3342343349041469, (0.06, 0.05): 0.3596489860051733, (0.05, 0.06): 0.3596489860051733, (0.06, 0.04): 0.3947204630274997, (0.04, 0.06): 0.3947204630274997, (0.06, 0.03): 0.43593062617437367, (0.03, 0.06): 0.43593062617437367, (0.06, 0.02): 0.4804397742391064, (0.02, 0.06): 0.4804397742391064, (0.06, 0.01): 0.5216464411322426, (0.01, 0.06): 0.5216464411322426, (0.05, 0.05): 0.39934603018543136, (0.05, 0.04): 0.43760103968998043, (0.04, 0.05): 0.43760103968998043, (0.05, 0.03): 0.47983992638266465, (0.03, 0.05): 0.47983992638266465, (0.05, 0.02): 
0.5258811968977338, (0.02, 0.05): 0.5258811968977338, (0.05, 0.01): 0.5745789217524078, (0.01, 0.05): 0.5745789217524078, (0.04, 0.04): 0.47796625147241906, (0.04, 0.03): 0.5250382481070871, (0.03, 0.04): 0.5250382481070871, (0.04, 0.02): 0.5733047513418205, (0.02, 0.04): 0.5733047513418205, (0.04, 0.01): 0.6254068590820455, (0.01, 0.04): 0.6254068590820455, (0.03, 0.03): 0.5718208332148733, (0.03, 0.02): 0.6296617637418764, (0.02, 0.03): 0.6296617637418764, (0.03, 0.01): 0.6907686089950577, (0.01, 0.03): 0.6907686089950577, (0.02, 0.02): 0.6925898894627371, (0.02, 0.01): 0.7577911517091893, (0.01, 0.02): 0.7577911517091893, (0.01, 0.01): 0.8285966303522335} 225 elapsed time imported empirical error probabilities of minimizers shared: 0.010323047637939453 started clustring Using total nucleotide batch sizes: [51843, 51820, 51492, 51502, 51414, 51416, 51228, 47558] Nr reads in batches: [63, 63, 63, 63, 63, 63, 63, 59] Environment already set: <multiprocessing.context.SpawnContext object at 0x7f8491520400>

ITERATION 1 Using 8 batches. Saved: 0 iterations. Iteration NrClusters MinDbSize CurrReadId ClusterSizes 0 0 0 ccb68eb8-d0ee-4996-bf85-a5cbf62c1777_runid=ed1de13037eb1bcfeefc20d46af38743bdac68e6_read=12592_ch=73_start_time=2020-02-10T17:31:38Z_flow_cell_id=ACE547_protocol_group_id=barcode_test_sample_id=barcodes_fish_H+(1,8),H-(3,9)_h1
Saved: 0 iterations. Iteration NrClusters MinDbSize CurrReadId ClusterSizes 0 0 0 a68005ac-1e7d-45a7-b7fc-8c5855a30fc0_runid=ed1de13037eb1bcfeefc20d46af38743bdac68e6_read=22543_ch=95_start_time=2020-02-10T18:44:12Z_flow_cell_id=ACE547_protocol_group_id=barcode_test_sample_id=barcodes_fish_h+(0),h-(4)_w1
Saved: 0 iterations. Iteration NrClusters MinDbSize CurrReadId ClusterSizes 0 0 0 cb21d161-8645-4fc9-92ef-293b3c72fb7b_runid=ed1de13037eb1bcfeefc20d46af38743bdac68e6_read=17757_ch=45_start_time=2020-02-10T18:06:02Z_flow_cell_id=ACE547_protocol_group_id=barcode_test_sample_id=barcodes_fish_h+(4),h-(4)_c1
Saved: 0 iterations. Iteration NrClusters MinDbSize CurrReadId ClusterSizes 0 0 0 62551b5a-add1-4144-ba26-7499c153677e_runid=ed1de13037eb1bcfeefc20d46af38743bdac68e6_read=14339_ch=83_start_time=2020-02-10T17:43:40Z_flow_cell_id=ACE547_protocol_group_id=barcode_test_sample_id=barcodes_fish_H-(3,10),H+(4,10)_c1
Saved: 0 iterations. Iteration NrClusters MinDbSize CurrReadId ClusterSizes 0 0 0 1371f3e5-b393-4d85-b09f-6318e7177d62_runid=ed1de13037eb1bcfeefc20d46af38743bdac68e6_read=5634_ch=84_start_time=2020-02-10T17:46:52Z_flow_cell_id=ACE547_protocol_group_id=barcode_test_sample_id=barcodes_fish_H+(5,9),H-(6,10)_c1
Saved: 0 iterations. Iteration NrClusters MinDbSize CurrReadId ClusterSizes 0 0 0 2d1f702c-87ca-4cab-b239-deb98e978abf_runid=ed1de13037eb1bcfeefc20d46af38743bdac68e6_read=1934_ch=10_start_time=2020-02-10T15:47:21Z_flow_cell_id=ACE547_protocol_group_id=barcode_test_sample_id=barcodes_fish_H+(3,10),h-(1,-1)_c1 Saved: 0 iterations. Iteration NrClusters MinDbSize CurrReadId ClusterSizes 0 0 0 e7da391c-2c34-4c6e-9aed-8e322aa34d86_runid=ed1de13037eb1bcfeefc20d46af38743bdac68e6_read=28224_ch=29_start_time=2020-02-10T19:48:18Z_flow_cell_id=ACE547_protocol_group_id=barcode_test_sample_id=barcodes_fish_h-(4),h+(2)_h1
Saved: 0 iterations. Iteration NrClusters MinDbSize CurrReadId ClusterSizes 0 0 0 1e619673-436e-4a0d-b74e-e06c346a13a4_runid=ed1de13037eb1bcfeefc20d46af38743bdac68e6_read=41760_ch=41_start_time=2020-02-10T22:00:26Z_flow_cell_id=ACE547_protocol_group_id=barcode_test_sample_id=barcodes_fish_h-(5),h+(2)_h1
Total number of reads iterated through:63 Passed mapping criteria:43 Passed alignment criteria in this process:16 Total calls to alignment mudule in this process:16 Total number of reads iterated through:63 Passed mapping criteria:17 Passed alignment criteria in this process:43 Total calls to alignment mudule in this process:43 Total number of reads iterated through:63 Passed mapping criteria:34 Passed alignment criteria in this process:23 Total calls to alignment mudule in this process:23 Total number of reads iterated through:63 Passed mapping criteria:26 Passed alignment criteria in this process:33 Total calls to alignment mudule in this process:33 Total number of reads iterated through:59 Passed mapping criteria:21 Passed alignment criteria in this process:32 Total calls to alignment mudule in this process:32 Total number of reads iterated through:63 Passed mapping criteria:28 Passed alignment criteria in this process:28 Total calls to alignment mudule in this process:28 Total number of reads iterated through:63 Passed mapping criteria:32 Passed alignment criteria in this process:26 Total calls to alignment mudule in this process:26 Total number of reads iterated through:63 Passed mapping criteria:21 Passed alignment criteria in this process:39 Total calls to alignment mudule in this process:39 Time elapesd multiprocessing: 0.6918666362762451 New batch Batch index 1 New batch Batch index 2 New batch Batch index 3 New batch Batch index 4 New batch Batch index 5 New batch Batch index 6 New batch Batch index 7 New batch Batch index 8 number of representatives left to cluster: 38 Time elapesd joining clusters: 0.00018024444580078125 creating Supplementary_File2_reads/1 Nr clusters larger than 1: 32 Nr clusters (all): 38 Batches after pairwise consecutive merge: 4 Using total nucleotide batch sizes: [5746, 5691, 8884, 10516] Using nr reads batch sizes: [7, 7, 11, 13]

ITERATION 2 Using 4 batches. Saved: 3 iterations. Iteration NrClusters MinDbSize CurrReadId ClusterSizes Total number of reads iterated through:7 Passed mapping criteria:1 Passed alignment criteria in this process:3 Total calls to alignment mudule in this process:3 Saved: 4 iterations. Iteration NrClusters MinDbSize CurrReadId ClusterSizes Total number of reads iterated through:7 Passed mapping criteria:1 Passed alignment criteria in this process:1 Total calls to alignment mudule in this process:1 Saved: 6 iterations. Iteration NrClusters MinDbSize CurrReadId ClusterSizes Saved: 7 iterations. Iteration NrClusters MinDbSize CurrReadId ClusterSizes Total number of reads iterated through:11 Passed mapping criteria:3 Passed alignment criteria in this process:2 Total calls to alignment mudule in this process:2 Total number of reads iterated through:13 Passed mapping criteria:2 Passed alignment criteria in this process:3 Total calls to alignment mudule in this process:3 Time elapesd multiprocessing: 0.32369494438171387 New batch Batch index 1 New batch Batch index 2 New batch Batch index 3 New batch Batch index 4 number of representatives left to cluster: 22 Time elapesd joining clusters: 0.0001747608184814453 creating Supplementary_File2_reads/2 Nr clusters larger than 1: 18 Nr clusters (all): 22 Batches after pairwise consecutive merge: 2 Using total nucleotide batch sizes: [6519, 11353] Using nr reads batch sizes: [8, 14]

ITERATION 3 Using 2 batches. Saved: 3 iterations. Iteration NrClusters MinDbSize CurrReadId ClusterSizes Saved: 6 iterations. Iteration NrClusters MinDbSize CurrReadId ClusterSizes Total number of reads iterated through:14 Passed mapping criteria:5 Passed alignment criteria in this process:2 Total calls to alignment mudule in this process:2 Total number of reads iterated through:8 Passed mapping criteria:1 Passed alignment criteria in this process:4 Total calls to alignment mudule in this process:4 Time elapesd multiprocessing: 0.278078556060791 New batch Batch index 1 New batch Batch index 2 number of representatives left to cluster: 10 Time elapesd joining clusters: 0.0001876354217529297 creating Supplementary_File2_reads/3 Nr clusters larger than 1: 7 Nr clusters (all): 10 Batches after pairwise consecutive merge: 1 Using total nucleotide batch sizes: [8117] Using nr reads batch sizes: [10]

ITERATION 4 Using 1 batches. Saved: 3 iterations. Iteration NrClusters MinDbSize CurrReadId ClusterSizes Total number of reads iterated through:10 Passed mapping criteria:2 Passed alignment criteria in this process:2 Total calls to alignment mudule in this process:2 Time elapesd clustering last iteration single core: 0.006882190704345703 Time elapsed clustering: 1.3157920837402344 Nr clusters larger than 1: 3 Nr clusters (all): 6

STARTING TO CREATE CLUSTER CONSENSUS

Temporary workdirektory for consensus and polishing: /tmp/tmpmkltdcqw creating center of 247 sequences. creating center of 230 sequences. Detecting and removing primers COIF-ALTfw ACAAATCAYAARGAYATYGG COIF-ALT_ COIR-ALT_fw TTCAGGRTGNCCRAARAAYCA COIR-ALT {'COIF-ALTfw': 'ACAAATCAYAARGAYATYGG', 'COIR-ALT_fw': 'TTCAGGRTGNCCRAARAAYCA', 'COIF-ALT__rc': 'CCRATRTCYTTRTGATTTGT', 'COIR-ALT_rc': 'TGRTTYTTYGGNCAYCCTGAA'} [] -1 [] -1 [] -1 [] -1 [] -1 [] -1 [] -1 [] -1 CCGGTGTACTTCGTTCAGTTACGTATTGCTACTTCATAGATTGGACCAATCAGTGTTTCTGTTGGTGCTGATATTGCTTCTCAACCAACCACAAGGATATCGGCACCCTTTATCTCGTATTTGGTGCCTGAGCCGGCATAGTCGGAACAGCCTAAGCCTGCTCATTCGAGCAGAGCTAAGTCAACCTGGTGCACTTCTTGGTGATGATCAAATTTATAATGTAATCGTTACAGCGCACGCTTTCGTAATAATTTTCTTTATAGTAATACCACTAATAATTGGAGGCTTCGGGAACTGACTCATTCCTCTAATGATCGGTGCCCAGATATAGCTTTCCCTCGAATAAATAACATAAGTTTCTGACTTCTTCCTCCATCATCTTTCCTGCTCCTTTTAGCATCATCTGGTGTAGAAGCTGGGGCTGGACAGGTTGAACTGTCTATCCCCTTTAGCTGGAAACCTCGCTCATGCTGGGGCATCTGTTGACCTCACTATTTTTCTCTTCATCTGGCCGGAATTTCATCAATTCTTGGGCAATTAATTTTATTACCACAATTATTAATATAAAACCTCCAGCAATTTCACAATATCAAACTCCCCTCTTTGTTTGAGCAGTCCTAATTACAGCTGTGCTTCTGCTATTATCTCTCCCCGTCTTAGCAGCTGGTATCACAATACTTTTAACTGATCGTAATCTTAATACTTCTTTCTTTGATCCTGCTGGAGGAGGTGACCCCATTTTATATCAACATTTATTCTGATTCTTCGGCCGCAGAAGTCTAGAAGATAGAGCGACAGGCAAGTAGTTCCAGTAGTGCGACACCTAACTCCGCAAGCAACGTACGTAACTT 
CCGGTGTACTTCGTTCAGTTACGTATTGCTACTTCATAGATTGGACCAATCAGTGTTTCTGTTGGTGCTGATATTGCTTCTCAACCAACCACAAGGATATCGGCACCCTTTATCTCGTATTTGGTGCCTGAGCCGGCATAGTCGGAACAGCCTAAGCCTGCTCATTCGAGCAGAGCTAAGTCAACCTGGTGCACTTCTTGGTGATGATCAAATTTATAATGTAATCGTTACAGCGCACGCTTTCGTAATAATTTTCTTTATAGTAATACCACTAATAATTGGAGGCTTCGGGAACTGACTCATTCCTCTAATGATCGGTGCCCAGATATAGCTTTCCCTCGAATAAATAACATAAGTTTCTGACTTCTTCCTCCATCATCTTTCCTGCTCCTTTTAGCATCATCTGGTGTAGAAGCTGGGGCTGGACAGGTTGAACTGTCTATCCCCTTTAGCTGGAAACCTCGCTCATGCTGGGGCATCTGTTGACCTCACTATTTTTCTCTTCATCTGGCCGGAATTTCATCAATTCTTGGGCAATTAATTTTATTACCACAATTATTAATATAAAACCTCCAGCAATTTCACAATATCAAACTCCCCTCTTTGTTTGAGCAGTCCTAATTACAGCTGTGCTTCTGCTATTATCTCTCCCCGTCTTAGCAGCTGGTATCACAATACTTTTAACTGATCGTAATCTTAATACTTCTTTCTTTGATCCTGCTGGAGGAGGTGACCCCATTTTATATCAACATTTATTCTGATTCTTCGGCCGCAGAAGTCTAGAAGATAGAGCGACAGGCAAGTAGTTCCAGTAGTGCGACACCTAACTCCGCAAGCAACGTACGTAACTT NEW cut start 0 cut end 851 [] -1 [] -1 [] -1 [] -1 [] -1 [] -1 [] -1 [] -1 CAGCGGTGTACTTCGTTCAGTTACGTATTGCTTCTGGCGGTGTCACTGCTGGACCTACTTGCCTGTCGCTCTATCTTCTAGACTTCTGGGTGGCCGAAAAATCAGAATAAATGTTGATATAAAATGGGTCACCTCCTCCAGCAGGATCAAAGAAAAAAGTATTAAGATTACGATCAGTTAAAAGTATTGTGATACCAGCTGCTAAGACTGGAAGAGATAATAATAGAAGCACAGCTGTAGTAGGACTGCTCAAACAAAAGGGGTGTTTGATATTGTGAAATTGCTGGAGGTTTTATATTAATAATTGTGGTAATAAAATTAATTGCCCAAGAATTGATGAAATTCCTGCTAGATGAAGAGAAAAAATAGTGAGGTCAACAGATGCCCCAGCATGAGCGAGGTTTCCAGCTAAAGGGGATAAACAGTTCAGCCTGTCCAGCAGCTTCTACACCAGATGATGCTAAAAGGAGCAGGAAAGATGGAGGAAGAAGTCAGAAGCTTATGTTATTTATTCGAGGAAAGCTATATCTGGGCACCGATCATTAGAGGAATGAGTCAGTTCCCAAAGCCTCAATTATTAGTGGTATTACTATAAAAAAAATTATTACGCGCGCGTGCTGTAACGATCACATTATAAATTTGATCATCACCAAGAAGTGCACCAGGTTGACTTAGCTCTGCTCGAATGAGCAGGCTTAGAGCTGTTCCGACTATGCGGCTCAGGCACCAAATACGAGATAAAGGGTGCCAATATCTTTGTGGTTGGTTGAGAAGCAATATCAGCACCAACAGAAAAACTGTTCAATCTAATGGATTAGCAATACGTAACGTTCAGC 
CAGCGGTGTACTTCGTTCAGTTACGTATTGCTTCTGGCGGTGTCACTGCTGGACCTACTTGCCTGTCGCTCTATCTTCTAGACTTCTGGGTGGCCGAAAAATCAGAATAAATGTTGATATAAAATGGGTCACCTCCTCCAGCAGGATCAAAGAAAAAAGTATTAAGATTACGATCAGTTAAAAGTATTGTGATACCAGCTGCTAAGACTGGAAGAGATAATAATAGAAGCACAGCTGTAGTAGGACTGCTCAAACAAAAGGGGTGTTTGATATTGTGAAATTGCTGGAGGTTTTATATTAATAATTGTGGTAATAAAATTAATTGCCCAAGAATTGATGAAATTCCTGCTAGATGAAGAGAAAAAATAGTGAGGTCAACAGATGCCCCAGCATGAGCGAGGTTTCCAGCTAAAGGGGATAAACAGTTCAGCCTGTCCAGCAGCTTCTACACCAGATGATGCTAAAAGGAGCAGGAAAGATGGAGGAAGAAGTCAGAAGCTTATGTTATTTATTCGAGGAAAGCTATATCTGGGCACCGATCATTAGAGGAATGAGTCAGTTCCCAAAGCCTCAATTATTAGTGGTATTACTATAAAAAAAATTATTACGCGCGCGTGCTGTAACGATCACATTATAAATTTGATCATCACCAAGAAGTGCACCAGGTTGACTTAGCTCTGCTCGAATGAGCAGGCTTAGAGCTGTTCCGACTATGCGGCTCAGGCACCAAATACGAGATAAAGGGTGCCAATATCTTTGTGGTTGGTTGAGAAGCAATATCAGCACCAACAGAAAAACTGTTCAATCTAATGGATTAGCAATACGTAACGTTCAGC NEW cut start 0 cut end 836 2 centers formed 0.8902857142857142 2 consensus formed. Saving spoa references to files: Supplementary_File2_reads/consensus_reference_X.fasta running medaka on spoa reference 12. creating Supplementary_File2_reads/medaka_cl_id_12 medaka_consensus -i Supplementary_File2_reads/reads_to_consensus_12.fastq -d Supplementary_File2_reads/consensus_reference_12.fasta -o Supplementary_File2_reads/medaka_cl_id_12 -t 1 Saving medaka reference to file: Supplementary_File2_reads/medaka_cl_id_12/consensus.fasta running medaka on spoa reference 46. 
creating Supplementary_File2_reads/medaka_cl_id_46 medaka_consensus -i Supplementary_File2_reads/reads_to_consensus_46.fastq -d Supplementary_File2_reads/consensus_reference_46.fasta -o Supplementary_File2_reads/medaka_cl_id_46 -t 1 Saving medaka reference to file: Supplementary_File2_reads/medaka_cl_id_46/consensus.fasta [] -1 [] -1 [] -1 [] -1 [] -1 [] -1 [] -1 [] -1 CCGGTGTACTTCGTTCAGTTACGTATTGCTAATTCATAGATTGGACCTATCGAGTGTTTCTGTTGGTGCTGATATTGCTTCTCAACCAACCACAAGGATATCGGCACCCTTTATCTCGTATTTGGTGCCTGAGCCGGCATAGTCGGAACAGCCTAAGCCTGCTCATTCGAGCAGAGCTAAGTCAACCTGGTGCACTTCTTGGTGATGATCAAATTTATAATGTGATCGTTACAGCGCACGCTTTCGTAATAATTTTCTTTATAGTAATACCACTAATAATTGGAGGCTTCGGGAACTGACTCATTCCTCTAATGATCGGTGCCCCAGATATAGCTTTCCCTCGAATAAATAACATAAGCTTCTGACTTCTTCCTCCCATCTTTCCTGCTCCTTTTAGCATCATCTGGTGTAGAAGCTGGGGCTGGACAGGTTGAACTGTCTATCCCCTTTAGCTGGAAACCTCGCTCATGCTGGGGCATCTGTTGACCTCACTATTTTTCTCTTCATCTGGCCGGAATTTCATCAATTCTTGGGGCAATTAATTTTATTACCACAATTATTAATATAAAACCTCCAGCAATTTCACAATATCAAACTCCCCTCTTTGTTTGAGCAGTCCTAATTACAGCTGTGCTTCTACTATTATCTCTCCCCGTCTTAGCAGCTGGTATCACAATACTTTTAACTGATCGTAATCTTAATACTTCTTTCTTTGATCCTGCTGGAGGAGGTGACCCCATTTTATATCAACATTTATTCTGATTCTTCGGCCGCAGAAGTCTAGAAGATAGAGCGACAGGCAAGTAGTTCCAGTAGTGCGACATGCTAACTCCGAAGCAATACGTAACTT 
CCGGTGTACTTCGTTCAGTTACGTATTGCTAATTCATAGATTGGACCTATCGAGTGTTTCTGTTGGTGCTGATATTGCTTCTCAACCAACCACAAGGATATCGGCACCCTTTATCTCGTATTTGGTGCCTGAGCCGGCATAGTCGGAACAGCCTAAGCCTGCTCATTCGAGCAGAGCTAAGTCAACCTGGTGCACTTCTTGGTGATGATCAAATTTATAATGTGATCGTTACAGCGCACGCTTTCGTAATAATTTTCTTTATAGTAATACCACTAATAATTGGAGGCTTCGGGAACTGACTCATTCCTCTAATGATCGGTGCCCCAGATATAGCTTTCCCTCGAATAAATAACATAAGCTTCTGACTTCTTCCTCCCATCTTTCCTGCTCCTTTTAGCATCATCTGGTGTAGAAGCTGGGGCTGGACAGGTTGAACTGTCTATCCCCTTTAGCTGGAAACCTCGCTCATGCTGGGGCATCTGTTGACCTCACTATTTTTCTCTTCATCTGGCCGGAATTTCATCAATTCTTGGGGCAATTAATTTTATTACCACAATTATTAATATAAAACCTCCAGCAATTTCACAATATCAAACTCCCCTCTTTGTTTGAGCAGTCCTAATTACAGCTGTGCTTCTACTATTATCTCTCCCCGTCTTAGCAGCTGGTATCACAATACTTTTAACTGATCGTAATCTTAATACTTCTTTCTTTGATCCTGCTGGAGGAGGTGACCCCATTTTATATCAACATTTATTCTGATTCTTCGGCCGCAGAAGTCTAGAAGATAGAGCGACAGGCAAGTAGTTCCAGTAGTGCGACATGCTAACTCCGAAGCAATACGTAACTT NEW cut start 0 cut end 850 [] -1 [] -1 [] -1 [] -1 [] -1 [] -1 [] -1 [] -1 CAGCGGTGTACTTCGTTCAGTTACGTATTGCTTCTAGCGGTGTCACTGCTCGGACCTACTTGCCTGTCGCTCTATCTTCTAGACTTCTGGGTGGCCGAAGAATCAGAATAAATGTTGATATAAAATGGGGTCACCTCCTCCAGCAGGATCAAAGAAAAAAGTATTAAGATTACGATCAGTTAAAAGTATTGTGATACCGGCTGCTAAGACTGGAAGAGATAATAGTAGAAGCACAGCTGTAAGTAGGACTGCTCAAACAAAAGGGGTGTTTGATATTGTGAAATTGCTGGAGGTTTTATATTAATAATTGTGGTAATAAAATTAATTGCCCAAGAATTGATGAAATTCCTGCTAGATGAAGAGAAAAAATAGTGAGGTCAACAGATGCCCCAGCATGAGCGAGGTTTCCAGCTAAAGGGGATAGACAGTTCAGCCTGTCCAGCTCCAGCTTCTACACCAGATGATGCTAAAAGGAGCAGGAAAGATGGAGGAAGAAGTCAGAAGCTTATGTTATTTATTCGAGGAAAGCTATATCTGGTGCACCGATCATTAGAGGAATGAGTCAGTTCCCAAAGCCTCAATTATTAGTGGTATTACTATAAAAAAAATTATTACGAGCGTGTGCTGTAACGATCACATTATAAATTTGATCATCACCAAGAAGTGCACCAGGTTGACTTAGCTCTGCTCGAATGAGCAGGCTTAGAGCTGTTCCGACTATGCCGGCTCAGGCACCAAATACGAGATAAAGGGTGCCAATATCCTTGTGGTTGGTTGAGAAGCAATATCAGCACCAACAGAAATCACTGACAGGTTCAATCCTAATGGATTAGCAATACGTAACGTTCAGC 
CAGCGGTGTACTTCGTTCAGTTACGTATTGCTTCTAGCGGTGTCACTGCTCGGACCTACTTGCCTGTCGCTCTATCTTCTAGACTTCTGGGTGGCCGAAGAATCAGAATAAATGTTGATATAAAATGGGGTCACCTCCTCCAGCAGGATCAAAGAAAAAAGTATTAAGATTACGATCAGTTAAAAGTATTGTGATACCGGCTGCTAAGACTGGAAGAGATAATAGTAGAAGCACAGCTGTAAGTAGGACTGCTCAAACAAAAGGGGTGTTTGATATTGTGAAATTGCTGGAGGTTTTATATTAATAATTGTGGTAATAAAATTAATTGCCCAAGAATTGATGAAATTCCTGCTAGATGAAGAGAAAAAATAGTGAGGTCAACAGATGCCCCAGCATGAGCGAGGTTTCCAGCTAAAGGGGATAGACAGTTCAGCCTGTCCAGCTCCAGCTTCTACACCAGATGATGCTAAAAGGAGCAGGAAAGATGGAGGAAGAAGTCAGAAGCTTATGTTATTTATTCGAGGAAAGCTATATCTGGTGCACCGATCATTAGAGGAATGAGTCAGTTCCCAAAGCCTCAATTATTAGTGGTATTACTATAAAAAAAATTATTACGAGCGTGTGCTGTAACGATCACATTATAAATTTGATCATCACCAAGAAGTGCACCAGGTTGACTTAGCTCTGCTCGAATGAGCAGGCTTAGAGCTGTTCCGACTATGCCGGCTCAGGCACCAAATACGAGATAAAGGGTGCCAATATCCTTGTGGTTGGTTGAGAAGCAATATCAGCACCAACAGAAATCACTGACAGGTTCAATCCTAATGGATTAGCAATACGTAACGTTCAGC NEW cut start 0 cut end 851 0.908883826879271 Detected alignment identidy above threchold for reverse complement. Keeping center with the most read support and adding rc reads to supporting reads. has already been merged, skipping 1 consensus formed. Saving spoa references to files: Supplementary_File2_reads/consensus_reference_X.fasta running medaka on spoa reference 12. creating Supplementary_File2_reads/medaka_cl_id_12 medaka_consensus -i Supplementary_File2_reads/reads_to_consensus_12.fastq -d Supplementary_File2_reads/consensus_reference_12.fasta -o Supplementary_File2_reads/medaka_cl_id_12 -t 1 Saving medaka reference to file: Supplementary_File2_reads/medaka_cl_id_12/consensus.fasta removing temporary workdir started sorting seqs <multiprocessing.context.SpawnContext object at 0x7efc10286400> Environment set: <multiprocessing.context.SpawnContext object at 0x7efc10286400> Using 8 cores. [36, 36, 36, 36, 36, 36, 36, 28] 0 reads processed. 0 reads processed. 0 reads processed. 0 reads processed. 0 reads processed. 0 reads processed. 0 reads processed. 0 reads processed. 
Time elapesd multiprocessing: 0.561352014541626 Batch index 0 Batch index 1 Batch index 2 Batch index 3 Batch index 4 Batch index 5 Batch index 6 Batch index 7 274 reads passed quality critera (avg phred Q val over 7.0 and length > 2*k) and will be clustered. Sorted all reads in 0.568681001663208 seconds. elapsed time sorting: 0.5687541961669922 Number of reads with read length in interval [700,900]: 2 Started imported empirical error probabilities of minimizers shared: {(0.15, 0.15): 0.06900003127096205, (0.15, 0.14): 0.07436980320203763, (0.14, 0.15): 0.07436980320203763, (0.15, 0.13): 0.08011057845402268, (0.13, 0.15): 0.08011057845402268, (0.15, 0.12): 0.08992262772234068, (0.12, 0.15): 0.08992262772234068, (0.15, 0.11): 0.09562718186369078, (0.11, 0.15): 0.09562718186369078, (0.15, 0.1): 0.10501923949009649, (0.1, 0.15): 0.10501923949009649, (0.15, 0.09): 0.11603874299408275, (0.09, 0.15): 0.11603874299408275, (0.15, 0.08): 0.1230830045457836, (0.08, 0.15): 0.1230830045457836, (0.15, 0.07): 0.13547908906937794, (0.07, 0.15): 0.13547908906937794, (0.15, 0.06): 0.14787145435046953, (0.06, 0.15): 0.14787145435046953, (0.15, 0.05): 0.163060044027815, (0.05, 0.15): 0.163060044027815, (0.15, 0.04): 0.1760312862321838, (0.04, 0.15): 0.1760312862321838, (0.15, 0.03): 0.19394756744755604, (0.03, 0.15): 0.19394756744755604, (0.15, 0.02): 0.20835904420975648, (0.02, 0.15): 0.20835904420975648, (0.15, 0.01): 0.22725325724648648, (0.01, 0.15): 0.22725325724648648, (0.14, 0.14): 0.08067362420004327, (0.14, 0.13): 0.08986927880788105, (0.13, 0.14): 0.08986927880788105, (0.14, 0.12): 0.0973736696813473, (0.12, 0.14): 0.0973736696813473, (0.14, 0.11): 0.10548059493310644, (0.11, 0.14): 0.10548059493310644, (0.14, 0.1): 0.11488544326464124, (0.1, 0.14): 0.11488544326464124, (0.14, 0.09): 0.12460021175378756, (0.09, 0.14): 0.12460021175378756, (0.14, 0.08): 0.13707157615578078, (0.08, 0.14): 0.13707157615578078, (0.14, 0.07): 0.14971072334871147, (0.07, 0.14): 
0.14971072334871147, (0.14, 0.06): 0.16347954539532392, (0.06, 0.14): 0.16347954539532392, (0.14, 0.05): 0.17389091053027578, (0.05, 0.14): 0.17389091053027578, (0.14, 0.04): 0.1909341853770778, (0.04, 0.14): 0.1909341853770778, (0.14, 0.03): 0.21133768015027632, (0.03, 0.14): 0.21133768015027632, (0.14, 0.02): 0.2303850987811759, (0.02, 0.14): 0.2303850987811759, (0.14, 0.01): 0.2502883907521948, (0.01, 0.14): 0.2502883907521948, (0.13, 0.13): 0.09646620714484137, (0.13, 0.12): 0.10722165143577965, (0.12, 0.13): 0.10722165143577965, (0.13, 0.11): 0.11736487779693013, (0.11, 0.13): 0.11736487779693013, (0.13, 0.1): 0.12666028913447397, (0.1, 0.13): 0.12666028913447397, (0.13, 0.09): 0.13626169289563694, (0.09, 0.13): 0.13626169289563694, (0.13, 0.08): 0.1510146738499373, (0.08, 0.13): 0.1510146738499373, (0.13, 0.07): 0.16235455241273927, (0.07, 0.13): 0.16235455241273927, (0.13, 0.06): 0.17589252419922008, (0.06, 0.13): 0.17589252419922008, (0.13, 0.05): 0.19457636332259845, (0.05, 0.13): 0.19457636332259845, (0.13, 0.04): 0.21095155752823802, (0.04, 0.13): 0.21095155752823802, (0.13, 0.03): 0.2316654740693399, (0.03, 0.13): 0.2316654740693399, (0.13, 0.02): 0.25426430831438124, (0.02, 0.13): 0.25426430831438124, (0.13, 0.01): 0.27381362738713794, (0.01, 0.13): 0.27381362738713794, (0.12, 0.12): 0.11321350938312225, (0.12, 0.11): 0.12677126578827763, (0.11, 0.12): 0.12677126578827763, (0.12, 0.1): 0.13756815829666572, (0.1, 0.12): 0.13756815829666572, (0.12, 0.09): 0.14936769814389356, (0.09, 0.12): 0.14936769814389356, (0.12, 0.08): 0.16101062156100887, (0.08, 0.12): 0.16101062156100887, (0.12, 0.07): 0.17922548457795331, (0.07, 0.12): 0.17922548457795331, (0.12, 0.06): 0.1973361631122501, (0.06, 0.12): 0.1973361631122501, (0.12, 0.05): 0.21445978169271368, (0.05, 0.12): 0.21445978169271368, (0.12, 0.04): 0.2314798956015971, (0.04, 0.12): 0.2314798956015971, (0.12, 0.03): 0.2558103659364762, (0.03, 0.12): 0.2558103659364762, (0.12, 0.02): 0.2750917254480078, 
(0.02, 0.12): 0.2750917254480078, (0.12, 0.01): 0.29869770016359765, (0.01, 0.12): 0.29869770016359765, (0.11, 0.11): 0.13473320347698553, (0.11, 0.1): 0.15170311024719316, (0.1, 0.11): 0.15170311024719316, (0.11, 0.09): 0.16281589668640437, (0.09, 0.11): 0.16281589668640437, (0.11, 0.08): 0.17727554363122605, (0.08, 0.11): 0.17727554363122605, (0.11, 0.07): 0.1962876618096546, (0.07, 0.11): 0.1962876618096546, (0.11, 0.06): 0.21513050481187435, (0.06, 0.11): 0.21513050481187435, (0.11, 0.05): 0.23147658571996169, (0.05, 0.11): 0.23147658571996169, (0.11, 0.04): 0.2550019326089846, (0.04, 0.11): 0.2550019326089846, (0.11, 0.03): 0.2770161161823053, (0.03, 0.11): 0.2770161161823053, (0.11, 0.02): 0.3014710692908317, (0.02, 0.11): 0.3014710692908317, (0.11, 0.01): 0.32954904143612673, (0.01, 0.11): 0.32954904143612673, (0.1, 0.1): 0.16493794780015514, (0.1, 0.09): 0.1793464485496695, (0.09, 0.1): 0.1793464485496695, (0.1, 0.08): 0.19412938710322855, (0.08, 0.1): 0.19412938710322855, (0.1, 0.07): 0.21220384752996435, (0.07, 0.1): 0.21220384752996435, (0.1, 0.06): 0.23013944735758568, (0.06, 0.1): 0.23013944735758568, (0.1, 0.05): 0.2539840547315898, (0.05, 0.1): 0.2539840547315898, (0.1, 0.04): 0.2793076214084547, (0.04, 0.1): 0.2793076214084547, (0.1, 0.03): 0.30398706286435234, (0.03, 0.1): 0.30398706286435234, (0.1, 0.02): 0.3285679984605592, (0.02, 0.1): 0.3285679984605592, (0.1, 0.01): 0.3645472951717581, (0.01, 0.1): 0.3645472951717581, (0.09, 0.09): 0.19490441408150871, (0.09, 0.08): 0.21343273999845766, (0.08, 0.09): 0.21343273999845766, (0.09, 0.07): 0.23345613516198677, (0.07, 0.09): 0.23345613516198677, (0.09, 0.06): 0.2555949243833948, (0.06, 0.09): 0.2555949243833948, (0.09, 0.05): 0.2757708657549607, (0.05, 0.09): 0.2757708657549607, (0.09, 0.04): 0.30382753987921485, (0.04, 0.09): 0.30382753987921485, (0.09, 0.03): 0.3302083174800815, (0.03, 0.09): 0.3302083174800815, (0.09, 0.02): 0.36164790419017057, (0.02, 0.09): 0.36164790419017057, (0.09, 0.01): 
0.39417639121331705, (0.01, 0.09): 0.39417639121331705, (0.08, 0.08): 0.23358117888006064, (0.08, 0.07): 0.25282322513846106, (0.07, 0.08): 0.25282322513846106, (0.08, 0.06): 0.2779486423047934, (0.06, 0.08): 0.2779486423047934, (0.08, 0.05): 0.3016589472142664, (0.05, 0.08): 0.3016589472142664, (0.08, 0.04): 0.33674868852327805, (0.04, 0.08): 0.33674868852327805, (0.08, 0.03): 0.3641643340415989, (0.03, 0.08): 0.3641643340415989, (0.08, 0.02): 0.3962294793776759, (0.02, 0.08): 0.3962294793776759, (0.08, 0.01): 0.43606624210629497, (0.01, 0.08): 0.43606624210629497, (0.07, 0.07): 0.27548885734726525, (0.07, 0.06): 0.3043683998532819, (0.06, 0.07): 0.3043683998532819, (0.07, 0.05): 0.3337688786325707, (0.05, 0.07): 0.3337688786325707, (0.07, 0.04): 0.36502415821023476, (0.04, 0.07): 0.36502415821023476, (0.07, 0.03): 0.39685094663181153, (0.03, 0.07): 0.39685094663181153, (0.07, 0.02): 0.4357609048359551, (0.02, 0.07): 0.4357609048359551, (0.07, 0.01): 0.4752146432759711, (0.01, 0.07): 0.4752146432759711, (0.06, 0.06): 0.3342343349041469, (0.06, 0.05): 0.3596489860051733, (0.05, 0.06): 0.3596489860051733, (0.06, 0.04): 0.3947204630274997, (0.04, 0.06): 0.3947204630274997, (0.06, 0.03): 0.43593062617437367, (0.03, 0.06): 0.43593062617437367, (0.06, 0.02): 0.4804397742391064, (0.02, 0.06): 0.4804397742391064, (0.06, 0.01): 0.5216464411322426, (0.01, 0.06): 0.5216464411322426, (0.05, 0.05): 0.39934603018543136, (0.05, 0.04): 0.43760103968998043, (0.04, 0.05): 0.43760103968998043, (0.05, 0.03): 0.47983992638266465, (0.03, 0.05): 0.47983992638266465, (0.05, 0.02): 0.5258811968977338, (0.02, 0.05): 0.5258811968977338, (0.05, 0.01): 0.5745789217524078, (0.01, 0.05): 0.5745789217524078, (0.04, 0.04): 0.47796625147241906, (0.04, 0.03): 0.5250382481070871, (0.03, 0.04): 0.5250382481070871, (0.04, 0.02): 0.5733047513418205, (0.02, 0.04): 0.5733047513418205, (0.04, 0.01): 0.6254068590820455, (0.01, 0.04): 0.6254068590820455, (0.03, 0.03): 0.5718208332148733, (0.03, 0.02): 
0.6296617637418764, (0.02, 0.03): 0.6296617637418764, (0.03, 0.01): 0.6907686089950577, (0.01, 0.03): 0.6907686089950577, (0.02, 0.02): 0.6925898894627371, (0.02, 0.01): 0.7577911517091893, (0.01, 0.02): 0.7577911517091893, (0.01, 0.01): 0.8285966303522335} 225 elapsed time imported empirical error probabilities of minimizers shared: 0.008377552032470703 started clustring Using total nucleotide batch sizes: [725, 703, 0] Nr reads in batches: [1, 1, 0] Environment already set: <multiprocessing.context.SpawnContext object at 0x7efc10286400>

ITERATION 1 Using 8 batches. Saved: 0 iterations. Iteration NrClusters MinDbSize CurrReadId ClusterSizes 0 0 0 5f836543-6206-45bf-9f65-3b06e806aba3_runid=ed1de13037eb1bcfeefc20d46af38743bdac68e6_read=15_ch=94_start_time=2020-02-10T15:32:14Z_flow_cell_id=ACE547_protocol_group_id=barcode_test_sample_id=barcodes_fish_H+(0,10),H-(4,8)_h1
Total number of reads iterated through:1 Passed mapping criteria:0 Passed alignment criteria in this process:0 Total calls to alignment mudule in this process:0 Saved: 0 iterations. Iteration NrClusters MinDbSize CurrReadId ClusterSizes 0 0 0 bf219b30-e37d-443d-ae29-10c653aafa4e_runid=ed1de13037eb1bcfeefc20d46af38743bdac68e6_read=649_ch=14_start_time=2020-02-10T15:38:11Z_flow_cell_id=ACE547_protocol_group_id=barcode_test_sample_id=barcodes_fish_H+(3,13),H-(4,10)_h1
Total number of reads iterated through:1
Passed mapping criteria:0
Passed alignment criteria in this process:0
Total calls to alignment mudule in this process:0
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/rick/miniconda3/envs/ng/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, kwds))
  File "/home/rick/miniconda3/envs/ng/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
    return list(map(args))
  File "/mnt/d/Projects/Oak/mobile_genome_analyzer/NGSpeciesID/modules/parallelize.py", line 17, in reads_to_clusters_helper
    return cluster.reads_to_clusters(args, kwargs)
  File "/mnt/d/Projects/Oak/mobile_genome_analyzer/NGSpeciesID/modules/cluster.py", line 224, in reads_to_clusters
    lowest_batch_index = max(1, min(prev_b_indices))
ValueError: min() arg is an empty sequence
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/rick/miniconda3/envs/ng/bin/NGSpeciesID", line 7, in
    exec(compile(f.read(), file, 'exec'))
  File "/mnt/d/Projects/Oak/mobile_genome_analyzer/NGSpeciesID/NGSpeciesID", line 289, in
    main(args)
  File "/mnt/d/Projects/Oak/mobile_genome_analyzer/NGSpeciesID/NGSpeciesID", line 87, in main
    clusters, representatives = parallelize.parallel_clustering(read_array, p_emp_probs, args)
  File "/mnt/d/Projects/Oak/mobile_genome_analyzer/NGSpeciesID/modules/parallelize.py", line 156, in parallel_clustering
    cluster_results =res.get(999999999) # Without the timeout this blocking call ignores all signals.
  File "/home/rick/miniconda3/envs/ng/lib/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
ValueError: min() arg is an empty sequence

ksahlin commented 3 years ago

Hi,

Reading the output, it looks like the first run was successful and produced one consensus sequence, located in Supplementary_File2_reads/medaka_cl_id_12/consensus.fasta, as seen from these lines in the output you pasted:

1 consensus formed.
Saving spoa references to files: Supplementary_File2_reads/consensus_reference_X.fasta
running medaka on spoa reference 12.
creating Supplementary_File2_reads/medaka_cl_id_12
medaka_consensus -i Supplementary_File2_reads/reads_to_consensus_12.fastq -d Supplementary_File2_reads/consensus_reference_12.fasta -o Supplementary_File2_reads/medaka_cl_id_12 -t 1
Saving medaka reference to file: Supplementary_File2_reads/medaka_cl_id_12/consensus.fasta

However, more output follows after this. I'm unsure whether you started a new run, as the output seems truncated. At any rate, for this new run only 2 reads pass the length criterion of [700,900]; see the output below. If you are expecting a different amplicon size, you could try widening or shifting this interval.

274 reads passed quality critera (avg phred Q val over 7.0 and length > 2*k) and will be clustered.
Sorted all reads in 0.568681001663208 seconds.
elapsed time sorting: 0.5687541961669922
Number of reads with read length in interval [700,900]: 2
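
To check how many reads fall inside a candidate length interval before re-running, something like the following could help. It is only a minimal sketch, not part of NGSpeciesID: the function names `read_lengths` and `count_in_interval` are made up for illustration, and it assumes a plain FASTQ with exactly four lines per record.

```python
def read_lengths(fastq_lines):
    """Yield the sequence length of each record in a 4-line-per-record FASTQ."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # the second line of each record holds the bases
            yield len(line.strip())


def count_in_interval(lengths, low, high):
    """Count lengths that fall inside the closed interval [low, high]."""
    return sum(1 for n in lengths if low <= n <= high)


# Tiny made-up example: three reads of length 8, 4 and 12.
example = [
    "@read1", "ACGTACGT", "+", "IIIIIIII",
    "@read2", "ACGT", "+", "IIII",
    "@read3", "ACGTACGTACGT", "+", "IIIIIIIIIIII",
]
lengths = list(read_lengths(example))
print(lengths)
print(count_in_interval(lengths, 5, 10))
```

On a real file the same functions can be fed an open file handle instead of the example list, which makes it easy to try a few intervals and see which one captures most of the reads.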

I will incorporate a better error message when this happens in the next release of NGSpeciesID.
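
For illustration, the failure in the traceback comes down to calling `min()` on an empty list. A guard along the following lines would turn that into a more descriptive error. This is a hypothetical sketch, not the actual change made in NGSpeciesID: the helper name `lowest_batch_index` and the message text are assumptions, and only the expression `max(1, min(prev_b_indices))` is taken from the traceback.

```python
def lowest_batch_index(prev_b_indices):
    """Return max(1, min(prev_b_indices)), failing with a clear message if empty.

    Hypothetical helper: only the max/min expression comes from the traceback.
    """
    if not prev_b_indices:
        raise ValueError(
            "No batch indices available, possibly because too few reads "
            "passed the read length filter. Consider widening the length interval."
        )
    return max(1, min(prev_b_indices))


# The unguarded expression raises a bare ValueError on an empty list.
try:
    max(1, min([]))
except ValueError as err:
    print("unguarded:", err)

# The guarded helper raises a ValueError that names a likely cause.
try:
    lowest_batch_index([])
except ValueError as err:
    print("guarded:", err)
```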

rkimoakbioinformatics commented 3 years ago

@ksahlin Thanks for the explanation. I was running consensus.sh from the test folder of this repository (NGSpeciesID). I changed the original --s 100 in consensus.sh to --s 300, and it ran without an error.
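
As a rough illustration of why a larger --s helps, assuming the length filter keeps reads in the interval [m - s, m + s]: the reported interval [700,900] with --s 100 would then correspond to a target length of 800. That value of m is inferred from the log and not confirmed anywhere in this thread, and the function name `length_interval` is made up for this sketch.

```python
def length_interval(m, s):
    """Closed read length interval, assuming the filter is [m - s, m + s]."""
    return (m - s, m + s)


# m = 800 is inferred from the logged interval [700,900]; treat it as an assumption.
print(length_interval(800, 100))
print(length_interval(800, 300))
```

Under that assumption, --s 300 widens the accepted range by 200 bases on each side, which would let more of the 274 quality-filtered reads through to clustering.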