kage-genotyper / kage

Alignment-free genotyper for SNPs and short indels, implemented in Python.
GNU General Public License v3.0

kmer_mapper not working #5

Open teepean opened 1 year ago

teepean commented 1 year ago

I am trying to test kage 0.11.14, but for some reason kmer_mapper does not work. The command and error message:

kmer_mapper map -b index_2548all_uncompressed.npz -f 210435_S26_L001_R1_001.fq -o kmer_counts
Traceback (most recent call last):
  File "/home/useri/.local/bin/kmer_mapper", line 8, in <module>
    sys.exit(main())
  File "/home/useri/.local/lib/python3.10/site-packages/kmer_mapper/command_line_interface.py", line 29, in main
    run_argument_parser(sys.argv[1:])
  File "/home/useri/.local/lib/python3.10/site-packages/kmer_mapper/command_line_interface.py", line 169, in run_argument_parser
    args.func(args)
  File "/home/useri/.local/lib/python3.10/site-packages/kmer_mapper/command_line_interface.py", line 85, in map_bnp
    kmer_index = _get_kmer_index_from_args(args)
  File "/home/useri/.local/lib/python3.10/site-packages/kmer_mapper/util.py", line 51, in _get_kmer_index_from_args
    kmer_index = IndexBundle.from_file(args.index_bundle).indexes["KmerIndex"]
  File "/home/useri/.local/lib/python3.10/site-packages/kage/indexing/index_bundle.py", line 20, in __getitem__
    return self.index[e]
KeyError: 'KmerIndex'

ivargr commented 1 year ago

Hi!

Sorry, there seems to have been a mismatch in the code after I updated some indexes. I have pushed a fix, so this should be resolved in the latest version of kmer_mapper. Could you try updating kmer_mapper and see if it works? You will need version 0.0.31, so pip install kmer_mapper==0.0.31 should hopefully fix it.

Thanks for bringing this to my attention :)
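
To confirm which kmer_mapper version is actually installed before re-running, something like the following may help. It is only a sketch using the standard library's importlib.metadata; the helper names installed_version and version_tuple are made up for illustration, and the comparison assumes plain dotted numeric versions such as 0.0.31.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def version_tuple(version_string):
    """Turn a plain dotted version such as '0.0.31' into (0, 0, 31).

    Assumption: only numeric, dot-separated parts (no 'rc1', 'dev', etc.).
    """
    return tuple(int(part) for part in version_string.split("."))


if __name__ == "__main__":
    found = installed_version("kmer_mapper")
    if found is None:
        print("kmer_mapper is not installed in this environment")
    elif version_tuple(found) < version_tuple("0.0.31"):
        print(f"kmer_mapper {found} is older than 0.0.31; try upgrading")
    else:
        print(f"kmer_mapper {found} is installed")
```

Comparing tuples rather than raw strings avoids the pitfall where "0.0.4" sorts after "0.0.31" as text.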

teepean commented 1 year ago

Thanks! It looks like it started processing, but an error message pops up every now and then:

Process Process-13:
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/useri/.local/lib/python3.10/site-packages/shared_memory_wrapper/shared_array_map_reduce.py", line 51, in __call__
    job_result = self.function(data, run_specific_data)
  File "/home/useri/.local/lib/python3.10/site-packages/kmer_mapper/command_line_interface.py", line 39, in map_cpu
    hashes = get_kmer_hashes_from_chunk_sequence(chunk_sequence, kmer_size)
  File "/home/useri/.local/lib/python3.10/site-packages/kmer_mapper/util.py", line 73, in get_kmer_hashes_from_chunk_sequence
    bnp.as_encoded_array(chunk_sequence, bnp.DNAEncoding), kmer_size).ravel().raw().astype(np.uint64)
  File "/home/useri/.local/lib/python3.10/site-packages/bionumpy/encoded_array.py", line 542, in as_encoded_array
    return target_encoding.encode(s)
  File "/home/useri/.local/lib/python3.10/site-packages/bionumpy/encoded_array.py", line 52, in encode
    r = self._ragged_array_as_encoded_array(data)
  File "/home/useri/.local/lib/python3.10/site-packages/bionumpy/encoded_array.py", line 73, in _ragged_array_as_encoded_array
    data = self.encode(s.ravel())
  File "/home/useri/.local/lib/python3.10/site-packages/bionumpy/encoded_array.py", line 59, in encode
    out = EncodedArray(self._encode(data), self)
  File "/home/useri/.local/lib/python3.10/site-packages/bionumpy/encodings/alphabet_encoding.py", line 33, in _encode
    raise EncodingError(f"Error when encoding {''.join(chr(c) for c in byte_array.ravel()[0:100])} "
bionumpy.encodings.exceptions.EncodingError: ("Error when encoding TTACCTCAAGGTTATCGACGTGCAGGGAAGATCGGAAGAGCACACGTCTGAACTCCAGTCACCCAACCTATCTCGTATGCCGTCTTCTGCTTGAAAATGG to AlphabetEncoding. Invalid character(s): ['N'][78]", 919813)

ivargr commented 1 year ago

Sorry for the late reply!

The error message should have been clearer, but it occurs because you have Ns in your sequences. Am I correct that some of your reads contain Ns?
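
One quick way to check is to count how many reads in the FASTQ file contain an N. The snippet below is a minimal sketch in plain Python, not part of kmer_mapper; the function name count_reads_with_n is made up for illustration, and it assumes an uncompressed FASTQ with exactly four lines per record (no wrapped sequences).

```python
import itertools


def count_reads_with_n(fastq_lines):
    """Return (total_reads, reads_containing_N) for an iterable of FASTQ lines.

    Assumption: every record is exactly four lines (header, sequence, '+',
    qualities), so the sequence is the 2nd line of each group of four.
    """
    total = 0
    with_n = 0
    # Start at index 1 (first sequence line) and step by 4 to visit each sequence.
    for sequence in itertools.islice(fastq_lines, 1, None, 4):
        total += 1
        if "N" in sequence.upper():
            with_n += 1
    return total, with_n


if __name__ == "__main__":
    import sys

    # Usage: python count_n_reads.py reads.fq
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            total, with_n = count_reads_with_n(handle)
        print(f"{with_n} of {total} reads contain at least one N")
```

A non-zero count would be consistent with the EncodingError above, which reports 'N' as the invalid character.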