metagentools / MetaCoAG

🚦🧬 Binning Metagenomic Contigs via Composition, Coverage and Assembly Graphs
https://metacoag.readthedocs.io/en/stable/
GNU General Public License v3.0

KeyError: 'N' in mer2bits #8

Closed krutkinam closed 2 years ago

krutkinam commented 2 years ago

Hello, @Vini2!

I'm getting the following error with a metaSPAdes assembly:

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/disk1/home/Programs/anaconda3/envs/metacoag/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/disk1/home/Programs/anaconda3/envs/metacoag/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/disk1/home/Programs/MetaCoAG/src/metacoag_utils/feature_utils.py", line 68, in count_kmers
    bit_mer = mer2bits(seq[i:(i + k)])
  File "/disk1/home/Programs/MetaCoAG/src/metacoag_utils/feature_utils.py", line 34, in mer2bits
    bit_mer = (bit_mer << 2) | nt_bits[c]
KeyError: 'N'
"""

Could you please tell me whether it would be possible to add support for unknown nucleotides (N) in MetaCoAG?

Vini2 commented 2 years ago

Hello @krutkinam,

I am very sorry for the late reply.

Currently, MetaCoAG only supports A, T, G and C characters during k-mer counting. I will fix the module to handle N characters as well.

Thank you for pointing this out.
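
For readers hitting the same error, here is a minimal sketch of one way 2-bit k-mer counting can tolerate N and other ambiguous characters. It is not the code from the MetaCoAG repository: the names `mer2bits` and `count_kmers` mirror the traceback, but the bodies, the `NT_BITS` table, and the choice to skip any window containing a non-ACGT character are assumptions made for illustration.

```python
# Hypothetical sketch: function names mirror the traceback, bodies are assumed.
NT_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}


def mer2bits(kmer):
    """Pack a k-mer into an integer using 2 bits per base.

    Returns None if any character is not A, C, G or T (for example N),
    instead of raising KeyError.
    """
    bit_mer = 0
    for c in kmer.upper():
        bits = NT_BITS.get(c)
        if bits is None:
            return None
        bit_mer = (bit_mer << 2) | bits
    return bit_mer


def count_kmers(seq, k=4):
    """Count k-mers in seq, skipping windows that contain ambiguous bases."""
    counts = {}
    for i in range(len(seq) - k + 1):
        bit_mer = mer2bits(seq[i:i + k])
        if bit_mer is None:
            continue  # window overlaps an N (or other unknown character)
        counts[bit_mer] = counts.get(bit_mer, 0) + 1
    return counts
```

Skipping a window drops every k-mer that overlaps an unknown base, so a single N removes up to k windows from the profile. Other strategies, such as splitting contigs at runs of N before counting, are equally plausible.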

Vini2 commented 2 years ago

Hello @krutkinam,

I have fixed the issue in commit 411f7e227daee499f69ba897f00c86f738f84c8c, and MetaCoAG should now be able to handle unknown characters. Please give it a try and let me know.

Thank you!

krutkinam commented 2 years ago

Hello, @Vini2! Everything works great, thanks!