I am starting to test this repo, and I am getting the following issue (ran this twice, same issue):
python src/fcgr.py -k 6 --dir-tarfiles data -w 4
Working on salmonella_enterica__01.tar: 100%|███████████████████████████████████████████████████████████████████████████████████████| 4000/4000 [00:04<00:00, 855.95it/s]
number of tarfiles: 33%|██████████████████████████████████████ | 1/3 [01:15<02:30, 75.06s/it]data/salmonella_enterica__01.tar.xz is done!|██████████████████████████████████████████████████████████████████████████████████████▎| 3966/4000 [00:04<00:00, 720.42it/s]
Working on escherichia_coli__01.tar: 3%|███▍ | 137/4000 [03:32<1:39:44, 1.55s/it]
Working on mycobacterium_tuberculosis__01.tar: 100%|█████████████████████████████████████████████████████████████████████████████████| 4000/4000 [38:10<00:00, 1.75it/s]
number of tarfiles: 67%|██████████████████████████████████████████████████████████████████████████▋ | 2/3 [38:49<22:37, 1357.15s/it]data/mycobacterium_tuberculosis__01.tar.xz is done!
Traceback (most recent call last):
File "src/fcgr.py", line 57, in <module>
for result in executor.map(fcgr.fcgr_from_tar, tarfiles):
File "/home/leandro/miniconda3/lib/python3.8/concurrent/futures/_base.py", line 611, in result_iterator
yield fs.pop().result()
File "/home/leandro/miniconda3/lib/python3.8/concurrent/futures/_base.py", line 432, in result
return self.__get_result()
File "/home/leandro/miniconda3/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
raise self._exception
File "/home/leandro/miniconda3/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/leandro/git/embedding-bacteria/src/fcgr/fcgr_from_tar.py", line 47, in fcgr_from_tar
m = self.__call__(seqbio.seq)
File "/home/leandro/git/embedding-bacteria/venv/lib/python3.8/site-packages/complexcgr/fcgr.py", line 33, in __call__
for kmer, freq in self.freq_kmer.items():
RuntimeError: dictionary changed size during iteration
number of tarfiles: 67%|██████████████████████████████████████████████████████████████████████████▋ | 2/3 [38:49<19:24, 1164.87s/it]
The data dir contains 3 .tar.xz files with 4k genomes each:
ls -lh data/
total 341M
-rw-rw-r-- 1 leandro leandro 174M Jan 6 13:56 escherichia_coli__01.tar.xz
drwxrwxr-x 5 leandro leandro 4.0K Jan 6 14:01 fcgr-6mer
-rw-rw-r-- 1 leandro leandro 89M Jan 6 13:53 mycobacterium_tuberculosis__01.tar.xz
-rw-rw-r-- 1 leandro leandro 78M Jan 6 13:55 salmonella_enterica__01.tar.xz
I am wondering if I could get some help with this. I can send you the input if needed.
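One guess from reading the traceback, not a confirmed diagnosis: the failing frames run inside concurrent/futures/thread.py, and the error is raised while iterating self.freq_kmer inside __call__. If a single FCGR object is shared by all worker threads and each call rebuilds or fills self.freq_kmer, one thread could change the dict while another is iterating over it. The sketch below illustrates that idea with a hypothetical KmerCounter class (a stand-in, not the complexcgr API) and shows one way to avoid shared state: build a fresh counter inside each task. The names count_in_worker and count_all are also made up for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class KmerCounter:
    """Hypothetical stand-in for an FCGR-like object (not the complexcgr API).

    It keeps per-call state on the instance, which is unsafe to share
    between threads.
    """

    def __init__(self, k):
        self.k = k
        self.freq_kmer = None  # filled on every call

    def __call__(self, seq):
        # Rebuild the instance-level dict for this sequence.
        self.freq_kmer = defaultdict(int)
        for i in range(len(seq) - self.k + 1):
            self.freq_kmer[seq[i:i + self.k]] += 1
        # Iterating self.freq_kmer here would fail if another thread
        # were filling the same dict at the same time.
        return {kmer: freq for kmer, freq in self.freq_kmer.items()}


def count_in_worker(args):
    """Create a fresh counter per task so no state is shared across threads."""
    k, seq = args
    counter = KmerCounter(k)
    return counter(seq)


def count_all(seqs, k, workers=4):
    """Count k-mers for every sequence using a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(count_in_worker, [(k, s) for s in seqs]))


if __name__ == "__main__":
    print(count_all(["ACGTAC", "AAAA"], k=2))
```

If this guess is right, running with -w 1 should also avoid the error, since only one thread would touch the object at a time. I have not verified that.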