gmarcais / Jellyfish

A fast multi-threaded k-mer counter

kitsune and jellyfish error "Bloom filter file is truncated" #199

Open rmormando opened 1 year ago

rmormando commented 1 year ago

Hi, I am trying to use the kitsune tool, which uses Jellyfish to calculate the optimal k-mer size for counting, but when I run the command I get this error:

Computing Cumulative Relative Entropy (CRE)
  0%|                                                     | 0/6 [00:01<?, ?it/s]
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/Users/rimo/kitsune_env/lib/python3.11/site-packages/kitsune/modules/kitsunejf.py", line 236, in __init__
    datadict[dat[0]] = int(dat[1])
                       ^^^^^^^^^^^
ValueError: invalid literal for int() with base 10: 'terminating'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/Users/rimo/kitsune_env/lib/python3.11/site-packages/kitsune/modules/kopt.py", line 331, in par_cre
    _, a0 = count_kmers_partial(genome, kmin=kmer, kmax=kmer)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rimo/kitsune_env/lib/python3.11/site-packages/kitsune/modules/kopt.py", line 167, in count_kmers
    datadict[kmer] = jf.Kmercount(
                     ^^^^^^^^^^^^^
  File "/Users/rimo/kitsune_env/lib/python3.11/site-packages/kitsune/modules/kitsunejf.py", line 239, in __init__
    raise JellyFishError("Bloom filter file is truncated.")
kitsune.modules.kitsunejf.JellyFishError: Bloom filter file is truncated.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/rimo/kitsune_env/bin/kitsune", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/rimo/kitsune_env/lib/python3.11/site-packages/kitsune/kitsune.py", line 71, in main
    module.run(sys.argv)
  File "/Users/rimo/kitsune_env/lib/python3.11/site-packages/kitsune/modules/kopt.py", line 669, in run
    optimal_kmer_size(
  File "/Users/rimo/kitsune_env/lib/python3.11/site-packages/kitsune/modules/kopt.py", line 587, in optimal_kmer_size
    cre_results.append(job.get())
                       ^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/pool.py", line 774, in get
    raise self._value
kitsune.modules.kitsunejf.JellyFishError: Bloom filter file is truncated.

This is the command I ran: kitsune kopt --filenames ./American/american_paths.txt --k-min 10 --k-max 30 --canonical --closely-related --threads 8 --nproc 2 --output ./American/kitsune_out.txt

I have also posted this on the kitsune issue page (https://github.com/natapol/kitsune/issues/13), but it looks like it might be a Jellyfish issue, so I figured I'd post it here as well to see if I could get a quick solution.

Please let me know if anything in this error message stands out to you!

gmarcais commented 1 year ago

I have never run kitsune. Is it possible to get the command and the data that was used to create the Bloom filter in the first place? Hopefully we can reproduce the issue outside of kitsune.
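
One detail in the first traceback may help narrow things down: the parse step fails at `int(dat[1])` with the token 'terminating', which suggests that a line of non-count text reached code expecting `kmer count` pairs. Below is a minimal sketch of a parser that collects such lines instead of raising, assuming whitespace-separated two-column input. The function `parse_counts` and the sample lines (including the stray message) are hypothetical illustrations, not kitsune's or Jellyfish's actual code or output.

```python
def parse_counts(lines):
    """Parse 'kmer count' lines; return (counts, rejected).

    rejected holds (line_number, text) for every non-empty line that
    is not exactly two fields with an integer second field.
    """
    counts = {}
    rejected = []
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip empty lines silently
        try:
            if len(fields) != 2:
                raise ValueError("expected two fields")
            counts[fields[0]] = int(fields[1])
        except ValueError:
            rejected.append((lineno, line.rstrip("\n")))
    return counts, rejected


# Hypothetical input: two valid records and one stray message line.
sample = ["ACGTACGTAC 3", "libc++abi: terminating", "TTGACCAGTA 1"]
counts, rejected = parse_counts(sample)
print(counts)
print(rejected)
```

Printing the rejected lines would show the full text that the underlying command emitted, which is likely more informative than the "Bloom filter file is truncated" message raised afterwards.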

raphenya commented 23 hours ago

@rmormando Check that there are no spaces between the FASTA entries in your input file. That might be the issue.
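
A quick way to check this is sketched below, assuming "spaces" means blank or whitespace-only lines in the FASTA file. The function name `find_blank_lines` and the sample records are made up for illustration; to check a real file, pass an open file handle instead of the list.

```python
def find_blank_lines(lines):
    """Return the 1-based numbers of blank or whitespace-only lines."""
    return [
        lineno
        for lineno, line in enumerate(lines, start=1)
        if line.strip() == ""
    ]


# Hypothetical FASTA content with a blank line between the two entries.
sample_fasta = [
    ">seq1\n",
    "ACGTACGT\n",
    "\n",
    ">seq2\n",
    "TTGACCAG\n",
]
print(find_blank_lines(sample_fasta))

# For a file on disk (the path here is a placeholder):
# with open("genome.fasta") as handle:
#     print(find_blank_lines(handle))
```

An empty list means no blank lines were found. Note that this sketch also reports blank lines at the end of the file, which may or may not matter for the tool.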