hmusta closed this issue 11 months ago
Hi Harun, and thank you for your interest in SSHash!
It should be fixed now, as of https://github.com/jermp/sshash/commit/d3ea2c49eb7aad4ed544a525af3e16baab849f53. The problem was caused by PTHash trying to access an empty file on disk.
I tried ./sshash build -i dump.fa -k 12 -m M --check --verbose
for all values M = 1..12 and it works correctly.
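For anyone who wants to repeat this check, the sweep over M can be scripted. The following is only a minimal sketch, not part of SSHash: it assumes the `sshash` binary sits in the current directory and that `dump.fa` is the input file from the command above. The helper names `build_command` and `sweep` are hypothetical; the flags are exactly the ones shown in the command above.

```python
import subprocess
from typing import Dict, List, Optional


def build_command(m: int, k: int = 12, input_path: str = "dump.fa",
                  binary: str = "./sshash") -> List[str]:
    """Return the argv for one `sshash build` run with minimizer length m.

    The binary location and input file name are assumptions taken from
    the command quoted in this thread.
    """
    return [binary, "build", "-i", input_path,
            "-k", str(k), "-m", str(m), "--check", "--verbose"]


def sweep(k: int = 12, run: bool = False) -> Dict[int, Optional[int]]:
    """Prepare (and optionally execute) the build for every m in 1..k.

    Returns a mapping m -> exit code, or m -> None when run is False.
    On POSIX systems a negative exit code means the process was killed
    by a signal (for example a segfault or an abort).
    """
    results: Dict[int, Optional[int]] = {}
    for m in range(1, k + 1):
        cmd = build_command(m, k)
        if run:
            results[m] = subprocess.run(cmd).returncode
        else:
            # Dry run: only show the command that would be executed.
            print(" ".join(cmd))
            results[m] = None
    return results
```

Calling `sweep(run=True)` from the directory that contains the binary returns one exit code per value of M, so any value whose code is nonzero can be reported back here.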
Let me know if that is ok for you too.
-Giulio
PS. As a side note, I should also say that SSHash is not meant to index such tiny examples. For me, the minimum interesting size is a single whole bacterial genome, like Salmonella or E. coli.
Thank you for fixing this! In the end, these corner cases help us ensure that we don't run into surprises later when working on larger data sets.
Yeah, true. You're welcome!
Hi Giulio,
Thank you for developing and providing this tool!
I was testing it out on some small data sets and noticed that for even values of m it fails to build an index.
I'm testing with the following sequence inputs:
and the following build command
The above command produces this output:
I don't see any errors when I run it using valgrind, but the backtrace shows that it's aborting after this line:
It does work, however, if I set m = 1, 3, 5, 7; for m >= 9 it segfaults or produces other errors. Strangely, I get different behaviour in different environments. It fails for all values of m on my Mac with g++12, but works with the values mentioned on a Linux system with g++9. Also, while trying to explore this, I replaced CATGTACTAGCTGATCGTAGCTAGCTAGC with CATGTAGCTGATCGTAGCTAGCTAGC and it works for even but not odd values of m on my Mac.
Please let me know if I can provide any more information to help debug this.