jermp / sshash

A compressed, associative, exact, and weighted dictionary for k-mers.
MIT License

No queries being run #9

Closed jnalanko closed 2 years ago

jnalanko commented 2 years ago

Hi again,

I'm trying to run queries on a small sshash index, but the program reports that zero queries were run. What could be the issue?

niklas@phoenix:~/code/SBWT_experiments$ ./sshash/build/query index.sshash test.fna
2022-06-01 19:02:46: loading index from file 'index.sshash'...
index size: 2.13636 [MB] (7.09913 [bits/kmer])
2022-06-01 19:02:46: performing queries from file 'test.fna'...
2022-06-01 19:02:46: DONE
==== query report:
num_kmers = 0
num_valid_kmers = 0 (-nan% of kmers)
num_positive_kmers = 0 (-nan% of valid kmers)
num_searches = 140332537301184/0 (inf%)
num_extensions = 2/0 (inf%)
elapsed = 0.005 millisec / 5e-06 sec / 8.33333e-08 min / inf ns/kmer

The index and the queries are attached here. I'm not attaching the original sequences because they are over 30GB. data.zip

jermp commented 2 years ago

Oh, just use the proper extension .fa not .fna. Easy fix.

jermp commented 2 years ago

Here https://github.com/jermp/sshash/blob/master/include/query/membership_query.hpp#L107 you can see all the different supported file formats.

From here https://en.wikipedia.org/wiki/FASTA_format, I see there are .fasta, .fna, .ffn, .faa, .frn, .fa. Should probably add the missing ones.
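For illustration, extension-based format detection that also accepts the missing nucleotide FASTA extensions could look roughly like the sketch below. This is not the code at the link above: `ends_with`, `seq_format` and `guess_format` are hypothetical names, and stripping a trailing `.gz` and ignoring case are assumptions made here.

```cpp
#include <algorithm>
#include <array>
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical helper (not part of sshash): true if `s` ends with `suffix`.
inline bool ends_with(std::string_view s, std::string_view suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

enum class seq_format { fasta, fastq, unknown };

// Hypothetical: guess the sequence format from the filename alone.
// Assumptions: matching is case-insensitive and a trailing ".gz" is ignored.
inline seq_format guess_format(std::string filename) {
    std::transform(filename.begin(), filename.end(), filename.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    std::string_view name = filename;
    if (ends_with(name, ".gz")) name.remove_suffix(3);

    // Nucleotide FASTA extensions from the Wikipedia list; ".faa" (amino acids)
    // is left out on purpose, since the index stores DNA k-mers.
    static constexpr std::array<std::string_view, 5> fasta_exts = {
        ".fa", ".fasta", ".fna", ".ffn", ".frn"};
    static constexpr std::array<std::string_view, 2> fastq_exts = {".fq", ".fastq"};

    for (auto ext : fasta_exts) {
        if (ends_with(name, ext)) return seq_format::fasta;
    }
    for (auto ext : fastq_exts) {
        if (ends_with(name, ext)) return seq_format::fastq;
    }
    return seq_format::unknown;  // caller can report an error instead of running zero queries
}
```

Returning an explicit `unknown` value would let the caller print an error for an unrecognized extension rather than finishing with an empty report.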

jermp commented 2 years ago

Anyway, running ./query without arguments tells you which file extensions should be used:

query_filename Must be a FASTA/FASTQ file (.fa/fasta or .fq/fastq extension) compressed with gzip or not.

jnalanko commented 2 years ago

Oh yeah, sorry, it's working now.

jermp commented 2 years ago

You're welcome. Closing this then.