orpheum translate assigns all reads to non-coding with a jaccard index of 0

Environment (orpheum version 1.0.5.dev22+ga76c3f3, sourmash version 4.2.1, khmer version 2.1.1)

channels:
   - conda-forge
   - bioconda
   - defaults
dependencies:
   - sourmash
   - pip
   - pip:
       - git+https://github.com/czbiohub/orpheum@master

Code:

orpheum index --molecule protein --peptide-ksize 10 --save-as tmp_pan_genome_reference_k10.nodegraph pan_genome_reference.faa

orpheum translate --peptide-ksize 10  --peptides-are-bloom-filter --noncoding-nucleotide-fasta tmp.nuc_noncoding --coding-nucleotide-fasta tmp.nuc --csv tmp.csv --json-summary tmp.json tmp_pan_genome_reference_k10.nodegraph ../outputs/rgnv_sgc_original_results/4000_GCF_900036035.1_RGNV35913_genomic.fna.gz.cdbg_ids.reads.fa.gz > tmp.pep

Also tried it removing --peptides-are-bloom-filter and feeding it the protein sequences directly.

I ran these commands on ~600 fastq files, and they all produced empty *pep files and full *nuc_noncoding files. All of the reads in nuc noncoding have a jaccard of 0

>SRR2145359.148924__HWI-ST431:257:C1DT9ACXX:6:1101:8013:97867/1__translation_frame:2__jaccard:0.0
AATATTCTCACAACTTACAGACCAGTTCTTGGTCATCAGGAACTATGCGAAAATAACGAATGGTTTGTTTGCCAGGATCATTGGCAAAATTAGTGCACTTA
>SRR2145359.174446__HWI-ST431:257:C1DT9ACXX:6:1102:18950:21974/2__translation_frame:1__jaccard:0.0
ACAATATTCTCACAACTTACAGACCAGTTCTTGGTCATCAGGAACTATGCGAAAATAACGAATGGTTTGTTTGCCAGGATCATTGGCAAAATTAGTGCAC
>SRR2145359.175662__HWI-ST431:257:C1DT9ACXX:6:1102:17565:22992/1__translation_frame:3__jaccard:0.0
TATTGCAGAATGGTAAGTGCACTAATTTTGCCAATGATCCTGGCAAACAAACCATTCGTTATTTTCGCATAGTTCCTGATGACCAAGAACTGGTCTGTAAG
>SRR2145359.175662__HWI-ST431:257:C1DT9ACXX:6:1102:17565:22992/1__translation_frame:-1__jaccard:0.0
TATTGCAGAATGGTAAGTGCACTAATTTTGCCAATGATCCTGGCAAACAAACCATTCGTTATTTTCGCATAGTTCCTGATGACCAAGAACTGGTCTGTAAG
>SRR2145359.237474__HWI-ST431:257:C1DT9ACXX:6:1102:10897:63195/2__translation_frame:-3__jaccard:0.0
AGTGCACTAATTTTGCCAATGATCCTGGCAAACAAACCATTCGTTATTTTCGCATAGTTCCTGATGACCAAGAACTGGTCTGTAAGTTGTGAGAATATTG
>SRR2145359.253930__HWI-ST431:257:C1DT9ACXX:6:1102:8178:73159/1__translation_frame:-3__jaccard:0.0
ACTAATTTTGCCAATGATCCTGGCAAACAAACCATTCGTTATTTTCGCATAGTTCCTGATGACCAAGAACTGGTCTGTAAGTTGTGAGAATATTGTCTCAA
>SRR2145359.256097__HWI-ST431:257:C1DT9ACXX:6:1102:14263:74274/1__translation_frame:3__jaccard:0.0
GAAAGAGGATTGAGACAATATTCTCACAACTTACAGACCAGTTCTTGGTCATCAGGAACTATGCGAAAATAACGAATGGTTTGTTTGCCAGGATCATTGGC
>SRR2145359.257134__HWI-ST431:257:C1DT9ACXX:6:1102:7530:75048/1__translation_frame:2__jaccard:0.0
TAATGAAGTTTACGTATTGCAGAATGGTAAGTGCACTAATTTTGCCAATGATCCTGGCAAACAAACCATTCGTTATTTTCGCATAGTTCCTGATGACCAAG

The protein db I'm working with has 37K protein sequences in it, half of which came directly from fastq files I ran orpheum against (e.g., I megahit assembled the fastq files, prokka predicted protein seqs, and add them to a final fasta file of all of my protein sequences). So both with a k of 7 and 10 I expect many matches. I'm not sure what I'm doing wrong here. Any help would be greatly appreciated!

I'm attaching my db of protein sequences as well as one of my read files. pan_genome_reference.faa.gz 4000_GCF_900036035.1_RGNV35913_genomic.fna.gz.cdbg_ids.reads.fa.gz

czbiohub-sf / orpheum

orpheum translate assigns all reads to non-coding with a jaccard index of 0 #103