czbiohub-sf / orpheum

Orpheum (previously called, and published as, sencha) is a Python package for directly translating RNA-seq reads into protein-coding sequence.
MIT License

Indexing improvements #74

Closed olgabot closed 4 years ago

olgabot commented 4 years ago

Various indexing improvements to "save the user from themselves" :) In other words, stop the creation of the peptide Bloom filter if the table size or k-mer size is too small for the given dataset.
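To make the idea concrete, here is a rough sketch of the kind of check described above. It is not the code from this PR: the function names, the `IndexParameterError` class, and the thresholds (`max_fpr`, `max_kmer_space_fraction`) are all hypothetical. It assumes a khmer-style filter made of several tables with one hash function each, so the expected false positive rate is roughly `(1 - exp(-n / table_size)) ** n_tables`, and it assumes a 20-letter peptide alphabet when judging whether the k-mer size is too small.

```python
import math


class IndexParameterError(ValueError):
    """Hypothetical error raised when index parameters are too small."""


def estimate_false_positive_rate(n_unique_kmers, table_size, n_tables):
    """Approximate false positive rate of a multi-table Bloom filter.

    Assumes each table has its own single hash function, so the
    per-table occupancy is about 1 - exp(-n / table_size).
    """
    occupancy = 1.0 - math.exp(-n_unique_kmers / table_size)
    return occupancy ** n_tables


def check_index_parameters(
    n_unique_kmers,
    ksize,
    table_size,
    n_tables,
    alphabet_size=20,             # assumption: plain 20-letter peptide alphabet
    max_fpr=0.01,                 # assumption: illustrative threshold only
    max_kmer_space_fraction=0.5,  # assumption: illustrative threshold only
):
    """Raise IndexParameterError if ksize or table_size looks too small."""
    # If the dataset fills most of the possible k-mer space, nearly every
    # query k-mer will be "found", so the index cannot discriminate.
    kmer_space = alphabet_size ** ksize
    if n_unique_kmers > max_kmer_space_fraction * kmer_space:
        raise IndexParameterError(
            f"k-mer size {ksize} is too small: {n_unique_kmers} unique k-mers "
            f"observed out of only {kmer_space} possible"
        )

    # If the tables are too small, false positives dominate lookups.
    fpr = estimate_false_positive_rate(n_unique_kmers, table_size, n_tables)
    if fpr > max_fpr:
        raise IndexParameterError(
            f"table size {table_size} is too small: estimated false positive "
            f"rate {fpr:.3g} exceeds {max_fpr}"
        )
    return fpr
```

Calling `check_index_parameters(1_000_000, ksize=7, table_size=100_000_000, n_tables=4)` passes and returns the estimated rate, while shrinking `table_size` to `1_000_000` or `ksize` to `3` raises an error before any index is written.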


@bluegenes this "should" give some semi-useful error messages when creating the index, which prevents "bad" indices, and thus bad translation results, from being created.

olgabot commented 4 years ago

This needs the sourmash Nodegraph to be possible: https://github.com/czbiohub/sencha/issues/77