czbiohub-sf / orpheum

Orpheum (previously called and published as sencha) is a Python package for directly translating RNA-seq reads into coding protein sequences.
MIT License

Add option to ignore short sequences when building an index #71

Open olgabot opened 4 years ago

olgabot commented 4 years ago

If the k-mer size is long, e.g. 33, as may be necessary for a very reduced alphabet like hydrophobic-polar (alphabet size = 2), then some sequences in the provided FASTA file may be the k-mer length or shorter, which throws an error when building the index. An option to ignore these short sequences should therefore be added. I can't think of a joke here to lighten the mood as I have a headache, but hopefully that was clear!
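
A rough sketch of what such an option could look like, in plain Python. This is not orpheum's actual code: `read_fasta`, `filter_short_sequences`, and the `skip_short` flag are hypothetical names made up for illustration, and the cutoff (drop anything with length at or below the k-mer size) just follows the wording above.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def filter_short_sequences(
    records: Iterable[Tuple[str, str]], ksize: int, skip_short: bool = True
) -> Iterator[Tuple[str, str]]:
    """Drop records whose length is the k-mer size or shorter.

    Hypothetical helper: with skip_short=False it raises instead,
    mimicking the current fail-on-short-sequence behavior.
    """
    for name, seq in records:
        if len(seq) <= ksize:
            if skip_short:
                continue
            raise ValueError(
                f"Sequence {name!r} has length {len(seq)}, "
                f"which is not longer than ksize={ksize}"
            )
        yield name, seq


# Toy input: one 50-residue sequence and one 8-residue sequence
fasta = [
    ">long_protein",
    "MKTAYIAKQR" * 5,
    ">short_peptide",
    "MKTAYIAK",
]

kept = list(filter_short_sequences(read_fasta(fasta), ksize=33))
print([name for name, _ in kept])
```

With `ksize=33`, only `long_protein` survives the filter. Passing `skip_short=False` raises a `ValueError` on `short_peptide` instead, so the strict behavior stays available.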