dib-lab / khmer

In-memory nucleotide sequence k-mer counting, filtering, graph traversal and more
http://khmer.readthedocs.io/

fix case-sensitivity of python-facing k-mer functions #370

Open camillescott opened 10 years ago

camillescott commented 10 years ago

Currently, several functions exposed in Python-land are case-sensitive with regard to hashing. For example,

  1. get
  2. forward_hash
  3. consume -- not sure

The result is that Python code which deals directly with k-mers can fail silently on mixed-case sequences.
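To illustrate the kind of silent failure described above, here is a minimal sketch that does not use khmer at all. The helper `count_kmers` is a hypothetical stand-in that, like the functions listed, applies no case normalization; it is only meant to show how mixed-case input splits what should be one k-mer into two.

```python
from collections import Counter


def count_kmers(seq, k):
    """Count k-mers exactly as written, with no case normalization.

    Hypothetical helper for illustration only; not part of khmer.
    """
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# The same 4-mer appears twice, once in each case.
mixed = count_kmers("ACGTacgt", 4)
normalized = count_kmers("ACGTacgt".upper(), 4)

# No exception is raised; the count is simply lower than expected.
print(mixed["ACGT"], normalized["ACGT"])
```

Nothing fails loudly here: the lookup on the mixed-case counts succeeds and returns an undercount, which is what makes the problem easy to miss.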

Assigning myself. Potential solution: wrap these functions in __init__.py and call upper() on the input. That seems like a lot of overhead, though I would presume any code working with large numbers of individual k-mers would be moved to C++-land anyway.

ctb commented 10 years ago

On Wed, Apr 02, 2014 at 10:16:41AM -0700, Camille Scott wrote:

> Currently, several functions exposed in python-land are case sensitive in regards to hashing. For example,
>
>   1. get
>   2. forward_hash
>   3. consume -- not sure
>
> The result is that python code which deals directly with k-mers can fail silently on mixed-case sequence.
>
> Assigning myself. Potential solution: wrap these functions in __init__.py and call upper(). Seems like a lot of overhead, though I would presume any code working with large numbers of individual k-mers would be moved to c++ land anyway.

I think the Python functions in _khmermodule should uppercase their input, if that isn't already done in lib/ land.

mr-c commented 10 years ago

@camillescott Any progress?

mr-c commented 9 years ago

@camillescott @ctb Does this need to go into the known-issues list? My understanding is that it doesn't impact script users, only Python/C++ API users.

ctb commented 9 years ago

no need

ctb commented 7 years ago

The command line issue is dealt with in #1435. This issue can remain around for when we start making guarantees about the Python API.