Martinsos / edlib

Lightweight, super fast C/C++ (& Python) library for sequence alignment using edit (Levenshtein) distance.
http://martinsos.github.io/edlib
MIT License
493 stars 162 forks source link

Allow bytestring arguments #99

Closed mahmoud closed 4 years ago

mahmoud commented 6 years ago

Hi there!

I noticed that align() expects text arguments and immediately encodes them into bytestrings. In many cases, developers already have bytestrings available, and decoding them just to encode them again is pretty inefficient

I would propose making the lines linked conditional, or moving them to a separate wrapper around an also-public bytestring function.

Other issues/questions:

Excited to have found your library, thanks for all the great work! If I can help with a PR, let me know!

Martinsos commented 6 years ago

Hi @mahmoud, thank you for reaching out :)! Those are very good suggestions, I think I actually wasn't sure at that moment about that part so I decided to leave it default, but it is certainly better to get this fixed with 'utf-8' and reduce possibility of bugs and something going wrong.

Thanks for the offer, PR would be awesome! Just after you @kurtbrose also opened issue #100 with same stuff you mentioned plus some other things, so I thought maybe let him do the PR since it is a superset of yours and then you also check out the PR and help?

Martinsos commented 4 years ago

This got implemented in the meantime, thanks for help!