Closed dukebody closed 8 years ago
Thanks. I've made a branch based on your work here that has some cosmetic changes:
https://github.com/avian2/unidecode/tree/mostly-ascii
I've renamed unidecode_fast to unidecode_expect_ascii to make it clearer what it does. I've also added unidecode_expect_nonascii.
After some thought I also made unidecode an alias for unidecode_expect_ascii. As far as I know, most uses of Unidecode fit that use case, and the slowdown for non-ASCII strings is not that high. I still think that for most people the performance difference is irrelevant, which is also why I moved any mention of this to a separate README section.
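To illustrate the trade-off described above, here is a minimal sketch of how an "expect ASCII" fast path can sit in front of a slower per-character path. It does not use the Unidecode package: the names toy_expect_ascii, toy_expect_nonascii and the tiny _TOY_TABLE are hypothetical stand-ins, and the branch's actual implementation may differ.

```python
# Hypothetical stand-in for real transliteration tables (assumption, not Unidecode data).
_TOY_TABLE = {"ž": "z", "é": "e", "ü": "u"}


def _transliterate_slow(text):
    """Per-character path: keep ASCII as-is, look everything else up."""
    return "".join(
        ch if ord(ch) < 128 else _TOY_TABLE.get(ch, "?")
        for ch in text
    )


def toy_expect_nonascii(text):
    """Always take the per-character path; no up-front ASCII check."""
    return _transliterate_slow(text)


def toy_expect_ascii(text):
    """Try the cheap check first; pay for the slow path only when needed."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        # Non-ASCII input costs the failed encode plus the slow path.
        return _transliterate_slow(text)
    # Pure ASCII input is returned unchanged.
    return text


# Alias the default entry point to the ASCII-optimised variant.
toy_transliterate = toy_expect_ascii
```

In this sketch, pure-ASCII input never reaches the per-character loop, while non-ASCII input pays for one failed encode attempt before falling back, which is the small slowdown mentioned above.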
Can you have a look? I'll merge that to master instead of this pull request.
Thanks Tomaž. I'll try to look into it this week.
Hi Tomaž. I've looked at the code and it seems ok to me.
I really like that you modified the tests to cover all variants! However, note that since unidecode = unidecode_expect_ascii in the code, you are testing almost the same thing twice. But the tests are so fast that I guess the duplication doesn't matter at all.
So go ahead with the merge. :) Thank you a lot for dedicating some of your time to deal with this feature! I believe it doesn't directly affect your use cases, so I really appreciate your efforts.
I'm closing this pull request in favour of your branch.
See https://github.com/avian2/unidecode/issues/2
This one uses codecs and unidecode as a fallback function for non-ASCII chars, which is faster than the previous PR.
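As a rough illustration of that idea (not the code from this pull request), the sketch below registers a custom error handler with the standard codecs module, so the ASCII codec handles ordinary characters and only calls back for the ones it cannot encode. The handler name "toy_unidecode", the helper to_ascii and the small _TOY_TABLE are assumptions made for the example; a real version would call unidecode in the handler instead.

```python
import codecs

# Hypothetical stand-in for the real transliteration function (assumption).
_TOY_TABLE = {"ž": "z", "é": "e", "ü": "u", "ß": "ss"}


def _transliterate(chars):
    """Map each unencodable character to an ASCII replacement."""
    return "".join(_TOY_TABLE.get(ch, "?") for ch in chars)


def _fallback_handler(error):
    """Called by the codec machinery only for characters ASCII cannot encode."""
    if not isinstance(error, UnicodeEncodeError):
        raise error
    bad = error.object[error.start:error.end]
    # Return the replacement text and the position at which to resume encoding.
    return _transliterate(bad), error.end


codecs.register_error("toy_unidecode", _fallback_handler)


def to_ascii(text):
    """Encode with the ASCII codec, delegating failures to the handler."""
    return text.encode("ascii", errors="toy_unidecode").decode("ascii")
```

With this setup, to_ascii("Tomaž") goes through the codec for the ASCII characters and invokes the handler only for "ž". The replacement returned by the handler must itself be ASCII, otherwise the encode call raises.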