starwing / luautf8

a utf-8 support module for Lua and LuaJIT.
MIT License

Efficient use of find/sub #25

Open johnd0e opened 5 years ago

johnd0e commented 5 years ago

Consider the following example:

  repeat
    pos = utf8.find(str,"\\",pos+1)
  until not pos or utf8.sub(str,1,pos) ~= utf8.sub(str2,1,pos)

I have some doubts about the efficiency of this code.

  1. find in a loop: the pos value has to be translated to a byte offset on every call, and this gets worse with each iteration as we move further from the beginning of the string. So the question is: is there some internal optimization for loop usage?
  2. sub after find: sub has to repeat exactly the same translation of pos that was already done in find. So the question is: how can the above example be rewritten in a more efficient way?
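
One possible direction, offered only as a sketch and not as the library's recommended approach: the separator here is a single ASCII byte, and in valid UTF-8 an ASCII byte never occurs inside a multi-byte sequence. Under that assumption, the search and the prefix comparison can be done on byte offsets with the standard string library, which avoids the character-to-byte translation altogether. The function name first_diverging_separator is hypothetical, and both inputs are assumed to be valid UTF-8.

```lua
-- Hypothetical helper (the name is illustrative, not part of luautf8).
-- Assumes a and b are valid UTF-8 and the separator is the ASCII backslash.
-- Returns the BYTE position of the first backslash in a at which the
-- prefixes of a and b differ, or nil if there is no such backslash.
local function first_diverging_separator(a, b)
  local prev = 0
  while true do
    -- Plain search (4th argument true): no pattern matching involved,
    -- and the result is a byte offset, so nothing has to be translated.
    local bpos = string.find(a, "\\", prev + 1, true)
    if not bpos then
      return nil
    end
    -- Everything up to prev is already known to be equal, so only the
    -- segment after the previous separator needs to be compared.
    if string.sub(a, prev + 1, bpos) ~= string.sub(b, prev + 1, bpos) then
      return bpos
    end
    prev = bpos
  end
end

print(first_diverging_separator("dir\\sub\\file", "dir\\other\\file"))
```

If a character index is needed afterwards, the byte position can be converted once at the end instead of on every iteration. In Lua 5.3, utf8.len(a, 1, bpos) counts the characters that start within that byte range; luautf8 appears to provide a compatible utf8.len, but that is worth checking against its README.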