Closed driskell closed 9 years ago
Sorry, I missed a commit off. It's now complete and fully tested.
This looks great, thanks.
As you mentioned in #20, I'm thinking about how this could be used to refactor re2_matchdata_aref (and therefore re2_matchdata_nth_match).
I've pulled your match fix into 0.6.1 and made some other tweaks to get YARD documentation working again: might be worth rebasing with master.
OK, I've implemented it to work with multibyte characters. It has to create a temporary string object to do so. A future revision could look at trying out rb_str_sublen, but my first attempt was unsuccessful.
Thanks for this; I was wondering if there was a way to count the characters purely in C/C++ rather than Ruby, but solutions such as http://www.daemonology.net/blog/2008-06-05-faster-utf8-strlen.html assume that the underlying strings are UTF-8 encoded, which we can't guarantee, so it makes sense to let Ruby do the heavy lifting.
Yes - I planned from the start to let Ruby take care of it, since it is guaranteed to work if it works in Ruby itself. I had hoped, however, that I could do it zero-copy - but it seems rb_str_sublen didn't like something. But functionality first, performance after I think.
If there's anything else you think needs doing before merge, please let me know. When I get some more time later I may re-visit the rb_str_sublen possibility.
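For anyone following along, here is a rough Ruby-level sketch of the temporary-string approach discussed above: slice the text up to the match's byte offset, then let Ruby count the characters in that slice using the string's own encoding. The helper name byte_offset_to_char_offset is made up for illustration and is not part of the gem; the actual change lives in the C++ extension and may differ in detail.

```ruby
# Hypothetical helper (not part of the re2 gem): converts a byte offset into
# a character offset by letting Ruby count characters in a temporary string.
def byte_offset_to_char_offset(text, byte_offset)
  # byteslice copies the leading bytes into a new string that keeps the
  # encoding of `text`, so `length` counts characters, not bytes.
  text.byteslice(0, byte_offset).length
end

utf8 = "caf\u00e9 Ruby"              # "é" occupies two bytes in UTF-8
latin1 = utf8.encode("ISO-8859-1")   # "é" occupies one byte in ISO-8859-1

# Byte offsets at which "Ruby" starts, found by searching the raw bytes.
utf8_byte_offset = utf8.b.index("Ruby".b)
latin1_byte_offset = latin1.b.index("Ruby".b)

puts byte_offset_to_char_offset(utf8, utf8_byte_offset)
puts byte_offset_to_char_offset(latin1, latin1_byte_offset)
```

The byte offsets differ between the two encodings but the character offset comes out the same, which is why leaving the counting to Ruby avoids assuming UTF-8. The cost is the extra string allocation per lookup, which is the copy a zero-copy approach would try to avoid.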
Sorry about the delay getting this merged: I'm hoping to take a look at resolving the merge conflicts and rebasing with master this weekend. Did you have any luck with rb_str_sublen?
Closing in favour of #22 which is now rebased against master.
Hi @mudge
Here's the begin() and end() implementation for the MatchData. Let me know if it's OK!
Jason