ccxvii / mujs

An embeddable Javascript interpreter in C.
http://mujs.com/
ISC License

String.prototype.replace() with global regex doesn't resolve $` correctly #172

Open dankox opened 1 year ago

dankox commented 1 year ago

Hi, I'm using MuJS version 1.3.2, and I think String.prototype.replace() doesn't work correctly with a global regular expression and the replacement pattern "$`". MuJS appears to treat the position where the previous match ended as the start of the text preceding the matched substring, rather than the start of the original string. The same applies to the offset passed to a replacement function.
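To make the described behaviour concrete, here is a minimal sketch that models a global replace loop for the " $` " replacement only. It is not MuJS code: `replaceAllPrefix` and its `rebase` flag are hypothetical names, and the model searches for a plain substring instead of a regex. With `rebase` set, the prefix and offset are measured from where the previous match ended, which is the behaviour reported above. Without it, they are measured from the start of the original string.

```javascript
// Hypothetical model, not taken from MuJS.
// rebase = true  : prefix/offset relative to the end of the previous match
// rebase = false : prefix/offset relative to the start of the original string
function replaceAllPrefix(str, needle, rebase) {
  var out = "";
  var offsets = [];
  var scanStart = 0; // index where the current search begins

  for (;;) {
    var i = str.indexOf(needle, scanStart);
    if (i < 0) break;

    var base = rebase ? scanStart : 0;
    offsets.push(i - base);                 // offset a replacement function would see
    out += str.slice(scanStart, i);         // unmatched text before this match
    out += " " + str.slice(base, i) + " ";  // expansion of " $` "
    scanStart = i + needle.length;
  }

  out += str.slice(scanStart);              // trailing unmatched text
  return { result: out, offsets: offsets };
}

var rebased = replaceAllPrefix("abcdce", "c", true);
var absolute = replaceAllPrefix("abcdce", "c", false);
```

For "abcdce", `rebased` reproduces the MuJS output shown in the transcript below and `absolute` reproduces the NodeJS output.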

MuJS version:

> "abcdce".replace(/c/g, " $` ")
"ab ab d d e"
> "abcdce".replace(/c/g, function(match, offset, str) { console.log("offset " + offset); })
offset 2
offset 1
"abundefineddundefinede"

NodeJS version:

> "abcdce".replace(/c/g, " $` ")
'ab ab d abcd e'
> "abcdce".replace(/c/g, function(match, offset, str) { console.log("offset " + offset); })
offset 2
offset 4
'abundefineddundefinede'

I'm not sure whether NodeJS is a good reference for verifying this, because I don't know whether ES5 defined different behaviour for this function (NodeJS supports ES6).
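One way to check this without relying on a second engine is to test an invariant. As far as I can tell, ES5.1 (section 15.5.4.11) describes the offset argument as the offset within the whole string where the match occurred, so the text at that offset should equal the match. The sketch below uses a hypothetical helper, `checkReplaceOffsets`, and assumes a regex without capture groups, so the callback receives (match, offset, string).

```javascript
// Hypothetical helper: for each match, record whether the reported offset
// actually points at the matched text inside the original string.
// Assumes the regex has no capture groups.
function checkReplaceOffsets(str, re) {
  var seen = [];
  str.replace(re, function (match, offset, whole) {
    seen.push({
      match: match,
      offset: offset,
      ok: whole.slice(offset, offset + match.length) === match
    });
    return match; // leave the string unchanged
  });
  return seen;
}

var report = checkReplaceOffsets("abcdce", /c/g);
```

With the offsets 2 and 1 reported for MuJS above, the second entry would have `ok` set to false, because index 1 of "abcdce" is "b". A passing check does not prove the offsets are right, since a wrong offset can happen to land on identical text, but a failing one does show a mismatch.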

If it is a valid reference and the bug is real, I think this patch could fix it:

diff --git a/jsstring.c b/jsstring.c
index 6da9191..d600043 100644
--- a/jsstring.c
+++ b/jsstring.c
@@ -407,12 +407,13 @@ static void Sp_search(js_State *J)
 static void Sp_replace_regexp(js_State *J)
 {
    js_Regexp *re;
-   const char *source, *s, *r;
+   const char *source, *origsource, *s, *r;
    js_Buffer *sb = NULL;
    int n, x;
    Resub m;

    source = checkstring(J, 0);
+   origsource = source;
    re = js_toregexp(J, 1);

    if (js_doregexec(J, re->prog, source, &m, 0)) {
@@ -431,7 +432,7 @@ loop:
        js_pushundefined(J);
        for (x = 0; m.sub[x].sp; ++x) /* arg 0..x: substring and subexps that matched */
            js_pushlstring(J, m.sub[x].sp, m.sub[x].ep - m.sub[x].sp);
-       js_pushnumber(J, s - source); /* arg x+2: offset within search string */
+       js_pushnumber(J, s - origsource); /* arg x+2: offset within search string */
        js_copy(J, 0); /* arg x+3: search string */
        js_call(J, 2 + x);
        r = js_tostring(J, -1);
@@ -447,7 +448,7 @@ loop:
                case 0: --r; /* end of string; back up */
                /* fallthrough */
                case '$': js_putc(J, &sb, '$'); break;
-               case '`': js_putm(J, &sb, source, s); break;
+               case '`': js_putm(J, &sb, origsource, s); break;
                case '\'': js_puts(J, &sb, s + n); break;
                case '&':
                    js_putm(J, &sb, s, s + n);

There may be more to it, but I just wanted to provide a starting point.