Distributive-Network / PythonMonkey

A Mozilla SpiderMonkey JavaScript engine embedded into the Python VM, using the Python engine to provide the JS host environment.
https://pythonmonkey.io

String optimization: use a UTF-16 SourceText inside `pm.eval` #446

Open Xmader opened 1 month ago

Xmader commented 1 month ago
index b0cfb09..9c25651 100644
--- a/src/modules/pythonmonkey/pythonmonkey.cc
+++ b/src/modules/pythonmonkey/pythonmonkey.cc
@@ -460,14 +460,26 @@ static PyObject *eval(PyObject *self, PyObject *args) {
   JS::RootedScript script(GLOBAL_CX);
   JS::Rooted<JS::Value> rval(GLOBAL_CX);
   if (code) {
-    JS::SourceText<mozilla::Utf8Unit> source;
-    Py_ssize_t codeLength;
-    const char *codeChars = PyUnicode_AsUTF8AndSize(code, &codeLength);
-    if (!source.init(GLOBAL_CX, codeChars, codeLength, JS::SourceOwnership::Borrowed)) {
-      setSpiderMonkeyException(GLOBAL_CX);
-      return NULL;
+    if (PyUnicode_KIND(code) == PyUnicode_2BYTE_KIND) { // code is in UCS2 encoding, a subset of UTF16
+      JS::SourceText<char16_t> source;
+      Py_ssize_t codeLength = PyUnicode_GetLength(code);
+      Py_UCS2 *codeChars = PyUnicode_2BYTE_DATA(code);
+      if (!source.init(GLOBAL_CX, (char16_t *)codeChars, codeLength, JS::SourceOwnership::Borrowed)) {
+        setSpiderMonkeyException(GLOBAL_CX);
+        return NULL;
+      }
+      script = JS::Compile(GLOBAL_CX, options, source);
+    }
+    else {
+      JS::SourceText<mozilla::Utf8Unit> source;
+      Py_ssize_t codeLength;
+      const char *codeChars = PyUnicode_AsUTF8AndSize(code, &codeLength);
+      if (!source.init(GLOBAL_CX, codeChars, codeLength, JS::SourceOwnership::Borrowed)) {
+        setSpiderMonkeyException(GLOBAL_CX);
+        return NULL;
+      }
+      script = JS::Compile(GLOBAL_CX, options, source);
     }
-    script = JS::Compile(GLOBAL_CX, options, source);
   } else {
     assert(file);
     script = JS::CompileUtf8File(GLOBAL_CX, options, file);

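The change above would be a worthwhile optimization when the argument is already stored as UCS-2 (`PyUnicode_2BYTE_KIND`): its buffer can be handed directly to a UTF-16 `JS::SourceText<char16_t>`, skipping the UTF-8 conversion done by `PyUnicode_AsUTF8AndSize`.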
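To get a feel for which `pm.eval` inputs would reach the new branch, here is a minimal Python sketch. It assumes CPython's PEP 393 rule that a canonical string is stored at the narrowest width that fits its highest code point. The helper names `pep393_kind` and `takes_utf16_path` are hypothetical and are not part of PythonMonkey or CPython.

```python
def pep393_kind(text: str) -> int:
    """Estimate the bytes per character CPython uses to store `text`.

    Assumption: the string is in canonical compact form, so the width is
    decided only by the highest code point it contains (PEP 393).
    """
    highest = max(map(ord, text), default=0)
    if highest < 0x100:
        return 1  # corresponds to PyUnicode_1BYTE_KIND (ASCII / Latin-1)
    if highest < 0x10000:
        return 2  # corresponds to PyUnicode_2BYTE_KIND (UCS-2)
    return 4      # corresponds to PyUnicode_4BYTE_KIND (UCS-4)


def takes_utf16_path(code: str) -> bool:
    """True if `code` would satisfy the PyUnicode_2BYTE_KIND check in the diff."""
    return pep393_kind(code) == 2


if __name__ == "__main__":
    samples = ["1 + 1", "'h\u00e9llo'", "'\u03bb'", "'\U0001F600'"]
    for js_source in samples:
        print(ascii(js_source), pep393_kind(js_source), takes_utf16_path(js_source))
```

Under that assumption, only source text whose highest code point lies between U+0100 and U+FFFF takes the UTF-16 branch. Pure ASCII or Latin-1 source, and source containing characters outside the Basic Multilingual Plane, would still go through the existing UTF-8 branch.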

_Originally posted by @caleb-distributive in https://github.com/Distributive-Network/PythonMonkey/pull/443#discussion_r1774122534_