maingene / pyv8

Automatically exported from code.google.com/p/pyv8

Support Unicode JavaScript Source for JSExtension #152

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

>>> import PyV8
>>> PyV8.JSExtension('test', u';')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
Boost.Python.ArgumentError: Python argument types in
    JSExtension.__init__(JSExtension, str, unicode)
did not match C++ signature:
    __init__(_object*, std::string name, std::string source, boost::python::api::object callback=None, boost::python::list dependencies=[], bool register=True)

What is the expected output? What do you see instead?

JSExtension should accept unicode source, as JSContext.eval already does:

>>> with PyV8.JSContext() as ctx: ctx.eval(u";")
... 
>>>

What version of the product are you using? On what operating system?

python-pyv8 (1.0-~svn470+13384)
Linux zunca 3.5.0-22-generic #34-Ubuntu SMP Tue Jan 8 21:47:00 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

Perhaps this is a duplicate of #75

Original issue reported on code.google.com by e.generalov on 18 Jan 2013 at 3:59

GoogleCodeExporter commented 9 years ago

Original comment by flier...@gmail.com on 19 Jan 2013 at 1:25

GoogleCodeExporter commented 9 years ago
I have added support for a unicode name and source in JSExtension after SVN 
trunk r472. However, v8 itself only supports ASCII extension source, so even 
though pyv8 now accepts a unicode name and source, v8 will still fail if the 
source contains non-ASCII characters :( 

Please submit an issue to the v8 project if you really need Unicode support 
in extensions. 

Thanks

Original comment by flier...@gmail.com on 19 Jan 2013 at 2:51

GoogleCodeExporter commented 9 years ago
As a workaround, we could escape non-ASCII characters with JavaScript escape 
sequences before passing the source to JSExtension.

import re

# Any single character outside the ASCII range
ESCAPABLE = re.compile(r'([^\x00-\x7f])')
# Any high byte in a Python 2 byte string, i.e. possible UTF-8 data
HAS_UTF8 = re.compile(r'[\x80-\xff]')

def _js_escape_unicode_re_callback(match):
    s = match.group(0)
    n = ord(s)
    if n < 0x10000:
        return r'\u%04x' % (n,)
    else:
        # surrogate pair
        n -= 0x10000
        s1 = 0xd800 | ((n >> 10) & 0x3ff)
        s2 = 0xdc00 | (n & 0x3ff)
        return r'\u%04x\u%04x' % (s1, s2)

def js_escape_unicode(s):
    """Return an ASCII-only representation of a JavaScript string"""
    if isinstance(s, str):
        # Byte string: pure ASCII passes through, anything else is UTF-8
        if HAS_UTF8.search(s) is None:
            return s
        s = s.decode('utf-8')
    return str(ESCAPABLE.sub(_js_escape_unicode_re_callback, s))

# Usage: escape the source before handing it to JSExtension
ext = JSExtension(name, js_escape_unicode(jsource))
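
The snippet above is written for Python 2, where `str` is a byte string. For comparison, here is a rough Python 3 sketch of the same escaping idea. It does not touch PyV8 at all; the names `NON_ASCII` and `_escape_match` are made up for this illustration, and only `js_escape_unicode` is taken from the code above.

```python
import re

# Matches any single code point outside the ASCII range.
NON_ASCII = re.compile(r'[^\x00-\x7f]')


def _escape_match(match):
    """Turn one non-ASCII code point into JavaScript \\uXXXX escape(s)."""
    n = ord(match.group(0))
    if n < 0x10000:
        return '\\u%04x' % n
    # Code points beyond the BMP are written as a UTF-16 surrogate pair.
    n -= 0x10000
    high = 0xd800 | (n >> 10)
    low = 0xdc00 | (n & 0x3ff)
    return '\\u%04x\\u%04x' % (high, low)


def js_escape_unicode(source):
    """Return an ASCII-only equivalent of JavaScript source text."""
    if isinstance(source, bytes):
        # Assumption: byte input is UTF-8 encoded, as in the original.
        source = source.decode('utf-8')
    return NON_ASCII.sub(_escape_match, source)


print(js_escape_unicode('var s = "\u00e9";'))  # prints: var s = "\u00e9";
```

Note that `\uXXXX` escapes are only valid JavaScript inside string literals, regular expression literals, template literals and identifiers, so this approach assumes the non-ASCII characters in the source occur only in those places (or in comments, where the result is harmless).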

Original comment by e.generalov on 21 Jan 2013 at 6:36

GoogleCodeExporter commented 9 years ago
Thanks for your patch. I have moved the unicode support from the C++ side to 
the Python side; please verify it with the SVN trunk code after r477.

Original comment by flier...@gmail.com on 2 Feb 2013 at 2:03