rzimmerman / kal

A powerful, easy-to-use, and easy-to-read programming language for the future.
http://rzimmerman.github.io/kal
MIT License

CJK parse bug #108

Closed bcho closed 11 years ago

bcho commented 11 years ago

Here is a sample project structure:

.
└── src
    └── js
        └── foo.kal

In foo.kal:

class Subject

    method parse(raw)

        _ = function(pattern)
            r = pattern.exec(raw)
            return r[1].trim() if r is not null else ''

        id = /subject\/(\d+)/.exec(location.href)
        id = id[1] if id.length > 1 else null

        return
            id: id
            author: _(/作者:(.*)/)
            publisher: _(/出版社:(.*)/)
            isbn: _(/ISBN:(.*)/)
            publish_time: _(/出版年:(.*)/)

When I run kal -o js/src src/js/foo.kal from the top-level directory (i.e., in k/), the compiler fails with this error message:

/usr/lib/node_modules/kal/compiled/kal.js:1
oad(parser.Grammar),root_node.js(options)}catch(e){throw e.message||e}}var sug
                                                                    ^
Expected ',' or ')' on line 14 in file src/js/foo.kal

If I compile from inside src/js, it doesn't generate any errors.

And if I change the Chinese characters to English, it doesn't generate any errors either.

rzimmerman commented 11 years ago

@bcho this actually turned out to be an interesting bug. Thanks for reporting it. What was happening was:

Running kal with -o calls kal.compile, whereas running kal without -o calls kal.eval. Both code paths read the file into a Buffer object, but only the kal.eval path converted the buffer to a string before passing it on. It turns out the lexer chokes on Buffer objects, probably because the Unicode characters are still in byte form.
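
A minimal sketch of the Buffer versus string difference in Node.js. This is an illustration only, not kal's actual source; the variable names are made up for the example, and the sample line is taken from foo.kal above.

```javascript
// Hypothetical illustration; this is not code from kal itself.
const source = "author: _(/作者:(.*)/)";

// Reading a file without an encoding (fs.readFileSync(path)) yields raw bytes like this.
const asBuffer = Buffer.from(source, "utf8");

// Decoding first (buffer.toString("utf8"), or passing "utf8" to fs.readFileSync)
// yields a normal JavaScript string.
const asString = asBuffer.toString("utf8");

// Each CJK character here takes three bytes in UTF-8, so byte offsets and
// character offsets no longer line up once non-ASCII text appears.
const lengths = { chars: asString.length, bytes: asBuffer.length }; // 20 vs 24

// Indexing a string gives a one-character string; indexing a Buffer gives a number.
const firstCjkIndex = asString.indexOf("作");
const fromString = asString[firstCjkIndex]; // "作"
const fromBuffer = asBuffer[firstCjkIndex]; // 228, the first byte of "作"

console.log(lengths, fromString, fromBuffer);
```

Any code that walks the input character by character behaves the same on both forms as long as the text is pure ASCII, which would be consistent with the error only showing up on the lines containing Chinese characters.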

rzimmerman commented 11 years ago

I'll try to put out a release in the next day or two with this fix included.

rzimmerman commented 11 years ago

I pushed this to npm, so give 0.5.1 a try and let me know if you have issues.

bcho commented 11 years ago

It works now, thanks for your quick fix :)