differentmatt / filbert

JavaScript parser of Python
Other
133 stars 27 forks source link

Worked on builtin functions a little. #1

Closed Xavion3 closed 10 years ago

differentmatt commented 10 years ago

I'll take a look at these changes later today. Thanks!

differentmatt commented 10 years ago

Nice work!

Can you add some ascii() test coverage to go along with the implementation?

Other comments/questions are inline.

Xavion3 commented 10 years ago

Okay there's the fix. Ascii relies on a working repr though and that's currently broken, it will need the type system revamped to be more in line with pythons to work correctly. I'll keep working through the builtin functions but I'll probably need to actually ask you to identify where some parts of what I need are. As an aside how do you raise errors? I know you do it with things like the reference errors but is it just the same for the builtins?

differentmatt commented 10 years ago

ascii()

charCodeAt will always return a value less than 65536. Python supports UTF-32 strings, while JavaScript only supports UTF-16. To achieve UTF-32, you can pair two UTF-16 characters. https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/charCodeAt

Errors

We currently raise parse errors, but don't do anything for runtime errors. try/except is not supported yet, and would be a good place to address this. I haven't looked into this yet, and would welcome your thoughts.

Xavion3 commented 10 years ago

Forgot about the problem with the test. It's fixed now though. I've fixed the utf-32 issue for now but we'll need to deal with it counting them as two chars for length and indices later.

As for error handling the steps would be ask nick what he thinks, check how the reference errors and the like are currently being detected, ask nick again, entirely reimplement pythons error detection and stack tracing in JavaScript. That last one is quite huge but also invaluable for debugging python as the js error detection will be shoddy at best. The main thing is figuring out how it needs to be returned, the ast already gives us the line numbers so we could do it with a bit of effort. There is bigger priorities though.

differentmatt commented 10 years ago

I don't see your latest fixes for the test and utf-32 issue. Am I missing a new commit?

Agreed, there are higher priority items than error handling.

Xavion3 commented 10 years ago

I just had to get home first to sync. They've been added back into their appropriate times so just scroll up a bit. It also included my spiffing up of repr so that it should work correctly. Unfortunately it also needs a lot more support in the types before it will work correctly.

I've started on implementing python numeric types, the first part is a bit slow as it's getting complex numbers working.

nwinter commented 10 years ago

For runtime error handling, if you're just looking at it from CodeCombat's point of view, there's no need to implement stack traces or range information in runtime errors. CodeCombat already logs what statement it is about to do before it does it, so if it catches a runtime error, it just says, "Well, it happened on that statement." and highlights it.

If you throw runtime errors, make sure to give them fields like {kind: "TypeError", message: "TypeError: unsupported operand type(s) for +: 'int' and 'list'"}.

For try/catch handling, then, you can probably just rename try/except/finally to try/catch/finally, change the control flow a bit to handle the else and when an error of the wrong kind is inadvertently caught and needs to be rethrown (since JS can only catch all exceptions, not just certain exceptions like Python).

differentmatt commented 10 years ago

Thanks for the tips Nick. It sounds like we should focus on other pieces before addressing runtime error handling. A lightweight renaming of try/except/finally, etc. as you suggest sounds like the right first step when we get there.

differentmatt commented 10 years ago

Here is some extra test coverage I was using while checking out your changes. Feel free to use as you see fit.

  it("ascii()", function () {
    var code = "\
    return ascii(\"TEST\123Ôሴ۞𨭎\")\
    ";
    expect(util.run(code)).toBe('TESTS\\xd4\\u1234\\u06de\\U00028b4e');
  });

  it("ascii() astral plane high mid", function () {
    var code = "\
    return ascii('B\ud862AIL')\
    ";
    expect(util.run(code)).toEqual('BAIL');
  });

  it("ascii() astral plane high end", function () {
    var code = "\
    return ascii('BAIL\ud862')\
    ";
    expect(util.run(code)).toEqual('BAIL');
  });
Xavion3 commented 10 years ago

That seems correct, the second and third tests should not be tested for yet, we don't have support for directly inserting high/low surrogates because we are currently unable to tell the difference between them and ucs 2 pairs. Just test the first one, I'd also say stick to the use of escape codes as I did originally, just makes it safer across systems where utf-32 might not be supported in literals correctly for javascript.

differentmatt commented 10 years ago

That all makes sense.

Python ascii() returns a nested string: ascii('hi') equals "'hi'". Should we have the same behavior?

The remaining items seem to be:

  1. Fix repr()
  2. Fix ascii() test
Xavion3 commented 10 years ago

Oh right. The fix with repr that you proposed is fine. There will always be repr on our provided types and class should be created by us on all of their classes so they should exist. I'd say raise a warning actually so that we catch if it ever fails to have either. I'll quickly submit a fix for the repr as you suggested a patch for the string detection until we have the type system setup correctly to work to support it.

Xavion3 commented 10 years ago

Alright the newest one finally works correctly. Just feel free to pull it at any time.