IoLanguage / io

Io programming language. Inspired by Self, Smalltalk and LISP.
http://iolanguage.org

Unexpected behavior of Object doMessage #438

Open ales-tsurko opened 4 years ago

ales-tsurko commented 4 years ago

I'm trying to implement a basic JSON parser in Io to get rid of the parson dependency. It looks like a very easy task, but I've run into a problem that just blows my mind... Here's the code:

Json := Object clone do (
    squareBrackets := Object getSlot("list")

    curlyBrackets := method(
        map := Map clone
        call message arguments foreach(arg,
            "arg is '#{arg}'" interpolate println
            pair := Sequence doMessage(arg)
            "pair is '#{pair}'" interpolate println
            map atPut(pair at(0), pair at(1)))
        map)

    Sequence : := method(
        "Sequence ':' message is '#{call message}'" interpolate println
        list(self, Json doMessage(call message last)))
)

Json clone := Json

When I use this "library" in a script:

Importer addSearchPath("io")

m := Json { "h": 1 }

I get:

arg is '"h" : 1'
Sequence ':' message is ': 1'
pair is '1'

  Exception: argument 0 to method 'atPut' must be a Sequence, not a 'Number'
  ---------
  atPut                               Json.io 10
  Json curlyBrackets                   tes.io 3
  CLI doFile                           Z_CLI.io 140
  CLI run                              IoState_runCLI() 1

As you can see, Sequence's : method is called when it should be, and the message it receives is correct, but for some reason the result is always the value of the last message in the chain. Everything I return from Sequence : is ignored. I tried returning a hard-coded value as well, but no luck. What am I doing wrong?

stevedekorte commented 4 years ago

The : is not an operator by default in Io, which means that in:

"h" : 1

1 is not an argument of the colon. Io evaluates : first, which returns the pair, and then sends the next message in the chain, 1, to that pair. That evaluates to 1, because number literals are messages with a cached result.

It can help to look at the message structure:

Io>  """Json { "h": 1 }""" asMessage code
==> Json curlyBrackets("h" : 1)
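The chain can also be inspected message by message. The following is a minimal, untested sketch that assumes ":" has not been added to the OperatorTable; it uses only Message name, next, arguments and Object doMessage, and the variable name msg is just for illustration:

```io
// Parse without evaluating; ":" is assumed NOT to be an operator here.
msg := "\"h\" : 1" asMessage

msg name println                 // first message: the string literal
msg next name println            // second message: the colon
msg next arguments size println  // the colon carries no arguments
msg next next name println       // third message: the number literal

// A literal message carries a cached result, so the receiver is irrelevant.
// This is why sending 1 to the returned pair still yields 1:
list("h", 1) doMessage("1" asMessage) println
```

Because the colon has no arguments, anything that : returns simply becomes the receiver of the trailing 1 message.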

There is an operator table which can be used to adjust the list of operators and their precedence (looks like we need to add this to the docs).

https://www.generacodice.com/en/articolo/946609/How+do+I+define+my+own+operators+in+the+Io+programming+language%3F

Here's a version using the global OperatorTable:

OperatorTable addOperator(":", 0) // might not be desired precedence

"""Json { "h": 1 }""" asMessage code println // make sure this is parsed right

Json := Object clone do (
    squareBrackets := Object getSlot("list")

    curlyBrackets := method(
        map := Map clone
        call message arguments foreach(arg,
            "arg is '#{arg}'" interpolate println
            pair := Sequence doMessage(arg)
            "pair is '#{pair}'" interpolate println
            map atPut(pair at(0), pair at(1)))
        map)

    Sequence setSlot(":", method(
        "Sequence ':' message is '#{call message}'" interpolate println
        "Sequence ':' self is '#{self}'" interpolate println
        "Sequence ':' call message arguments first is '#{call message arguments first}'" interpolate println
        list(self, Json doMessage(call message arguments first))
    ))
)

Json clone := Json

Importer addSearchPath("io")

doString("""m := Json { "h": 1 }""") // need to doString because we define : operator in the same source file

which evals as:

Json curlyBrackets("h" :(1))
arg is '"h" :(1)'
Sequence ':' message is ':(1)'
Sequence ':' self is 'h'
Sequence ':' call message arguments first is ':(1)'
pair is 'list(h, 1)'

Btw, as the OperatorTable is global, I'd suggest a JSON parse(aString) method which uses Io to get a message structure and then manipulates it and evals it for the desired result, unless you really want to mix JSON syntax with Io code.
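One way such a parse(aString) method could look is sketched below. This is an untested illustration, not the implementation discussed in the thread: JsonSketch and valueOf are hypothetical names, and it assumes ":" is not registered as an operator, so each key/value argument parses as the three-message chain key, ":", value. It does not handle negative numbers or JSON-specific string escapes.

```io
// Hypothetical JsonSketch object: walks the parsed message tree instead of
// evaluating it, so no ":" operator is ever registered.
JsonSketch := Object clone do(
    parse := method(aString,
        valueOf(aString asMessage)
    )

    // Convert one message (and, for containers, its arguments) into an Io value.
    valueOf := method(msg,
        if(msg name == "curlyBrackets",
            result := Map clone
            msg arguments foreach(arg,
                // Without a ":" operator each argument is the chain: key, ":", value.
                result atPut(arg cachedResult, valueOf(arg next next))
            )
            return result
        )
        if(msg name == "squareBrackets",
            return msg arguments map(arg, valueOf(arg))
        )
        if(msg name == "true", return true)
        if(msg name == "false", return false)
        if(msg name == "null", return nil)
        // String and number literals already carry their value.
        msg cachedResult
    )
)

parsed := JsonSketch parse("{ \"h\": 1, \"xs\": [1, 2] }")
parsed at("h") println
parsed at("xs") println
```

Since nothing is evaluated, the global OperatorTable stays untouched and the JSON text never runs as Io code.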

ales-tsurko commented 4 years ago

Thanks, Steve! I'd started my implementation using OperatorTable like you're suggesting, but then I found that I can't add an operator for a specific context/namespace; it becomes a global thing. Is it possible to make this operator visible only in the Json context? Or is there another way to do this? I'd like to replace the parser that Io currently has (based on parson) with one written in Io. Adding an operator would change the OperatorTable for the language itself, which is not good.

UPD

Btw, as the OperatorTable is global, I'd suggest a JSON parse(aString) method which uses Io to get a message structure and then manipulates it and evals it for the desired result, unless you really want to mix JSON syntax with Io code.

Hmm... I'll try it this way. It plays nice with current Sequence parseJson.

stevedekorte commented 4 years ago

It would be cool to support context-specific parsing, but I never got around to adding that, as Io's parser wasn't really designed to be general. Still, it's remarkable how well it handles turning most languages into a message tree which can then be massaged into the correct form. IIRC, Jeremy Tregunna even wrote a C parser/evaluator that did this.

ales-tsurko commented 4 years ago

Yeah, this is a really cool feature. I've seen other implementations of DSLs in Io, including this C one. This is inspirational. Adding a JSON parser in just about 20 lines of code... And this is actually without any overhead, without "parsing" in the traditional way; it's more like just defining syntactic sugar for Io itself. This feature also inspired me to add at least a basic type checker to Io. But currently I'm busy with Io 1.0.0, and a lot of work is going to be done on Eerie too, so I don't have much time for this right now...

The more I work with current implementation of Io, the less I want to rewrite it in Rust. I'd rather concentrate my efforts on improving what already exists. And providing an interface to write addons in Rust. Still thinking about that.

Back to the problem... I've just found Sandbox. If I modify the OperatorTable inside a Sandbox, will it change the OperatorTable of the current Io instance? And does this approach have any overhead?

stevedekorte commented 4 years ago

I had forgotten about Sandbox! I think that should work, but I'm not sure about the limitations on the returned references. I guess Thread could also be used in this way but you'd need to implement some communication and synchronization for it.

stevedekorte commented 4 years ago

Btw, it would be great to have an interactive debugger in situations like these. I wonder how hard it would be to get that working with MS Code.

stevedekorte commented 4 years ago

On OperatorTable, it might be nice if Compiler could have instances which each had their own operator table.

stevedekorte commented 4 years ago

Is there a repo for the Io in Rust project? I'd like to check it out.

zephyrtronium commented 4 years ago

This feature also inspired me to add at least a basic type checker to Io. But currently I'm busy with Io 1.0.0, and a lot of work is going to be done on Eerie too, so I don't have much time for this right now...

The more I work with current implementation of Io, the less I want to rewrite it in Rust. I'd rather concentrate my efforts on improving what already exists. And providing an interface to write addons in Rust. Still thinking about that.

Not to hijack the discussion here, but making : always an operator in Io 1.0.0 sounds great in general to provide a familiar syntax for type checking. Consider:

T foo := method(bar: T,
    self: T
    bar
)

Then Object method could check for next messages on the argument names, which are currently ignored, to define type information for the arguments. We can simply do

Object : := method(type,
    if(self isKindOf(type),
        self,
        Exception raise("type mismatch")
    )
)

so that self: T asserts self to be of type T dynamically, and Block call could be made to do similar checking. Then, for the JSON decoder, we can simply make a clone of Sequence that overrides : rather than redefining : on Sequence itself.

Doing this would have other implications, e.g. we might want Rational to be a Number clone so things can play nicely with either. There might also be scary consequences to redefining :. Also, it might be more useful to have : check for inclusion of a set of slots rather than a particular prototype.

stevedekorte commented 4 years ago

Originally, colon was treated as non-special character so we could have message names containing colons which would be convenient for Objective-C proxy calls. For example, we can call a method like this:

aPoint x:y:(1, 2)

which would map on the Objective-C side to the equivalent of:

[aPoint x:1 y:2]

without having to translate the method signature name. I don't think the ObjcBridge ever got much use though, so it's less of a concern now but something to consider.

ales-tsurko commented 4 years ago

@stevedekorte

I had forgotten about Sandbox! I think that should work, but I'm not sure about the limitations on the returned references. I guess Thread could also be used in this way but you'd need to implement some communication and synchronization for it.

I'll try then.

Btw, it would be great to have an interactive debugger in situations like these. I wonder how hard it would be to get that working with MS Code.

About two years ago I saw a GUI debugger in Io. Probably that was an example made by you. I couldn't get it running then because of some dependency issues.

There's a big gap in language tooling for Io. I couldn't even find a good syntax plugin for vim (I forked one and made it better, BTW: https://github.com/ales-tsurko/vim-io). It would be great to have things like linters and language servers, not only a debugger.

On OperatorTable, it might be nice if Compiler could have instances which each had their own operator table.

Maybe I'll look into it, when I have more time.

Is there a repo for the Io in Rust project? I'd like to check it out.

Oh, sorry, I was not clear. I haven't started this project yet. Just thinking about it. But it's already the third week I've been working on improvements for the current (C) implementation of Io and for Eerie. Basically, there's not much for Io: fix compilation issues, replace the JSON parser as I've mentioned here, configure CI (including Windows), maybe configure CD too. For Eerie there's a whole new version: full refactoring, tests, a lot of fixes, getting rid of environments and replacing them with "in-project dependencies" (similar to node_modules with NPM or target with Cargo), and more. I'll open a PR when it's done.


@zephyrtronium

I was thinking about something similar for types syntax! But I'd put it in a namespace instead of reserving anything. So it would be possible to enable typing just for some parts or enable/disable it at runtime. Also, I'd make it a library.

There might also be scary consequences to redefining :.

Such consequences can happen with anything in Io :smile:

ales-tsurko commented 4 years ago

Just tested the OperatorTable version you posted above.

I didn't explain how my code is organized. I have an io directory inside my project directory. The first code block in my initial message here is inside io/Json.io. And the second block is my test file which is at ./test.io (that's why I have an Importer addSearchPath("io")). So Json is supposed to be imported.

I've replaced the content of io/Json.io with the code you suggested. And replaced the code of ./test.io to:

Importer addSearchPath("io")

OperatorTable println # < - note this
Json OperatorTable println # and this

m := Json doString("""{ "h": 1 }""")

The part of the output is:

OperatorTable_0x55a2a31dad30:
Operators
  0   ? @ @@
  1   **
  2   % * /
  3   + -
  4   << >>
  5   < <= > >=
  6   != ==
  7   &
  8   ^
  9   |
  10  && and
  11  or ||
  12  ..
  13  %= &= *= += -= /= <<= >>= ^= |=
  14  return

Assign Operators
  ::= newSlot
  :=  setSlot
  =   updateSlot

To add a new operator: OperatorTable addOperator("+", 4) and implement the + message.
To add a new assign operator: OperatorTable addAssignOperator("=", "updateSlot") and implement the updateSlot message.

Json curlyBrackets("h" :(1))
OperatorTable_0x55a2a31dad30:
Operators
  0   : ? @ @@
  1   **
  2   % * /
  3   + -
  4   << >>
  5   < <= > >=
  6   != ==
  7   &
  8   ^
  9   |
  10  && and
  11  or ||
  12  ..
  13  %= &= *= += -= /= <<= >>= ^= |=
  14  return

So it looks like defining the operator this way doesn't actually touch the global context! So that's it. I just need to hide this doString to make it possible to use this as a DSL in the context of the Json object.

UPD

Ah, no. That's because Json hadn't been loaded yet when the first OperatorTable println ran. The operator is global.

stevedekorte commented 4 years ago

Might not be too hard to have the OperatorTable slot looked up via the context in which the String is parsed. IIRC, there is a context other than IoState, but I could be wrong.

zephyrtronium commented 4 years ago

I just remembered that messages already check their own OperatorTable slot and only default to the global one if a local one isn't found. Usually it would find the one on the Message proto, which is assigned by the OperatorTable initialization code. (At least, this is what my Go implementation does; presumably I was stealing, er, reading the original implementation while writing it.)

stevedekorte commented 4 years ago

Would be neat to expose the Lexer within Io, so aspects of token definitions could be modified. Just a little bit of control might go a long way. I guess we could just implement an Io lexer in Io for this as it wouldn't be much code.

ales-tsurko commented 4 years ago

Isn't it already possible with Compiler? Or do you mean modifying/adding the tokens themselves before parsing (like adding new types of tokens to the lexer)? Something like OperatorTable but for tokens?

stevedekorte commented 4 years ago

Isn't it already possible with Compiler? Or do you mean modifying/adding the tokens themselves before parsing (like adding new types of tokens to the lexer)? Something like OperatorTable but for tokens?

Yes, I meant being able to inspect and change the grammar (or some instance of the grammar) http://iolanguage.com/guide/guide.html#Appendix-Grammar

ales-tsurko commented 4 years ago

In the existing infra it seems like a lost feature :smile: I can only imagine how that would allow libraries for the syntax itself. Plugins for the programming language... Yeah, there are already things like JSX, but being able to do such things at runtime in such a simple manner is something new to me.