justinlubin / cobbler

Refactor programs to use library functions!

OCaml IIR type definition #10

Closed jeremyferguson closed 1 year ago

jeremyferguson commented 1 year ago

In this PR, the AST type for the IIR is defined in OCaml in lang.ml, replacing the pyre-ast type previously used. The PR also contains stubs for some basic functions for interacting with the ASTs and modifies bin/main.ml and the tests accordingly. Finally, there is a basic proof-of-concept implementation that parses the source program in Python and translates it into a subset of the IIR, written as an s-expression.
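To make the proof-of-concept idea concrete, here is a minimal sketch of turning Python source into an s-expression with the standard `ast` module. The constructor names (`Assign`, `Name`, `Num`, `Call`) and the function `to_sexp` are assumptions chosen for illustration; they are not necessarily the names or the shape used in lang.ml or in the PR's actual translator.

```python
import ast

def to_sexp(node):
    """Render a tiny subset of Python's AST as an s-expression string.

    The constructor names below are hypothetical, not the project's real IIR.
    """
    if isinstance(node, ast.Module):
        return "(" + " ".join(to_sexp(s) for s in node.body) + ")"
    if isinstance(node, ast.Assign) and len(node.targets) == 1:
        return f"(Assign {to_sexp(node.targets[0])} {to_sexp(node.value)})"
    if isinstance(node, ast.Name):
        return f"(Name {node.id})"
    if (isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)):
        return f"(Num {node.value})"
    if isinstance(node, ast.Call):
        args = " ".join(to_sexp(a) for a in node.args)
        return f"(Call {to_sexp(node.func)} ({args}))"
    # Anything outside the supported subset is rejected explicitly.
    raise NotImplementedError(type(node).__name__)

sexp = to_sexp(ast.parse("out = zeros(n)"))
print(sexp)
```

Raising on unsupported nodes, rather than silently skipping them, keeps the subset boundary visible while the IIR is still being designed.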

jeremyferguson commented 1 year ago

On your notes on binary operators, Index and Call, Assign and For: makes sense, I'll make those changes.

For env: that makes sense. I guess I was considering function overloading, where you can have 2 functions with the same name but a different number of arguments, but I forgot that Python doesn't support that.

For ast: I wanted to be able to collect all of the environments into a top-level structure, so that you had one program represented as a list of different environments, with each one being a function, and one environment for __main__ to represent the code outside of a function. I'm open to changing that, though. How do you envision representing an entire program?

jeremyferguson commented 1 year ago

Oh also, on the For and Assign being an expr: I was going off of the Python ast documentation where it says it can be an expr, but I can definitely see why we would want to restrict it to id.

justinlubin commented 1 year ago

> On your notes on binary operators, Index and Call, Assign and For: makes sense, I'll make those changes

Great, thanks!

> For env: that makes sense, I guess I was considering function overloading where you can have 2 functions with the same name but a different number of arguments, but I forgot that Python doesn't support that.

Ahh, that makes sense! But, yeah, I don't think we need to handle that.

> For ast: I wanted to be able to collect all of the environments into a top-level structure, so that you had one program represented as a list of different environments, with each one being a function, and one environment for __main__ to represent the code outside of a function. I'm open to changing that, though. How do you envision representing an entire program?

If I understand correctly, each env is a mapping from identifiers to function bodies (and their parameter lists). So each function definition would constitute a mapping in a single env, and a program would be something like env * block, where the block is the part of the program inside __main__.
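As a rough illustration of the env * block idea on the Python side, the sketch below splits a module into a dictionary of function definitions and a list of the remaining top-level statements. The name `split_program` and the tuple layout `(params, body)` are assumptions made for this example, not the PR's actual representation.

```python
import ast

def split_program(source):
    """Return (env, block).

    env maps each function name to (parameter names, body statements);
    block holds the top-level statements outside any function.
    """
    env, block = {}, []
    for stmt in ast.parse(source).body:
        if isinstance(stmt, ast.FunctionDef):
            params = [a.arg for a in stmt.args.args]
            # A later definition with the same name replaces the earlier one,
            # mirroring Python's lack of overloading.
            env[stmt.name] = (params, stmt.body)
        else:
            block.append(stmt)
    return env, block

source = """
def double(x):
    return 2 * x

y = double(3)
"""
env, block = split_program(source)
print(list(env), len(block))
```

Because the env is keyed only by name, a single mapping suffices; there is no need for one environment per function.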

> Oh also, on the For and Assign being an expr: I was going off of the Python ast documentation where it says it can be an expr, but I can definitely see why we would want to restrict it to id.

Awesome, thank you! Yeah, id is a simplification of reality (it could be something like x, y = 1, 2), but I think according to the Python syntax specification (under for_stmt:), it's still not a full expression, because you couldn't have something like x + y = 3.
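The point about assignment targets can be checked directly with Python's `ast` module. This small sketch (the helper name `target_kind` is made up for illustration) reports what kind of node the parser produces for the target, or that the statement is rejected outright. It assumes each input is a single assignment statement.

```python
import ast

def target_kind(stmt_source):
    """Return the node type name of the first assignment target,
    or 'SyntaxError' if Python refuses to parse the statement."""
    try:
        stmt = ast.parse(stmt_source).body[0]
    except SyntaxError:
        return "SyntaxError"
    return type(stmt.targets[0]).__name__

kinds = [target_kind(s) for s in ("x = 1", "x, y = 1, 2", "x + y = 3")]
print(kinds)
```

A plain identifier parses as a `Name` target and tuple unpacking as a `Tuple` target, while an arbitrary expression such as `x + y` is rejected before any AST is produced.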

I'll take a look at the changes in the next comment!

jeremyferguson commented 1 year ago

Oh, another thing. In the types of programs we're looking at, it's common to have a line like out = np.zeros(n). Do we want to keep that syntax when we parse it, or should there be some list type that we could pre-process it into when we are doing the initial parsing in Python? Do you think this would be useful for our purposes?

justinlubin commented 1 year ago

> Oh, another thing. In the types of programs we're looking at, it's common to have a line like out = np.zeros(n). Do we want to keep that syntax when we parse it, or should there be some list type that we could pre-process it into when we are doing the initial parsing in Python? Do you think this would be useful for our purposes?

Ah, hm, good question. I'm not sure if it will be helpful to have a special AST node for numpy function calls or not. I think we could go either way, but perhaps for simplicity we can just replace np.zeros with zeros in our examples and worry about handling np calls later?
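If the replacement of np.zeros with zeros were ever automated instead of done by hand in the examples, one possible approach is an `ast.NodeTransformer` pass during the initial Python parsing. This is only a sketch: the class name `StripNumpyPrefix` is hypothetical, it assumes numpy is imported under the alias np, and `ast.unparse` requires Python 3.9 or later.

```python
import ast

class StripNumpyPrefix(ast.NodeTransformer):
    """Rewrite attribute accesses such as np.zeros into the bare name zeros."""

    def visit_Attribute(self, node):
        self.generic_visit(node)  # handle nested attributes first
        if isinstance(node.value, ast.Name) and node.value.id == "np":
            # Keep the original context (load/store) and source location.
            return ast.copy_location(ast.Name(id=node.attr, ctx=node.ctx), node)
        return node

tree = StripNumpyPrefix().visit(ast.parse("out = np.zeros(n)"))
rewritten = ast.unparse(ast.fix_missing_locations(tree))
print(rewritten)
```

This keeps the IIR free of a special node for numpy calls, at the cost of losing the information that the function came from numpy.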

Thanks so much for the lengthy discussion on the lang types, by the way! I just want to make sure we get these types down really solid since they're going to be such a central component of the project.

jeremyferguson commented 1 year ago

Sounds good! I've implemented those things, and all checks are passing.