zysszy / TreeGen

A Tree-Based Transformer Architecture for Code Generation. (AAAI'20)
MIT License
Do you use CSTs or ASTs? #22

Open brando90 opened 2 years ago

brando90 commented 2 years ago

From the description of terminals and non-terminals I'd assume you use CSTs, but you call them ASTs. Can you clarify this?

Concrete Syntax Tree vs Abstract Syntax Tree. The latter usually has constructors instead of terminal/non-terminals...

zysszy commented 2 years ago

We use an Abstract Syntax Tree (AST). In this tree, all leaf nodes are terminals and all other nodes are non-terminals.

brando90 commented 2 years ago

Not sure what you mean by terminals and non-terminals if you are using ASTs. My understanding is if those are symbols from the grammar then you are using a CST.

If you were using ASTs, you would only have programming language constructs at each level. For example, consider this lambda calculus expression:

λ x. x+x

That could be represented by the AST:

Abs(Var(x), Add(Var(x), Var(x)))

In this tree there are no terminals or non-terminals.

So for Python you would have constructs like If( ) else ( ), etc.
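
To make the Python case concrete, here is a minimal sketch using the standard library `ast` module. The sample source string is made up for illustration; the point is only that the parsed tree is built from language constructs such as `If`, with the `else` branch stored in a field rather than as a separate keyword token.

```python
import ast

# Hypothetical sample program, chosen only to produce an if/else node.
source = "if a:\n    b = 1\nelse:\n    b = 2\n"

module = ast.parse(source)
if_node = module.body[0]

# The node is a language construct, not a grammar symbol or keyword token.
print(type(if_node).__name__)

# The else branch lives in the `orelse` field of the same node.
print(if_node._fields)
print(ast.dump(if_node))
```

Note that keywords such as `if` and `else` and the colons do not appear as nodes, which is what distinguishes this tree from a concrete syntax tree.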

zysszy commented 2 years ago

Sorry for the late reply.

In our paper, all leaf nodes are terminals. Thus, all x in Abs(Var(x), Add(Var(x), Var(x))) are terminals and other nodes are non-terminals.
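
The convention described here can be sketched as follows. This is an illustration only, not code from the TreeGen repository: the `Node` class and the `split_labels` and `var` helpers are hypothetical names. It builds the tree Abs(Var(x), Add(Var(x), Var(x))) by hand and separates leaf labels (terminals) from internal labels (non-terminals).

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """Hypothetical labeled tree node; not TreeGen's own data structure."""
    label: str
    children: List["Node"] = field(default_factory=list)


def split_labels(node: Node) -> Tuple[List[str], List[str]]:
    """Return (terminals, non_terminals) in pre-order.

    A node with no children is treated as a terminal; every other
    node is treated as a non-terminal.
    """
    if not node.children:
        return [node.label], []
    terminals: List[str] = []
    non_terminals: List[str] = [node.label]
    for child in node.children:
        child_terms, child_non_terms = split_labels(child)
        terminals += child_terms
        non_terminals += child_non_terms
    return terminals, non_terminals


def var(name: str) -> Node:
    """Build Var(name), with the identifier as a leaf child."""
    return Node("Var", [Node(name)])


# Abs(Var(x), Add(Var(x), Var(x)))
tree = Node("Abs", [var("x"), Node("Add", [var("x"), var("x")])])
terminals, non_terminals = split_labels(tree)
print(terminals)
print(non_terminals)
```

Under this reading, "terminal" simply means "leaf of the AST" and does not refer to a token of the concrete grammar.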

brando90 commented 2 years ago

No worries. Thanks for the reply!

Out of curiosity, how did you do the CST-to-AST conversion? Python's lark parser? What parser or Python library did you use?

zysszy commented 2 years ago

We use the Python library ast to parse the source code directly into an AST.
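
A minimal sketch of what parsing with the standard library `ast` module looks like, applied to the lambda example from earlier in the thread. The `to_labeled_tree` and `leaf_labels` helpers are hypothetical, and turning field values such as identifiers into leaf nodes is an assumption made for illustration; it is not necessarily how TreeGen preprocesses its trees.

```python
import ast


def to_labeled_tree(node):
    """Convert an ast node into nested (label, children) tuples.

    AST nodes are labeled with their class name. Plain field values
    (identifiers, constants) become childless leaves. This layout is
    an illustrative assumption, not TreeGen's actual preprocessing.
    """
    if isinstance(node, ast.AST):
        children = []
        for _, value in ast.iter_fields(node):
            if isinstance(value, list):
                children.extend(to_labeled_tree(item) for item in value)
            elif value is not None:
                children.append(to_labeled_tree(value))
        return (type(node).__name__, children)
    return (str(node), [])


def leaf_labels(tree):
    """Collect the labels of all childless nodes, left to right."""
    label, children = tree
    if not children:
        return [label]
    return [leaf for child in children for leaf in leaf_labels(child)]


# The Python counterpart of the lambda calculus example: lambda x. x + x
parsed = ast.parse("lambda x: x + x", mode="eval")
tree = to_labeled_tree(parsed)
leaves = leaf_labels(tree)

print(ast.dump(parsed))
print(leaves)
```

In this sketch the identifier x shows up three times among the leaves, alongside childless AST nodes such as the operator and expression-context nodes, while constructs like Lambda and BinOp are internal nodes.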