wenyan-lang / wenyan

文言文編程語言 A programming language for the ancient Chinese.
https://wy-lang.org/
MIT License
19.59k stars, 1.1k forks

[Help wanted] How can I contribute to your project by implementing new python bindings/compilation tools? #307

Open Lotayou opened 4 years ago

Lotayou commented 4 years ago

Hi @LingDong-, I've been playing around with your wenyan-lang for quite a while. It's really a lot of fun, and coding in my mother language gives me a great sense of pride and achievement. It has also given me some insight into how programming languages and compilers are developed.

I notice your project is mainly based on JavaScript and does not contain many Pythonic features. I wonder if I could start implementing a Python compiler that is compatible with the wenyan syntax. That way I could freely experiment with some new syntactic sugar (#301, #285) and maybe introduce some useful bindings that are exclusive to Python, like pickle (#286), functools (#155), np.ndarray (#300) and such.

I read your compiler source under build/wenyan.js, and the coding style is emmm... a bit confusing for a Python user like me (since indentation is part of the syntax in Python). All I could find was a predefined pylib string containing a serialized class definition of Ctnr, which I suppose is the built-in array object? I was hoping I could do something more than that.

I know you must be busy implementing new features while applying for a Ph.D. However, I would be grateful if you could spare some time to tell me the main steps I need to follow in order to develop new Python bindings, or even implement my own Python compiler. Thanks!

ijklang commented 4 years ago

The source code of the wenyan-to-Python compiler is /src/asc2py.js. /build/wenyan.js is the compiled version of all the source code.

LingDong- commented 4 years ago

Hi @Lotayou , Thanks so much for your interest!

oovm commented 4 years ago

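@Lotayou If you want to write the parser in Python, you can try my wenyan-parser-python.

The required dependencies are:

pip install antlr4 unittest astunparse

If you use VS Code, you will need to configure the settings yourself.

If you use PyCharm/IDEA, just create the project in packages/wenyan-parser-py; everything else is already configured.

Then, if the test_ast_print function runs, you are all set.

The basic idea is to translate the wenyan AST into a Python AST, statement by statement.

If you have any other questions, you can open an issue and ask me.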
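As a rough illustration of that statement-by-statement translation, here is a minimal sketch using Python's standard `ast` module. The dict-shaped `wenyan_stmt` node and the helpers `to_python_ast` and `translate_program` are hypothetical stand-ins made up for this example; the node types actually used in wenyan-parser-python may look quite different.

```python
import ast

# Hypothetical, simplified stand-in for one wenyan AST node:
# "declare a number named `jia` with value 3". Not the real node format.
wenyan_stmt = {"kind": "declare", "name": "jia", "value": 3}


def to_python_ast(node):
    """Translate one (hypothetical) wenyan statement node into a Python AST statement."""
    if node["kind"] == "declare":
        return ast.Assign(
            targets=[ast.Name(id=node["name"], ctx=ast.Store())],
            value=ast.Constant(value=node["value"]),
        )
    raise ValueError(f"unsupported node kind: {node['kind']}")


def translate_program(nodes):
    """Translate a list of statement nodes into a compilable Python module."""
    module = ast.Module(body=[to_python_ast(n) for n in nodes], type_ignores=[])
    # Fill in the line/column information that compile() requires.
    return ast.fix_missing_locations(module)


module = translate_program([wenyan_stmt])
namespace = {}
# Compile the generated Python AST and run it in an isolated namespace.
exec(compile(module, filename="<wenyan>", mode="exec"), namespace)
print(namespace["jia"])
```

Because the output is an ordinary Python AST, it can be compiled and executed directly; on Python 3.9 and later, `ast.unparse(module)` should also turn it back into Python source text.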

Lotayou commented 4 years ago

First of all, thanks everybody for your quick reply!

@t-a-n-0 Thanks for the info. Now I see the parser file isn't as long as I expected, which helps me learn how you implemented the parser (I didn't learn about that in college, since CS was not my B.S. major :)

@LingDong- and @GalAster Thanks for the explanation. I used to think a parser just finds and replaces keyword pairs between the source and target languages. Now I know I was wrong :P

What I gather now is that, to build a parser, we first need to define a unified syntax system called an AST (abstract syntax tree), which is kinda like the Rosetta Stone for programming languages. Then we need to generate the corresponding syntax tree for a given wenyan program, so that we can translate it into Python code that evaluates back to (or is equivalent to) this syntax tree.

I'm not sure whether I need to fully understand ASTs before I can start adding new features, but I'll start by reading src/asc2py.js and WenyanLanguage and try to get the hang of this thing. Hope you guys don't mind me bothering you with a bunch of newbie questions :P

Also, @LingDong-, is it okay if I implement the wenyan-to-AST part in Python and make it a separate src/asc2py.py? I have never written Java/JavaScript before, so I'm not going to work on src/asc2py.js and mess with your workflow. Would it be too hard for you to compile the final build/wenyan.js from a Python file? If that's not an option, I think I'll build upon WenyanLanguage and leave it to @GalAster to integrate these changes into your main repo.

liaocm commented 4 years ago

@Lotayou Let me give you some context on how programming languages and compilers work, so you can contribute to this project with a better understanding. AST generation is actually part of the compilation or interpretation process, not what defines a programming language. What defines a language is its grammar. You can refer to documentation/syntax.txt for the grammar of wenyan-lang. Note that this is still WIP.

In order to turn a code snippet into a machine executable, a few steps need to happen first:

  1. lexical analysis (tokenize)
  2. parsing
  3. static semantics analysis (if any)
  4. optimization of the AST (if any)
  5. code gen
  6. optimization of the generated code (if any)
  7. profit!

Step 1 turns a string into a list of tokens, e.g. "int a = 10;" ==> ['int', 'a', '=', '10']. Step 2 turns the tokens into an AST, e.g. the result above ==> (define-assign int a 10). With this we can either continue to process the AST and interpret / compile the code, or transpile it into another language (which is what this project currently does).
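To make steps 1, 2 and 5 a bit more concrete, here is a toy sketch in Python for the single statement form used in the example. The function names `tokenize`, `parse` and `generate_python`, and the choice to emit an annotated Python assignment, are assumptions made up for illustration; this is not how wenyan-lang itself is implemented. Note that this tokenizer also emits the trailing ';' as a token, which the token list above leaves out.

```python
import re

# Token pattern for a tiny C-like subset: identifiers/keywords, integers, '=' and ';'.
TOKEN_RE = re.compile(r"\s*([A-Za-z_]\w*|\d+|=|;)")


def tokenize(source):
    """Step 1: turn a source string into a flat list of token strings."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(tokens):
    """Step 2: turn tokens of the form `<type> <name> = <int> ;` into a tuple-shaped AST."""
    if len(tokens) != 5 or tokens[2] != "=" or tokens[4] != ";":
        raise SyntaxError(f"expected '<type> <name> = <int>;', got {tokens}")
    type_name, name, _, literal, _ = tokens
    if not literal.isdigit():
        raise SyntaxError(f"expected an integer literal, got {literal!r}")
    return ("define-assign", type_name, name, int(literal))


def generate_python(tree):
    """Step 5: emit Python source for the AST (transpiling rather than compiling)."""
    kind, type_name, name, value = tree
    if kind != "define-assign":
        raise ValueError(f"unsupported AST node: {kind}")
    return f"{name}: {type_name} = {value}"


tokens = tokenize("int a = 10;")
tree = parse(tokens)
print(tokens)
print(tree)
print(generate_python(tree))
```

A real front end would support many statement forms and usually builds a tree of node objects instead of one tuple, but the division of labour between the stages is the same.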

Want to learn more about programming languages? There are tons of resources online! For example, following the reading material of a college course is a good way to learn. Here's a link to UC Berkeley's compiler course: http://inst.eecs.berkeley.edu/~cs164/sp19/

Lotayou commented 4 years ago

@liaocm Thanks for the help!