sindresorhus / project-ideas

Need a JavaScript module or looking for ideas? Welcome ✨

Bash AST parser #61

Open dthree opened 8 years ago

dthree commented 8 years ago

I'm looking for someone knowledgeable in PEGs (parsing expression grammars, used for DSL parsers), and specifically PEG.js, who would like to help finish a project.

js-shell-parse is a dropped project that is almost complete, which parses bash into an AST. I am trying to fork and complete this project, which would be the foundation of implementing a full cross-platform bash interpreter into Vorpal and Cash.

Doing this would allow you to build cross-platform, interactive CLIs with full bash support (all redirections, substitutions, control words, etc). If this is accomplished, it is very likely that Cash (and the module being discussed) will ship with NPM in the future as the package script interpreter.

Any PEG masters out there interested?

sindresorhus commented 8 years ago

// @wooorm @qix-

iiison commented 8 years ago

I've not worked on PEG.js, but up for it... Can I help...

dthree commented 8 years ago

Sure, as long as you can take the time to make sense of this:

https://github.com/grncdr/js-shell-parse/blob/master/grammar.pegjs

Parsing DSLs is not easy and takes a lot of study, but if you're up to it I would love your help.

dstack commented 8 years ago

willing to take a look, have worked with PEG.js in the past, parsed several languages. The real question is what is missing?

dthree commented 8 years ago

Will compile a list shortly.

iiison commented 8 years ago

@dthree will take a look.

dthree commented 8 years ago

:+1:

forivall commented 8 years ago

I'll also be taking a look; I worked with flex / bison back in uni, and I'd love to delve into PEG.js.

parro-it commented 8 years ago

I was working on a project similar to cash one year ago.

I used Jacob for the grammar stuff. I started trying to make sense of an existing grammar, but later I decided to start from scratch because it was too much stuff to make sense of...

Later on I abandoned the project because it was too big to do alone. I'll be more than happy to help you :smiley_cat: (but I'm far from being an expert in grammars).

As far as I remember, the project was able to parse some basic commands and operators:

| || && > >>

If you think it could be useful, I can move the source to GH. It is actually on a private repo on bitbucket.
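
To illustrate what recognizing the operators listed above involves, here is a minimal sketch of a tokenizer. It is not taken from any of the projects in this thread: `tokenize` and the token shapes are hypothetical, and quoting, escapes, and every other operator are ignored. The one point it shows is that longer operators must be tried first, so that `||` is not read as two `|` tokens.

```javascript
// Hypothetical helper, not part of js-shell-parse, bash-parser, or Cash.
// Longer operators come first so that '||' wins over '|' and '>>' over '>'.
const OPERATORS = ['||', '&&', '>>', '|', '>'];

function isBlank(ch) {
  return ch === ' ' || ch === '\t';
}

// Split a command line into word and operator tokens.
// Assumption: no quoting or escaping, which a real shell lexer must handle.
function tokenize(input) {
  const tokens = [];
  let i = 0;
  while (i < input.length) {
    if (isBlank(input[i])) {
      i++;
      continue;
    }
    const op = OPERATORS.find((o) => input.startsWith(o, i));
    if (op) {
      tokens.push({ type: 'op', value: op });
      i += op.length;
      continue;
    }
    // Consume a word: everything up to the next blank or operator.
    let j = i;
    while (
      j < input.length &&
      !isBlank(input[j]) &&
      !OPERATORS.some((o) => input.startsWith(o, j))
    ) {
      j++;
    }
    tokens.push({ type: 'word', value: input.slice(i, j) });
    i = j;
  }
  return tokens;
}
```

For example, `tokenize('ls | grep js >> out.txt')` yields the words `ls`, `grep`, `js`, and `out.txt` with the operators `|` and `>>` between them.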

dthree commented 8 years ago

@parro-it nice!

js-shell-parse is actually really close, so I think the best efforts would be to focus on wrapping it up.

parro-it commented 8 years ago

I agree with you @dthree. I didn't understand whether you plan to contribute to the js-shell-parse repo or have created a new one from that code. Did you try to contact @grncdr?

Qix- commented 8 years ago

I can help, but I'm packing for my move today.

Or should I say, packratting (I'll see myself out...)

dthree commented 8 years ago

@parro-it I tried, and it's radio silence. Issues have also been filed over time, with no response in two years. I will probably fork and keep him in the license.

dthree commented 8 years ago

@Qix- nice - maybe after move? Where you going? :)

Qix- commented 8 years ago

Sure. And I move to San Fran tomorrow.

parro-it commented 8 years ago

@dthree I think that's fair, I also saw the issues... I'm looking at the grammar now; it's not simple, but it looks manageable.

dthree commented 8 years ago

@Qix- sweet!

@parro-it :+1:

grncdr commented 8 years ago

cool, so I'm super stoked if somebody wants to fork & finish js-shell-parse, but I should warn you that the operational semantics of posix shells (or indeed most streaming input) does not mesh super well with PEG. There's a hack for this in https://github.com/grncdr/js-shell-frontend where it looks at the PEG.js syntax errors and figures out if you can continue parsing with more input.
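
A rough sketch of that kind of hack, not the actual js-shell-frontend code: parsers generated by PEG.js throw a `SyntaxError` whose `found` property is `null` when the failure happened at the end of the input, which is a reasonable signal that more input might complete the command. `classify`, `toyParse`, and `toySyntaxError` are hypothetical names; `toyParse` only stands in for a generated parser so the example runs on its own.

```javascript
// Classify input as complete, incomplete (ask for another line), or invalid.
// `parse` is any function that throws PEG.js-style errors.
function classify(parse, input) {
  try {
    return { status: 'complete', ast: parse(input) };
  } catch (err) {
    // Assumption: an error at end of input means "keep reading".
    if (err && err.name === 'SyntaxError' && err.found === null) {
      return { status: 'incomplete' };
    }
    return { status: 'invalid', error: err };
  }
}

// Build an error with the same shape a PEG.js-generated parser produces.
function toySyntaxError(found, offset) {
  const err = new Error(
    found === null ? 'Unexpected end of input' : `Unexpected "${found}"`
  );
  err.name = 'SyntaxError';
  err.found = found;
  err.location = { start: { offset } };
  return err;
}

// Hypothetical stand-in for a generated parser: it requires single quotes
// to be closed and rejects a ')' outside of quotes.
function toyParse(input) {
  let inQuote = false;
  for (let i = 0; i < input.length; i++) {
    const ch = input[i];
    if (ch === "'") inQuote = !inQuote;
    else if (ch === ')' && !inQuote) throw toySyntaxError(ch, i);
  }
  if (inQuote) throw toySyntaxError(null, input.length);
  return { type: 'command', source: input };
}
```

With this, `classify(toyParse, "echo 'hi")` reports `incomplete`, so an interactive front end could prompt for a continuation line and retry with the concatenated input. The cost is reparsing from the start on every new line.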

TBH if I were to start again, I would look at doing this with a streaming parser-combinator library. PEG is super nice for relatively simple languages where you can expect to parse all input at once, but this is not how shells work. The semantics and syntax of POSIX shells are deeply intertwined, so spend some time reading the POSIX specs before you commit to resurrecting js-shell-parse.
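
For contrast with the all-at-once model, here is a minimal sketch of a character-at-a-time reader that keeps its state between chunks. It is not a parser combinator and not taken from any project mentioned here; `createLineReader` and its methods are hypothetical, and it only tracks quotes and backslash escapes to decide when a newline really ends a command.

```javascript
// Hypothetical push-style reader: feed it chunks as they arrive and it calls
// onCommand whenever a newline ends a command outside of quotes.
function createLineReader(onCommand) {
  let buffer = '';
  let quote = null; // the quote character currently open, if any
  let escaped = false; // true when the previous character was a backslash

  return {
    write(chunk) {
      for (const ch of chunk) {
        if (escaped) {
          // Escaped character (including newline) is kept verbatim.
          buffer += ch;
          escaped = false;
        } else if (ch === '\\' && quote !== "'") {
          // Backslash is literal inside single quotes, an escape elsewhere.
          buffer += ch;
          escaped = true;
        } else if (quote) {
          if (ch === quote) quote = null;
          buffer += ch;
        } else if (ch === "'" || ch === '"') {
          quote = ch;
          buffer += ch;
        } else if (ch === '\n') {
          if (buffer.trim()) onCommand(buffer);
          buffer = '';
        } else {
          buffer += ch;
        }
      }
    },
    // True while a command is only partially read.
    pending() {
      return buffer.length > 0 || quote !== null;
    },
  };
}
```

Because nothing is reparsed, input can be split at any point, even in the middle of a quoted string, and the reader simply resumes on the next `write`.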

Some of these things might be non-issues if your goal is just to execute npm scripts; if that's all you want, js-shell-parse is probably good enough already. However, if you want to do interactive shells, you're also going to need to drop down into C for job control and for properly managing the TTY. I had a PoC interactive shell based on shell-frontend at some point, but I can't seem to find the repo for it now. In any case, a lot of things didn't really work right, and I ran out of enthusiasm for the project. If you want to know more, just ask here.

dthree commented 8 years ago

@grncdr you're alive! :tada:


Thanks for your advice and understood on all of it.

There's a lot right about js-shell-parse as well. The current priority on support is things like redirection, expansions, variables and basic flow control. You seem to have these things pretty taped.

From what I can tell, what you're having trouble with is more advanced flow control (functions, if / else, loops, etc.). Am I correct? These things are less a priority at the moment, as, like you said, the main public for this is a. package scripts, and b. an interactive shell (single liners).


It would be amazing if you could possibly do a little turn-over on the repo, perhaps by listing what types of things are implemented and what was giving you unsolvable trouble. You're loaded with experience and this would save a lot of time!

grncdr commented 8 years ago

> From what I can tell, what you're having trouble with is more advanced flow control (functions, if / else, loops, etc.). Am I correct? These things are less a priority at the moment, as, like you said, the main public for this is a. package scripts, and b. an interactive shell (single liners).

Not exactly. Most of the parsing problems are relatively straightforward, and while I'm sure there are bugs or missing things, most of them should be pretty easy to add to the parser (e.g. the requested >| redirection). Where things really got stuck was implementing an interactive shell properly: most of the POSIX spec is written in terms of a streaming, character-based parser.

> It would be amazing if you could possibly do a little turn-over on the repo, perhaps by listing what types of things are implemented and what was giving you unsolvable trouble. You're loaded with experience and this would save a lot of time!

Would you be up for setting up & recording a google hangout or skype call this weekend? I'm free Sunday, and that would give me some time to look over the grammar & code again. Recording it means that you have something a bit more concrete to refer back to, but I won't need to spend quite so much time as I would if I were to write something.

gabrielcsapo commented 7 years ago

Has anyone gotten to a point of generating an AST from a bash script and then being able to generate code from that AST?

Qix- commented 7 years ago

Oh boy this got lost by the wayside. Really wishing I had @remindme right now.

gabrielcsapo commented 7 years ago

@Qix- I built https://github.com/gabrielcsapo/shell-p to try to get introspection on the running time of various parts of a shell script, but js-shell-parse and bash-parser seem to have issues with 1. parsing functions and 2. generating code from an AST.

parro-it commented 7 years ago

@gabrielcsapo I answered your issue on the bash-parser repo.

Regarding 1., if you need to parse Bash-specific function syntax, there is a task list issue keeping track of the Bash-specific syntax still to implement. I could prioritize the function syntax implementation if you need it.

parro-it commented 7 years ago

@Qix- we discussed the status of the project with @dthree here if you are interested :smile_cat: The project is really difficult and we need more help !!

gabrielcsapo commented 7 years ago

@parro-it that would be awesome if you could prioritize that! I will take a look at the issue in the morning and work on generating code from the AST generated by bash-parser

mvdan commented 6 years ago

I'm a bit late to the party, but I built this very thing a couple of years ago: https://github.com/mvdan/sh

It contains a parser (code to AST), a printer (AST to code), and even an experimental interpreter to run an AST. All of those, with full support for POSIX Shell, Bash and mksh. It's been battle-tested with lots of code and edge cases for over two years, so I'd be surprised if you could find any real code to make it fail :)

I know it's written in Go, but transpiling Go to JS has been done before: https://github.com/gopherjs/gopherjs

It also seems like the language will get wasm support soon, so that might make things even easier: https://github.com/golang/go/issues/18892

If any of you would like to quickly play with it, see shfmt -tojson some-file.sh.

stefanmaric commented 6 years ago

> I'm a bit late to the party, but I built this very thing a couple of years ago: https://github.com/mvdan/sh

Random, off-topic anecdote: I got a notification for this comment about 1 hour after I starred that repo while I was looking for a bash formatter for a Go version manager - small world. :smile:

parro-it commented 6 years ago

It happens that I gave you a star a long time ago... I guess I didn't consider using it because it's Go. Anyway, I am really curious to see how you solved the problems I encountered developing bash-parser. Did you follow the POSIX standard in a strict way?

parro-it commented 6 years ago

BTW, for anyone interested, bash-parser is here: https://github.com/vorpaljs/bash-parser But I fear you can find MANY bugs there...

mvdan commented 6 years ago

> Did you follow the POSIX standard in a strict way?

Yes. See the caveats section of the README for the few tradeoffs I had to make. Otherwise, you can assume that POSIX and Bash are both fully supported.

I'll try to have a simple javascript module published by this weekend, with a subset of the API exposed to do the basic stuff.

mvdan commented 6 years ago

I believe I have a working version of the parser in a JS module: https://www.npmjs.com/package/mvdan-sh

This is the glue code and the package.json: https://github.com/mvdan/sh/tree/master/_js

It should work, as the testmain.js file in there works with the index.js that is packaged in the module. However, I have no idea how to actually use a JS module, so I'm not 100% sure that the module will just work. Please give it a go and let me know.

I'll add more parts of the Go library to the JS module soon, as well as better docs :)

mvdan commented 6 years ago

I have been working on this over the last couple of weeks. Now the JS package is at a stage where it's useful. You can get the parse tree, print it out for debugging purposes, walk it, modify it, and convert it into source code again.

See the README on the npm package link I pasted above; it contains a sample JS program that shows all of the above. Hopefully that is enough to get people started.

If you find any bugs, or any features are missing, please raise issues on the mvdan/sh repository. Thanks!
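
As a rough illustration of the round trip described above (take a tree, walk it, modify it, print it back as source), here is a self-contained toy. It deliberately does not use the mvdan-sh API, which is documented in the package README. The node types, the `walk` and `print` helpers, and the hand-built tree are all hypothetical.

```javascript
// Toy AST for `echo 'foo' | wc -c`. The node shapes are made up for this
// example and do not match mvdan-sh, bash-parser, or js-shell-parse.
const tree = {
  type: 'Pipeline',
  commands: [
    {
      type: 'Command',
      words: [
        { type: 'Word', value: 'echo' },
        { type: 'SglQuoted', value: 'foo' },
      ],
    },
    {
      type: 'Command',
      words: [
        { type: 'Word', value: 'wc' },
        { type: 'Word', value: '-c' },
      ],
    },
  ],
};

// Visit every node depth-first, parents before children.
function walk(node, visit) {
  visit(node);
  for (const child of node.commands || node.words || []) {
    walk(child, visit);
  }
}

// Turn the tree back into shell source.
function print(node) {
  switch (node.type) {
    case 'Pipeline':
      return node.commands.map(print).join(' | ');
    case 'Command':
      return node.words.map(print).join(' ');
    case 'SglQuoted':
      return `'${node.value}'`;
    case 'Word':
      return node.value;
    default:
      throw new Error(`Unknown node type: ${node.type}`);
  }
}

// Replace every single-quoted string, then print the modified program.
walk(tree, (node) => {
  if (node.type === 'SglQuoted') node.value = 'bar';
});
console.log(print(tree)); // echo 'bar' | wc -c
```

A real printer also has to deal with quoting rules, redirections, and formatting choices.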