Initial work on document formatting support for PICO-8 Lua files

beetrootpaul commented 1 year ago

Based on top of cherry-picked 6195b55f133759ed13212047ff478f2bd3b2a806

In context of #26

beetrootpaul commented 1 year ago

It works in general, but there are some bugs to be fixed and features missing. See https://github.com/beetrootpaul/pico8-ls/blob/17e3ce399720b1bbb696ab4b9434468dc850bfcb/server/src/server.ts#L684-L695 for more info

beetrootpaul commented 1 year ago

(answering https://github.com/japhib/pico8-ls/issues/26#issuecomment-1281536879 and the next one there)

I'm not sure how best to commit exactly to your branch, since your branch is a fork of mine, so for now I figure we can just cherry-pick commits back and forth as needed.

I also have no idea how to collaborate effectively across 2 repos, so yes, this makes sense 😄

Here's the commit I made today: https://github.com/japhib/pico8-ls/commit/3aae15c8c45dd0a7e4ee3c1080850c62d3bdf8ca

great, I will take a look soon! 🙂

If you want the same behavior you can install the ESLint VSCode extension, and it should format all the files you work on according to the formatting rules defined for this project in .eslintrc.js.

will do 👍 (although in WebStorm, but yes, can attach ESLint as well)

Here's my progress: (…)

great! 🎉

Still needs work: (…)

thx for info, I will take a look as well 👍

Btw, you don't have to ping me on each comment

thx for info 🙂 So many platforms, so many different logic regarding notifying 😄

I also eventually want the formatter to be smart about preserving the original author's style.

Up to you OFC and in general I think same, but let me share my 2 cents 🙂

If it is possible to not use formatter by default (I think it is, I didn't have it run automatically on save by default or something like that), then I would opt for a release which has a minimal set of features to give any added value for some people. Then improve later to make it better for more people. In other words: let's force always spaces around operators or always no spaces around them, then improve to give a choice.

I think I might be less inclined for smart features, because I got used to Prettier in JS world. Before Prettier I was 100% for customisation, then Prettier got hyped and included in projects I was working on, then I finally got used to the fact that I have little control over style and the main value is in having it consistent.

On the other hand… PICO-8 has limit of characters. If not that, I would for example opt for multiline table syntax as implemented first (and only?), but I know from my own experience such code style hits char limits. In the end I use Lua minified in my own PICO-8 games which allows me to write code which takes a lot of space, both vertically and horizontally. But it would be not nice to expect other users of pico8-ls to minimise their code as well.

OK, that was me sharing thoughts about level of strictness vs flexibility for code formatting 😄 In the end I think I agree with you, but also am aware that the lower the scope of something releasable, the easier it is for me to maintain motivation 😅

beetrootpaul commented 1 year ago

I just treated my entire project with current formatting, it's available on a commit https://github.com/beetrootpaul/dart-07/commit/cb810b089279028ebbc1807e9edda07ce1b6388e .

Some examples of errors I found:

Missing parentheses: (_vs - 29 * 2) / 2 👉 _vs - 29 * 2 / 2
Missing parentheses: (selected_mission - 1) % 4 👉 selected_mission - 1 % 4
Missing "local": local function draw_controls(base_x, base_y) within another function got formatted into function draw_controls(base_x, base_y)
Missing parentheses: (mission_number - 1) * 16 👉 mission_number - 1 * 16
Missing parentheses: withinin table literal property assignment xy = start_xy.plus(\nmagnitude * (rnd() - .5),\nmagnitude * (rnd() - .5)\n), 👉 xy = start_xy.plus(magnitude * rnd() - .5, magnitude * rnd() - .5),
Missing parentheses, case of calling a function chosen with or: (on_started or _noop)() 👉 on_started or _noop()
Missing parentheses: invincible_after_damage_timer and invincible_after_damage_timer.ttl % (2 * invincibility_flash_duration) < invincibility_flash_duration 👉 invincible_after_damage_timer and invincible_after_damage_timer.ttl % 2 * invincibility_flash_duration < invincibility_flash_duration
Missing parentheses: ceil((r_max + 4 * r_step) / speed) 👉 ceil(r_max + 4 * r_step / speed)
Missing parentheses: 1 - (t-1)^2 👉 1 - t - 1 ^ 2
Missing parentheses, case of accessing a property of … and … or … result: (game.health >= segment and health_bar_segment_full or health_bar_segment_empty)._draw(xy.minus(0, 4 + segment * 6)) 👉 game.health >= segment and health_bar_segment_full or health_bar_segment_empty._draw(xy.minus(0, 4 + segment * 6))

ℹ️ I want to try fix some of those calculations before you do @japhib , since it will give me opportunity to learn codebase better 😄

Other than that there were some issues with comments' placing we are aware of. Some examples on screenshots below:

One annoying (but expected) thing was that when I had a function called with 4 args, each one in a separate line because of how long it was, it got joined into a single super long line. The more interesting variant of it was when callback function got preserved multiline, but args got joined after them 😄

See screenshots below:

beetrootpaul commented 1 year ago

and here is a commit which fixes my game's codebase for me: https://github.com/beetrootpaul/dart-07/commit/af490aa7f830461ab038eb312c0532a722f1c799 (game still does not work, since wrong calculations are there due to missing parentheses, but this set of changes makes luamin not crash and makes me able to run a cart )

beetrootpaul commented 1 year ago

I added some failing tests for errors I found

beetrootpaul commented 1 year ago

and fixed formatting of local functions

beetrootpaul commented 1 year ago

FYI I am implementing a fix for applying parentheses in expressions like 1 - (t - 2) ^ 3

japhib commented 1 year ago

What is your approach for applying parentheses? Are you just going to add logic so the parser remembers which expressions have parentheses around them? That's probably the most straightforward way to do it.

beetrootpaul commented 1 year ago

I am implementing it based on what I am used to from other formatters, which is: formatter knows better where parentheses are required and where are not. Therefore my work-in-progress fix so far bases on operator precedence:

Let me know if you see this path as not the best one 😄

BTW I pushed the wip commit to GitHub right now, but please be aware I will most likely amend it later (in other words: do not rely on referring to it by its commit SHA)

beetrootpaul commented 1 year ago

BTW I thought that it might be a good idea to have one extra test which operates on an assumption that AST of formatted code should be same as before formatting. I know it would duplicate a lot of other unit tests, so my motivation is only to have it just expressed in form of just a single test. From a very high level integration test level point of view.

Something like:

take very very very complex PICO-8 Lua file fixture
parse it with a same function server uses, to obtain 1st AST which we will use later for assertions
format the file with same function server uses
now parse the formatted file to obtain 2nd AST
assert 1st and 2nd ASTs are same except loc fields (those might change and it is OK)
format again
parse again to obtain 3rd AST
assert 2nd and 3rd ASTs are exactly same (including loc fields)

And, if introducing at some point a separation of AST (Abstract Syntax Tree) from a FST (Full Syntax Tree), assertions would differ in a way that:

all 3 AST should be exactly same
only 2nd and 3rd FST should be exactly same

Even one more step further: have a lot of test Lua files for that test (added over time) and have this test in form of "for each file from dir". So it would be kind of a final safety net to catch that "something is wrong, it's not as useful as unit tests since assertion is super broad, but we have to dive deeper now and understand what is happening and write that potentially missing one more unit test to cover the edge case".

WDYT @japhib ?

beetrootpaul commented 1 year ago

And for a moment back to parentheses: do I understand correctly that what you are suggesting is to for example pass an extra "isSurroundedWithParentheses = true" param to this.parseExpectedExpression(flowContext) in this place in the code? 👉 https://github.com/japhib/pico8-ls/blob/3aae15c8c45dd0a7e4ee3c1080850c62d3bdf8ca/server/src/parser/parser.ts#L1243-L1245

beetrootpaul commented 1 year ago

I just tested adding an extra boolean about parentheses to parsed BinaryExpression and using that boolean in formatter's visitor for BinaryExpression, my related test pass green 👍
I see you already have some operator priority related code in parser, so maybe I could reuse it if going forward with "let formatter decide" solution: https://github.com/japhib/pico8-ls/blob/a9bd1e3456a163ab86c356d4cb13d454b6f613a9/server/src/parser/parser.ts#L1086-L1118

So my main question to you @japhib is whether you prefer us to follow a path where:

we let formatter decide where to put parentheses (and then if you see it viable to externalise and reuse binaryPrecedence function for formatter to know where to put parentheses), or
we preserve in AST info about parentheses put by user, and formatter just follows that info.

japhib commented 1 year ago

Implementing it based on operator precedence

That's great, definitely a smarter approach. However, rather than hard-coding all the operators inside visitBinaryExpression, you should use the precedence function that already exists in parser.ts: binaryPrecedence(operator: string): number. You can pass in the current operator and the parent operator and then compare the numbers it returns to see which one is higher, and therefore whether parentheses are needed.

Right now, binaryPrecedence() is an instance method on the Parser class, but Formatter doesn't have a reference to a parser. However, that method doesn't need a reference to any of the internal state of the Parser. So I think you'd be safe to pull it out of the class, to be a standalone function.

I prefer this method rather than doing the other thing I said with adding an extra "isSurroundedWithParentheses = true" param.

Extra test comparing the AST's

That seems great! I doubt we'll implement an FST ever (I'm not sure exactly what that would entail anyways), but just testing it on the AST's for now seems great. You can probably use the deepEquals() function inside of test-utils.ts for the comparison.

beetrootpaul commented 1 year ago

you should use the precedence function that already exists

will do exactly that! 🙂

doubt we'll implement an FST ever (I'm not sure exactly what that would entail anyways)

If I understand correctly, FST would include a lot of stylistic info (is there a new line? Is there an extra parenthesis pair? Is there a space after an operator? Is there a comment?) while AST is meant to represent bare bone functionality.

I suppose we can put everything we need to AST and it won't hurt much? I imagine it might just make some operations unnecessary complex in the future, like "hey, I want to know if both parsed trees are equivalent functionality-wise, but I need to know how to distinguish between what is important and what is not" (like my example of a test assertion which should ignore everything that is not pure functionality).

But I believe this is not something necessary for now nor for a long time 🙂

beetrootpaul commented 1 year ago

@japhib pls take a look if my current approach suits you (see commit de21ec475909f4c4f07062b8070ae96b74d51f86 )

What I still have to implement is a adding/removing parentheses depending if expressions are "associative" (if I understand the term correctly). In other words, when we have 2 operators of same precedence, but order does or does not matter:

a + (b + c)   -- should be simplified to -->   a + b + c
a - (b - c)   -- should get preserved as -->   a - (b - c)

beetrootpaul commented 1 year ago

note to self:

new: opts = opts or {} got unnecessary parentheses: opts = opts or ({})
old: misplaced comments
old: missing singular blank lines
old: missing newlines between table params and function params

beetrootpaul commented 1 year ago

I fixed as well a case mentioned 2 comments above. In commit 68eb463bb3fb72e06223f5ac08eb4839fb3c6c0f :

operator related stuff lands in Operators class
Operators.isAssociative(operator: string) got introduced
ChildContext type is passed as an extra param to some expression visitors
wrapWithParenthesesIfNeeded function takes care of adding parentheses around an expression (if needed)

japhib commented 1 year ago

Looking good. I'm working on adding comments inside various statement types, as well as preserving multiline table constructors/function calls.

beetrootpaul commented 1 year ago

Looking good. I'm working on adding comments inside various statement types, as well as preserving multiline table constructors/function calls.

Oh, good to know! Please take a look at my last 2 commits where I already fixed some comment insertions @japhib ! (and added more tests over last several commits). Would be good to avoid difficult merge conflicts 😄

japhib commented 1 year ago

Nice! FYI I pushed a couple more commits for adding comments within expressions like return expressions and some other things. There's a merge commit in there so it's probably going to be hard to cherry-pick the commits. Instead, I'd recommend adding this original repo as another remote, and then you can pull directly from my branch:

# with your `formatter` branch checked out
git remote add upstream git@github.com:japhib/pico8-ls.git
git pull upstream brp-formatter

The only failing unit test now is the one with comments inside a table constructor.

Note: inside several files in this repo, there are some switch statements that look like this:

Please don't re-format these switch statements so that each case takes up multiple lines. When each case is functionally so similar, I find it's more readable this way.

Also, I found this comment of yours:

// TODO: move formatter (in a separate PR) to its own folder, parallel to parser,
//       then move shared statements and expressions outside parser as well.
//       Remember to update test files' locations as well.

Perhaps I don't understand what you mean, but I'm not sure this is a good idea. For the most part, I like the file structure how it is now. But, maybe you can elaborate on what you mean?

beetrootpaul commented 1 year ago

There's a merge commit in there so it's probably going to be hard to cherry-pick the commits. Instead, I'd recommend adding this original repo as another remote, and then you can pull directly from my branch

In the end I just reset my branch on yours @japhib and added some tests from my non-merged commits. There was no reason to merge formatter changes from my commits, since I see your implementation already solves comments preservation for most of tests 🙂 FYI I was not analysing your changes nor compared whether I like my fixes for comments better or not, settling on just choosing your way as the preferred one 🙂

Please don't re-format these switch statements so that each case takes up multiple lines. When each case is functionally so similar, I find it's more readable this way.

No problem! In fact, I wanted to keep it as they are to minimise my changes, but I haven't found an IDE's formatter config for this thing in particular. And since I keep formatting files without thinking (a so called muscle memory), it was tedious to revert that change again and again. Sorry!

Also, I found this comment of yours: (…) For the most part, I like the file structure how it is now. But, maybe you can elaborate on what you mean?

Thanks for asking about that TODO marker! 👍

I left it there to bring it later to discussion, which is what happens right now 👍 So, my reasoning is:

I used to treat parsing separate from formatting. For example because code analysis based on parsing is a separate VS Code action and code formatting, which uses both parser and then formatter, is another thing. It feels natural to me to treat both concepts as equal and separate (but I am aware this is biased, just the way I see it).
Moreover, Formatter is something that is exported to be used on its own as new Formatter(…).
Therefore I see it misleading to have formatter under src/parser/. I would rather see formatter under src/formatter/ while parser still under src/parser/.
If to do that, then it would be good to extract some common shared concepts as well, to not make them part of one of another. I mean, for example AST with its nodes, operators, expressions etc. Like: src/ast/ keeps the most important code abstraction we manipulate on, then src/formatter/ and src/parser depends on it.
Last but not least: if to continue towards modularisation (for purpose of #29), we would probably need to keep parser separate, so another formatter could make use of it (OFC we could bundle default formatter together in a same package as parser, I am just talking about what I see more clean)

So, you see my reasoning, but I have no strong opinion here, especially that it is your project 😄 Just sharing my thoughts 🙂

Since the topic was touched, I removed that comment for now 🙂

beetrootpaul commented 1 year ago

(there is my previous comment above, answering some of your questions, and now in a more general manner)

@japhib , how do you want to proceed with this PR in order to make it good enough to be merged?

Some of my thoughts:

Seems like I won't be writing another PICO-8 game soon (want to do some Godot learning now), therefore I probably won't use pico8-ls this year. Which means moving PR towards merge has higher priority for me (in order to build my GitHub profile) than having all formatter features I would like to. (OFC I won't push for merge against your will 😄 Am just explaining my context).
Just in case it helps you with anything, here is a diff of my game codebase, formatted with the most recent pico8-ls commit 1f77390bd5c75543142b5368fe2d1b16c39e312e 👉 https://github.com/beetrootpaul/dart-07/commit/c0f3f651a3567b9d77be35d5da0b75335be67543 I don't want to specify
I think I can still add some integration tests I was talking about before. I mean like: parse -> format -> parse again -> check if ASTs are equivalent. I feel comfortable with implementing this 🙂
But I do not feel comfortable with working on missing features like comments preservation in table constructors or single empty line preservation between lines of code. It would require to much of my time to go deep enough to deliver anything of value.

japhib commented 1 year ago

RE: the formatting thing - no worries, I understand! Since it's being done automatically, you can feel free to leave it however your IDE is formatting it, and I'll fox the formatting as I like it at some later point when I merge this stuff in.

I see it misleading to have formatter under src/parser/. I would rather see formatter under src/formatter/ while parser still under src/parser/.

That makes sense - I had actually forgotten that all this stuff was under the parser folder. My main point in doing that was to have it separate from server.ts itself. However, there are only a couple files under server/src, and several of the files under server/src/parser also do not strictly belong to parsing. So with that in mind, I think some sort of re-organization makes sense. However, if the parser is going to eventually be extracted into its own package, let's wait until then to determine the final organization.

how do you want to proceed with this PR in order to make it good enough to be merged?

You can step out any time you like! I appreciate you giving me some motivation to move this feature forwards, and we've made a lot of good progress. I'll keep (probably pretty slowly) moving the branch forward and developing on top of your stuff, and then once all the issues & TODOs we've discussed are resolved, I'll merge it into master. I'll be sure to give you credit in the readme/changelog when the formatter is released!

I think I can still add some integration tests I was talking about before. I mean like: parse -> format -> parse again -> check if ASTs are equivalent. I feel comfortable with implementing this

This would be awesome. I'd also check that all the comments are preserved, via Chunk.comments -- their location may have changed, but there should be the same number of comments, and they should be in the same order as before. I'm a little worried about my implementation swallowing comments inside certain types of expressions, so this type of test would be helpful in quieting those fears.

beetrootpaul commented 1 year ago

You can step out any time you like!

I have to admit I really enjoyed the no-pressure async approach we had here @japhib 🙂 👍

I appreciate you giving me some motivation to move this feature forwards, and we've made a lot of good progress. I'll keep (probably pretty slowly) moving the branch forward and developing on top of your stuff, and then once all the issues & TODOs we've discussed are resolved, I'll merge it into master. I'll be sure to give you credit in the readme/changelog when the formatter is released!

💯 👍

This would be awesome.

In commit 432806400da53f7512dc169da5d0d74ac87e94ba :

I added 2 tests (multiplied by N test files):
- one to check if AST of a formatted doc seems roughly same (here I needed to ignore loc and nodes of type Comment, since comments are not there yet before formatting happens; sadly, the latter means the test does not check for comments preservation).
- another one to check if code formatted for the 2nd time is exactly same as after 1st format
- I added low.lua to test files: I saw you refer to low.p8 and low.lua in some ignored tests. I created low.lua by duplicating low.p8 and removing all non-Lua parts from it. Feel free to modify/remove it according to what you originally wanted.
I added some sample files from my projects
- sadly, tests are running very long for some of them. But for some files they run quickly, so you have to comment out some filenames and check by yourself which ones are timing out. It might indicate some performance issues somewhere in parser or formatter or test code 🤔 And… in the end it might be even stranger, because it seems like those tests run slowly in my IDE, but they have no such issue when running from a console 🤷‍♂️

BTW In a commit d8bf459df12f30140b9108e34895de153fb77660 I also added tests for how I imagine empty line preservation could work. It's just my suggestion of how formatter should work, but feel free to remove them 🙂 👍

beetrootpaul commented 1 year ago

One more thing, apart from the message I wrote above:

What do you think about merging the PR with formatting capability disabled @japhib ?

I modified name of this PR to not indicate release ready implementation and I added commit 57e8e88634b15cc975d34bd099fe9c11dbfc024c in which I turned formatting off, which means VS Code won't use formatter which is not ready for release yet (I tested if locally, that one changed line result with VS Code complaining there is no extension which can format pico-8-lua file type). Also, in commit d7c324c2e44721e61563671a45c1cdcc831ff146 I ignored formatter tests that are failing, so in result merge of this PR could result with a nice non-failing build of a the plugin, which has formatter support kinda hidden behind a feature flag.

I see 2 benefits of it:

to prevent branch from "decaying" over time (I mean to avoid difficult to resolve merge conflicts if branch gets merged months from now and other work is made on the main branch; like, already there are some changes to AST you made on a brp-formatter branch)
for me to put a closure on the topic 😄

WDYT?

japhib commented 1 year ago

Hm, I don't love the idea of merging it "just because" when it's still in an unfinished state. There's not work happening on any other branches, including the main branch, so you don't have to worry about having merge conflicts. I'll see if I can get this into a good spot soon and then merge it.

japhib commented 1 year ago

Actually I think you're right, if I'm going to be the one continuing to work on this feature, it's easier if it's contained in the repo itself. I'll go ahead and merge it.

beetrootpaul commented 1 year ago

OK, thx! 🙂

japhib / pico8-ls

Initial work on document formatting support for PICO-8 Lua files #27