purescript / roadmap

Long-term projects not covered by other issues lists

Source code formatting tool #23

Open paf31 opened 8 years ago

paulyoung commented 8 years ago

I can think of a few tools that I would like to use and build that might have similar prerequisites to this one.

I'd also like it if they could be written in PureScript.

Is there some intermediate representation that could be output from the compiler that would be sufficient to achieve this? Perhaps an AST as JSON?

michaelficarra commented 8 years ago

This would probably best be done in Haskell for two reasons: you would have access to the IR already used in psc, and it would be able to be easily pulled into psc if it becomes stable and popular enough.

ozanmakes commented 8 years ago

I'm a fan of hindent, it allows you to reformat declarations and IMO this is better than full source formatters.

I use it when I'm writing PureScript as well; usually adding a few $ operators is enough to make the code parseable with haskell-src-exts, but I'd love to have a similar tool with full PureScript support.

paulyoung commented 8 years ago

Good points, @michaelficarra.

tfausak commented 8 years ago

For what it's worth, hindent is a full source formatter.

One of the things I don't like about it is that it supports multiple styles. I think the main benefit of a formatter is that it sets the community style, like Go's gofmt or Python's PEP8. I'm a fan of Elm's style guide, which can be enforced with elm-format. I would love it if PureScript had something similar.

archaeron commented 8 years ago

how bad is an approach like this? bear in mind that it's very ugly code atm since it's only a quick proof of concept

https://gist.github.com/archaeron/27c7bdef909c626d7c1c95490243a920

paf31 commented 8 years ago

The main issue with pretty printing is inserting minimal parentheses. The AST does not indicate where they should go.
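To illustrate the point about parentheses, here is a minimal sketch of the usual precedence-based approach, assuming a toy expression type. The names `Expr`, `pretty`, and `parensIf` are hypothetical and are not taken from the compiler. The printer passes the surrounding precedence down and adds parentheses only when the context binds more tightly than the expression being printed.

```haskell
-- Hypothetical toy AST for illustration; not the compiler's actual types.
data Expr
  = Lit Int
  | Add Expr Expr -- left-associative, precedence 6
  | Mul Expr Expr -- left-associative, precedence 7

-- Render an expression, inserting parentheses only where the tree
-- shape would otherwise be misread.
pretty :: Expr -> String
pretty = go 0
  where
    -- The first argument is the precedence required by the context.
    go :: Int -> Expr -> String
    go _ (Lit n) = show n
    go ctx (Add l r) = parensIf (ctx > 6) (go 6 l ++ " + " ++ go 7 r)
    go ctx (Mul l r) = parensIf (ctx > 7) (go 7 l ++ " * " ++ go 8 r)

    parensIf :: Bool -> String -> String
    parensIf True s = "(" ++ s ++ ")"
    parensIf False s = s
```

With this sketch, `pretty (Mul (Add (Lit 1) (Lit 2)) (Lit 3))` gives `(1 + 2) * 3`, while `pretty (Add (Lit 1) (Mul (Lit 2) (Lit 3)))` gives `1 + 2 * 3`. Note that any redundant parentheses the user wrote on purpose cannot be reproduced this way, because the tree does not record them.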

archaeron commented 8 years ago

shouldn't it insert exactly the parentheses the user wanted? i.e. not change them?

sometimes I use parentheses where I know I wouldn't need them, just to make the code clearer

paf31 commented 8 years ago

Even then, they are lost after desugaring. So this could work, but only if you used the data straight out of the parser.

Edit: oh, I see that's what you're doing, sorry 😄

archaeron commented 8 years ago

No problem.

Do you think this approach could work? Or is it better to try a different one? Totally legit to say it won't work :)

Deciding that you could simplify the parens in your code is probably best left to a linter.

paf31 commented 8 years ago

I think it can definitely work.

kritzcreek commented 8 years ago

Looks like we started hacking on the same thing @archaeron https://github.com/kRITZCREEK/ps-pretty :D

archaeron commented 8 years ago

@kRITZCREEK awesome! I'm sure your version will be better :) Very good job on your tooling by the way

I just wanted this really bad, so I tried something.

kritzcreek commented 8 years ago

@archaeron Looking at your code so far I seem to be losing hard :D It seems you've got the instance stuff for ansi-wl-pprint down. Let's continue to work on this!

archaeron commented 8 years ago

@kRITZCREEK how do you want to proceed? I can make my code public. Or I could try to help you with yours.

Not sure if my approach is the best. GHC complains about orphan instances :D

I have one wish for the formatter. I'd very much like if there was an option to format so that it doesn't align to previous lines.

kritzcreek commented 8 years ago

I'll respond tomorrow, I need to sleep for today ;)

archaeron commented 8 years ago

fair enough. good night

archaeron commented 8 years ago

here is my try if you want to test it out: https://github.com/archaeron/purescript/tree/psc-format

tfausak commented 8 years ago

This line might need to change to make psc-format useful 😉

archaeron commented 8 years ago

@tfausak should I change it to: https://github.com/kRITZCREEK/ps-pretty/blob/master/src/Lib.hs#L14 😁

kritzcreek commented 8 years ago

:D I think it's quite clear we're just exploring here ;)

JorisM commented 8 years ago

update:

@archaeron and I have been to the ZuriHac Haskell hackathon and decided to work on psc-format.

We have a very basic implementation ready: we can "format" a simple file, and the result compiles again after being restructured. It probably won't handle every PureScript file yet, since we are still missing a bunch of stuff, but we have made good progress.

(1) Right now there is still no real formatting going on, but it should be easy enough to implement from here on out. We are happy to hear some opinions on how things should be formatted.

As a first step we propose to define a format which will be hardcoded. Further down the road, we will try to add the possibility for the user to set more options to customize the formatting.

(2) Also, if anyone can take a look at the source to check whether this seems to be a feasible option, that would be greatly appreciated. Since this is the product of a two-day hackathon, we will have to do some cleanup work soon, so the idea right now is to figure out whether the general direction makes sense or not.

(3) Last, there is also the question of whether we should leave this in the PureScript repo or move it to a separate project.

To try it out: clone the repo https://github.com/archaeron/purescript, check out the psc-format branch, build with stack install, and then run psc-format --input Main.purs --output Main2.purs to test.

paf31 commented 8 years ago

This looks great! I'll be sure to check it out later this week.

I'd love to distribute something like this alongside the compiler actually.

hdgarrood commented 7 years ago

I sort of resurrected this and got it compiling again, and pushed the result to a new psc-format branch on purescript/purescript. One thing that strikes me is that we now have two separate approaches to pretty-printing within the compiler: the approach taken by Language.PureScript.Pretty, which uses the boxes library, and the approach taken in this branch, using ansi-wl-pprint. Is there something about ansi-wl-pprint that makes it better than boxes for this? Do you remember why you went for ansi-wl-pprint?

Also, to what extent is the Language.PureScript.Pretty hierarchy only intended for printing partially desugared code? For example, I tried it out on examples/passing/Console.purs and it got into an infinite loop (although I haven't yet worked out for certain whether this is happening inside the new psc-format code or in the existing Language.PureScript.Pretty code).

edit: I've answered my own questions: having read the paper, the approach taken by ansi-wl-pprint does seem to me to be a bit closer to what we would want for a source code formatter (although perhaps it's too early to tell). Also, the infinite loop was indeed caused by Language.PureScript.Pretty.Values. Since the formatter will only deal with entirely non-desugared code I think it does make sense to have a separate section of the library devoted to it.

archaeron commented 7 years ago

@hdgarrood awesome news :) I've been thinking of taking it up again, but sadly I can't use PureScript at work at the moment :(

ansi-wl-pprint was chosen more or less at random at that time. Although now that I know more about pretty-printing I'd choose it again over boxes. With ansi-wl-pprint you can annotate nodes when pretty-printing, meaning that you can output a highlighted AST.

hdgarrood commented 7 years ago

Cool, thanks! If anyone wants to keep up with what I've done, I've started pushing commits to the psc-format branch on hdgarrood/purescript instead so that I don't generate so many messages in the IRC channel.

shmish111 commented 7 years ago

@hdgarrood I'm unable to compile your psc-format branch, is it just me or if not can you get it to compile so I can have a play?

hdgarrood commented 7 years ago

@shmish111 I've rebased and cleaned up the commit history a bit. In short I've come to the conclusion that wl-pprint-text isn't going to be suitable (see https://github.com/hdgarrood/purescript/commit/bd2040787b769449369382b46086e989507e057e), so I've started looking at a new approach. I haven't yet got as far as getting it all to compile though.

If you want to play with what I had before, which does compile and does sort of work (in a very, very loose sense), just check out the commit right before that one, https://github.com/hdgarrood/purescript/commit/73c0fd555924b08a29fc1fb514985e6a84368bc5.

shmish111 commented 7 years ago

thanks @hdgarrood

WRT the actual style, who will decide this? Are there any code format guidelines anywhere already? (I couldn't find any.) I personally found the Elm style to be very good; it seems a bit verbose at first, but it's all based around making commit diffs easier, i.e. avoiding changing a line if it doesn't need to be.

garyb commented 7 years ago

There's this, which is pretty good: https://github.com/ianbollinger/purescript-style-guide

shmish111 commented 7 years ago

@garyb looks perfect

@hdgarrood what is your new approach?

hdgarrood commented 7 years ago

I started working on an alternative to ansi-wl-pprint which I added in the most recent commit in that branch; the general idea is to keep track of the most sensible places to break lines. I think you have to choose between being able to do that and having an associative concatenation operation, and I think the former is more important but ansi-wl-pprint goes for the latter. I added more comments in the code with further explanation.
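For context, here is a minimal sketch of the "lay it out flat if it fits, otherwise break" strategy that Wadler-style libraries such as ansi-wl-pprint are built around. It is not the alternative described above, and it is much simpler than the real library: `Doc`, `flatten`, `render`, and `example` are hypothetical names, and the fit check is greedy and ignores whatever follows the group.

```haskell
-- Hypothetical, simplified document type; not the ansi-wl-pprint API.
data Doc
  = Text String
  | Line         -- a space when laid out flat, a newline when broken
  | Cat Doc Doc
  | Nest Int Doc -- extra indentation for line breaks inside
  | Group Doc    -- lay out flat if it fits, otherwise break

-- Lay a document out on a single line.
flatten :: Doc -> String
flatten (Text s) = s
flatten Line = " "
flatten (Cat a b) = flatten a ++ flatten b
flatten (Nest _ d) = flatten d
flatten (Group d) = flatten d

-- Render within the given width, tracking indentation and current column.
render :: Int -> Doc -> String
render width = snd . go 0 0
  where
    go :: Int -> Int -> Doc -> (Int, String)
    go _ col (Text s) = (col + length s, s)
    go ind _ Line = (ind, "\n" ++ replicate ind ' ')
    go ind col (Cat a b) =
      let (col1, s1) = go ind col a
          (col2, s2) = go ind col1 b
      in (col2, s1 ++ s2)
    go ind col (Nest n d) = go (ind + n) col d
    go ind col (Group d)
      | col + length flat <= width = (col + length flat, flat)
      | otherwise = go ind col d
      where flat = flatten d

-- A declaration whose body may move to its own indented line.
example :: Doc
example = Group (Cat (Text "foo =") (Nest 2 (Cat Line (Text "bar baz"))))
```

Here `render 80 example` keeps everything on one line (`foo = bar baz`), while `render 10 example` breaks after `foo =` and indents `bar baz` by two spaces. The only decision the renderer makes is per group, which is the kind of limitation the comment above is concerned with.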

eklavya commented 6 years ago

Hey everyone, is this still being worked on?

ErikCupal commented 6 years ago

I'm also interested in progress :)

archaeron commented 6 years ago

there is a messy, slowly progressing draft at: https://github.com/archaeron/purescript/tree/psc-format2

eklavya commented 6 years ago

@archaeron thanks for that information. Best of luck :)

coot commented 6 years ago

@archaeron do you have something that could help with: https://github.com/purescript/purescript/issues/1538 (type formatting for purs docs)?

archaeron commented 6 years ago

the main problem is that I don't use PureScript regularly anymore. If someone else wants to do this, please do. I'd still love to finish this, but time is finite :(

lexun commented 6 years ago

This might be relevant: https://github.com/lpil/purescript-format

paulyoung commented 6 years ago

Maybe this can be of some use: https://www.cs.kent.ac.uk/people/staff/dao7/publ/reprinter2017.pdf

lohmander commented 6 years ago

Any movement on this?

paulyoung commented 6 years ago

@lohmander see https://gitlab.com/joneshf/purty and https://purescript-users.ml/t/purty-1-0-0-released/225

chrisdone commented 6 years ago

Maybe someone could fork hindent, swap out haskell-src-exts for purescript, and make remaining tweaks to the Pretty.hs module.

I'm also desperate for a PureScript printer. For the most part hindent can handle it, but records with their {foo: 1} syntax trip it up.

chrisdone commented 6 years ago

You can select regions of text with hindent.el and reformat them in PureScript. That's how I get by presently without pulling my hair out.

kritzcreek commented 6 years ago

The most promising and advanced effort at this point in time is https://gitlab.com/joneshf/purty

waynevanson commented 3 years ago

2 cents:

shmish111 commented 3 years ago

I believe purty is widely used now, no? We are very satisfied with it in all our projects. Can this issue be closed now @paf31 ?

wclr commented 3 years ago

I don't think that purty is good enough. Look at the code snippet from the book:

gcd :: Int -> Int -> Int
gcd n 0 = n
gcd 0 m = m
gcd n m = if n > m
            then gcd (n - m) m
            else gcd n (m - n)

What purty does with it:

gcd :: Int -> Int -> Int
gcd n 0 = n

gcd 0 m = m

gcd n m =
  if n > m then
    gcd (n - m) m
  else
    gcd n (m - n)

It is quite inconsistent in terms of vertical spacing, and that is not the only problem. Generally, it is too dense vertically, but in some cases too sparse, and it doesn't allow any user-preference control of the layout. So there is a need for a more gentle approach to forced auto-formatting.

chrisdone commented 3 years ago

2 years on I am still using hindent with an Emacs function that pre/post-processes records. If you always wrap records in a data type, then you can pretty much write all your PureScript code as Haskell code. I've settled into a groove. I'm a little set in my ways with hindent's formatting style, so purty's was just a nonstarter for me personally.

maksimil commented 3 years ago

@chrisdone hindent cannot format PureScript because it errors on this line:

import Effect.Class (class MonadEffect, liftEffect)