avh4 / elm-format

elm-format formats Elm source code according to a standard set of rules based on the official Elm Style Guide
BSD 3-Clause "New" or "Revised" License
1.31k stars 148 forks source link

Enforce a maximum line length #324

Open lydell opened 7 years ago

lydell commented 7 years ago

Proposal: If a line gets longer than X characters, automatically split it across multiple lines (if possible), to try to make all lines fit within X characters.

What should X be? prettier uses 80 (by default). Perhaps we can measure the average line length in the largest Elm code base in the world or something to find a good value? ;)

Idea: Break this:

gridWithMines =
    Grid.addRandomMinesAndUpdateNumbers model.numMines Set.empty ( seed, emptyGrid )

… into this:

gridWithMines =
    Grid.addRandomMinesAndUpdateNumbers model.numMines
        Set.empty
        ( seed, emptyGrid )

If the user prefers this:

gridWithMines =
    Grid.addRandomMinesAndUpdateNumbers
        model.numMines
        Set.empty
        ( seed, emptyGrid )

… then go ahead and add that newline yourself!


This issue was originally also about automatically joining lines if they can fit within X characters. However, Richard's thoughtful comment quickly made me realize that that is a different, and much harder, issue. Let's start with only the easy one!

Original title: Automatically split long lines and join short ones?

Original description:

I used to miss elm-format every time I went back to writing JavaScript. Then prettier was born. Now it has started to feel the other way around – I kind of miss prettier when I go to Elm from JavaScript.

The cool thing about prettier is that it tries to fit all lines within 80 columns, in a nice way. (The max width is configurable in prettier, but defaults to 80).

With elm-format, though, I need to tell it myself if I want an expression to be on a single line or on multiple lines.

I usually write crappy Elm (and JavaScript) and hit Save every now and then to let my editor auto-format. That’s really nice. What I’ve found is that since I started using prettier, I hit save and my brain automatically expects some multiline Elm expressions to fold up to a single line, but it doesn’t. Then I hit the "join line" editor shortcut a few times to help elm-format on the way. (It’s easier when breaking long lines: just add a single newline somewhere and elm-format takes it from there. But it would be nice if elm-format decided on its own that lines are too long.)

Does it make sense to do something similar in Elm?

avh4 commented 7 years ago

I'm open to this idea, but I think it would demand quite a bit of exploration work to understand the best algorithm for wrapping and the best choice of line length.

rtfeldman commented 7 years ago

So this is two separate ideas, yeah?

Removing Single-Line vs Multiline Choice

I talked to James (creator of prettier) last week and he pointed out that people complain about how Prettier does this.

Sometimes you have 4 similar sections in a block of code, and you want to style them in a consistent multiline style, but Prettier won't let you because one of them is too short and it's forced to wrap differently from the others.

Also, as we learned a year or so ago, there are some cases where you want functions to break like this:

blah
    foo
    bar
    baz

Example:

createUser
    firstName
    lastName
    email

...and others where you want this:

blah foo
    bar
    baz

Example:

describe "some unit test"
    [ testsGoHere ]

Forcing either of the above to go the other way looks bad:

createUser firstName
    lastName
    email
describe
    "some unit test"
    [ testsGoHere ]

...so between the options of "remove this choice and guarantee that some code will look wrong" versus "leave it as a choice for the programmer," the latter seems better.

At the end of the day, there will always be stylistic choices for programmers to make. How do you order your function arguments? Do you put something in a let or extract it to the top level? Record or tuple?

I think it's an important design goal of elm-format to prevent teams from having style debates, but I think "same AST generates the same formatted code" throws the baby out with the bathwater.

Enforcing Fixed Column Width

Enforcing 80-column width (not configurable under any circumstances, of course) seems like a separate discussion. 🙂

lydell commented 7 years ago

Thanks for the super nice summary! I decided to drop the "automatically join" part to only focus on the more important (and simpler!) "automatically split" part. I've updated the issue description accordingly.

Sometimes you have 4 similar sections in a block of code, and you want to style them in a consistent multiline style, but Prettier won't let you because one of them is too short and it's forced to wrap differently from the others.

Just for the record, when starting to use prettier I noticed this too. But I quickly got over it. To me, (local) readability always wins over "consistency". I haven't really found such "consistency" between code blocks to help, other than possibly please the eyes.

rtfeldman commented 7 years ago

What should happen if a line is over 80 characters and there is no syntactically valid way to get it under 80?

lydell commented 7 years ago

Leave it unchanged! That's the way prettier works. For example, very long string literals.

rtfeldman commented 7 years ago

Hm, so if this can't get you an invariant (e.g. "set your terminal to 80 characters wide and everything will fit"), what's the benefit? Just that you don't have to decide whether to break a long line?

lydell commented 7 years ago

I guess all this feature request boils down to is: "I don't want to manually insert that single line break in long lines to tell elm-format to print it multi-line".

avh4 commented 7 years ago

What would be useful for me is to have some specific examples here of 1) elm code where you've felt like this should happen, 2) JavaScript code where prettier makes things very nice, and 3) JavaScript code that prettier formats badly

avh4 commented 7 years ago

*more examples

lydell commented 7 years ago

I'll be on the lookout for such examples! Before formatting, I'll try to remember to save code snippets with long lines that would be split up.

avh4 commented 7 years ago

From some real-world code:

It seems like

standards : Html Msg
standards =
    section [ class [ Standards ] ]
        [ div [ class [ StandardsDescription ] ]
            [ Html.h3 [ class [ StandardsDescriptionHeader ] ] [ Html.text "Aligned to Your Standards" ]
            , p [ class [ StandardsDescriptionParagraph ] ] [ Html.text "NoRedInk Premium is aligned to state and national standards. Instant data and live tracking capabilities make it easy to prioritize, instruct, and intervene." ]
            ]
        ]

should wrap to

standards : Html Msg
standards =
    section [ class [ Standards ] ]
        [ div [ class [ StandardsDescription ] ]
            [ Html.h3 [ class [ StandardsDescriptionHeader ] ]
                [ Html.text "Aligned to Your Standards" ]
            , p [ class [ StandardsDescriptionParagraph ] ]
                [ Html.text "NoRedInk Premium is aligned to state and national standards. Instant data and live tracking capabilities make it easy to prioritize, instruct, and intervene." ]
            ]
        ]

(We've had some new devs initially want to put big chunks of HTML on a single line; and the same for elm-css definitions.)

However, should this

                        featureColumn
                            [ { imageUrl = AssetPath "/assets/marketing_site/clever.svg", title = "Clever Integration", body = "Clever Secure Sync and Instant Login means NoRedInk integrates with the rest of your district's tools that use Clever." }
                            , { imageUrl = AssetPath "/assets/marketing_site/admin-account.svg", title = "Specialized Administrator Account Privileges", body = "Take a deeper dive to see how teachers and students are using the product every day." }
                            , { imageUrl = AssetPath "/assets/marketing_site/reports.svg", title = "Data-driven Reports", body = "Continually identify up-to-date usage patterns and levels of mastery across your school or district." }
                            ]

format to:

                        featureColumn
                            [ { imageUrl = AssetPath "/assets/marketing_site/clever.svg"
                              , title = "Clever Integration"
                              , body = "Clever Secure Sync and Instant Login means NoRedInk integrates with the rest of your district's tools that use Clever."
                              }
                            , { imageUrl = AssetPath "/assets/marketing_site/admin-account.svg"
                              , title = "Specialized Administrator Account Privileges"
                              , body = "Take a deeper dive to see how teachers and students are using the product every day."
                              }
                            , { imageUrl = AssetPath "/assets/marketing_site/reports.svg"
                              , title = "Data-driven Reports"
                              , body = "Continually identify up-to-date usage patterns and levels of mastery across your school or district."
                              }
                            ]

Initially I was thinking no, because it makes the file expand out more, but actually it is easier to navigate if you happen to be looking for something inside of that list.

rtfeldman commented 7 years ago

What should happen in the case where you have a long thing that you'd want to break the other way? If it always breaks using the "argument on the same line" style, you get this:

createUser firstName
    lastName
    email
    reallyLongThingThatCausedThisToGoMultiline

One option would be that it just picks a way in order to satisfy the maximum line width, and then you can swap it to the other one by hand if you like. (So long as it still satisfies the maximum width.)

avh4 commented 7 years ago

Here's some more real code that would have been good to auto-wrap:

 type InstructionHint msg
     = NoHint
-    | Hint { hintType : HintType, hintContent : Maybe (Html msg), onHintToggle : msg }
+    | Hint
+        { hintType : HintType
+        , hintContent : Maybe (Html msg)
+        , onHintClick : msg
+        }
lydell commented 7 years ago

@avh4 I agree, all of those are great examples where auto-wrap would be nice (even the list of records)!

One option would be that it just picks a way in order to satisfy the maximum line width, and then you can swap it to the other one by hand if you like.

@rtfeldman Exactly! I tried to convey this in the issue description, but perhaps I was too vague? :)

lydell commented 7 years ago

The by far most common case I come across in Elm is html elements that get more and more attributes over time. At first, the element only has one or two attributes, so it fits just fine in one line. But when it get, say, five of them I want to move each attribute to its own line.

    table [ class "Grid" ]
        [ tbody [] (List.map viewRow rows) ]

    table [ class "Grid", id gridId, style styles, tabindex -1, onKeydown GridKeydown ]
        [ tbody [] (List.map viewRow rows) ]

    table
        [ class "Grid"
        , id gridId
        , style styles
        , tabindex -1
        , onKeydown GridKeydown
        ]
        [ tbody [] (List.map viewRow rows) ]
rtfeldman commented 7 years ago

I think I've encountered it even more often with exposing - although in that case I'm often reluctant to manually split it. 😄

lydell commented 7 years ago

Correct – exposing is definitely the number one case. I would not mind elm-format automatically splitting long those!

lydell commented 7 years ago

I would not mind elm-format automatically splitting long those!

Hmm … I might have to take that back. I worked on a view file today where the imports ended up taking a whole screen-full. So I tried joining each import to a single line. They became quite long – but it I think I like it better! I don’t really read the imports, so it was nicer to get to see the actual contents of the without scrolling a full page.

So perhaps that’s an indication that everything is fine the way it is? (Or should there b an exception for imports?)

rtfeldman commented 7 years ago

Let's keep exploring this. 😃

Long exposing lines causing merge conflicts

Css.elm is often a fun pathological case when it comes to imports and file lengths, because that one module exposes about 600 values.

For a long time I kept this on one line, because I didn't want to have to scroll so much to get to the point where my file actually began.

However, over time this led to merge conflicts. If two different people made Pull Requests that added to the API, they'd both modify that one line (to add different exposed values), so merging one created a conflict with the other.

Eventually I realized I needed to go multiline and bit the bullet. If elm-format had done this for me automatically, I would have saved myself time dealing with those merge conflicts.

In practice, I honestly haven't noticed that I did that, because there was so much to scroll past at the top of the file anyway (imports, documentation, etc.) that when editing it, I always either scrolled straight to the end to start adding something new, or else used some editor tool to jump to some particular point in the file (line number, search, etc.) that I wanted to edit.

Put another way, I already wasn't scrolling down from the top, so the fact that exposing now takes up so much more vertical space hasn't been a pain point for me.

Remaining Merge Conflicts

I still have merge conflicts because of @docs that exceed one line. Here it would actually be really nice for elm-format to auto-wrap these for me, because unlike with exposes, I can't just press enter, save, and have it switch to "multiline mode" to take care of the remaining formatting. Properly wrapping @docs to the next line means adding a fresh @docs to the beginning of the wrapped line.

Before

{-|

# Pseudo-Classes
@docs pseudoClass, active, any, checked, dir, disabled, empty, enabled, first, firstChild, firstOfType, fullscreen, focus, hover, visited, indeterminate, invalid, lang, lastChild, lastOfType, link, nthChild, nthLastChild, nthLastOfType, nthOfType, onlyChild, onlyOfType, optional, outOfRange, readWrite, required, root, scope, target, valid

-}

After

{-|

# Pseudo-Classes
@docs pseudoClass, active, any, checked, dir, disabled, empty, enabled, first,
@docs firstChild, firstOfType, fullscreen, focus, hover, visited, indeterminate, invalid,
@docs lang, lastChild, lastOfType, link, nthChild, nthLastChild, nthLastOfType,
@docs nthOfType, onlyChild, onlyOfType, optional, outOfRange, readWrite, required,
@docs root, scope, target, valid

-}

Takeaways

I had the option of choosing single-line over wrapping, and later regretted my choice of single-line. If elm-format auto-wrapped line widths, it would have removed my ability to make the wrong choice, which would have saved me time resolving merge conflicts in the long run. In this instance, having the choice proved counterproductive.

My impression is that in application code, exposing lists tend to be short - because they're often either exposing (..) or exposing (view) or something like that. For libraries, there is at least some evidence suggesting that having exposing lists auto-wrap is a better default.

rtfeldman commented 7 years ago

Another thought: if exposing is auto-wrapped, it doesn't have to be "one per line" - it can be something like:

module Foo exposing (bar, baz, qux, blah,
   something, somethingElse)

Given my experience with Css.elm and merge conflicts, it would have been bad for me if it did this. One per line would have been preferable.

However, this makes me question how @docs should be split. Wouldn't that same argument apply there? Shouldn't the multi-line version actually be:

{-|

# Pseudo-Classes
@docs pseudoClass
@docs active
@docs any
@docs checked
@docs dir
@docs disabled
@docs empty
@docs enabled
@docs first
@docs firstChild
@docs firstOfType
@docs fullscreen
@docs focus
@docs hover
@docs visited
@docs indeterminate
@docs invalid
@docs lang
@docs lastChild
@docs lastOfType
@docs link
@docs nthChild
@docs nthLastChild
@docs nthLastOfType
@docs nthOfType
@docs onlyChild
@docs onlyOfType
@docs optional
@docs outOfRange
@docs readWrite
@docs required
@docs root
@docs scope
@docs target
@docs valid

-}

By my own logic, this is the correct multi-line way to do it, but at the same time this looks like an uncomfortably large list. (Which, of course, is how I felt about exposing, and I am saying this while being fully aware that I really should switch to this in Css.elm for the sake of preventing merge conflicts.)

I guess I should go try that and see how it goes. 😛

50kudos commented 7 years ago

On "one per line" topic, this following makes me come to this issue. And it's more like a general case for long List:

board : List (List Multiplier)
board =
    [ [ E3, X1, X1, P2, X1, X1, X1, E3, X1, X1, X1, P2, X1, X1, E3 ]
    , [ X1, E2, X1, X1, X1, P3, X1, X1, X1, P3, X1, X1, X1, E2, X1 ]
    , [ X1, X1, E2, X1, X1, X1, P2, X1, P2, X1, X1, X1, E2, X1, X1 ]
    , [ P2, X1, X1, E2, X1, X1, X1, P2, X1, X1, X1, E2, X1, X1, P2 ]
    , [ X1, X1, X1, X1, P3, X1, X1, X1, X1, X1, P3, X1, X1, X1, X1 ]
    , [ X1, P3, X1, X1, X1, P3, X1, X1, X1, P3, X1, X1, X1, P3, X1 ]
    , [ X1, X1, P2, X1, X1, X1, P2, X1, P2, X1, X1, X1, P2, X1, X1 ]
    , [ X1, X1, X1, X1, X1, X1, X1, X_, X1, X1, X1, X1, X1, X1, X1 ]
    , [ X1, X1, P2, X1, X1, X1, P2, X1, P2, X1, X1, X1, P2, X1, X1 ]
    , [ X1, P3, X1, X1, X1, P3, X1, X1, X1, P3, X1, X1, X1, P3, X1 ]
    , [ X1, X1, X1, X1, P3, X1, X1, X1, X1, X1, P3, X1, X1, X1, X1 ]
    , [ P2, X1, X1, E2, X1, X1, X1, P2, X1, X1, X1, E2, X1, X1, P2 ]
    , [ X1, X1, E2, X1, X1, X1, P2, X1, P2, X1, X1, X1, E2, X1, X1 ]
    , [ X1, E2, X1, X1, X1, P3, X1, X1, X1, P3, X1, X1, X1, E2, X1 ]
    , [ E3, X1, X1, P2, X1, X1, X1, E3, X1, X1, X1, P2, X1, X1, E3 ]
    ]

I want board : List Multiplier, not the nested one but if I make it one long list, it wraps all the way down to eternity :deer:

AleXoundOS commented 6 years ago

Enforcing a limit of text width is one of the basics of coding style guides IMHO.

The number of characters itself is a question. Whether it is 72, 80, 100, etc. There are many preferences and no global consencus atm. So perhaps it should be a parameter to elm-format defaulting to 80 when omitted.

rtfeldman commented 6 years ago

The more I've used Prettier and rustfmt, which enforce line length, and elm-format, which doesn't, the more strongly I'm convinced that it's better not to enforce line length.

Having now had extensive experience with both, I think elm-format's current design gets this right!

blackeyeddan commented 4 years ago

The more I've used Prettier and rustfmt, which enforce line length, and elm-format, which doesn't, the more strongly I'm convinced that it's better not to enforce line length.

Having now had extensive experience with both, I think elm-format's current design gets this right!

I agree!

I've used elm-format and Prettier both for the last couple of years. Prettier's line length enforcement causes a great deal of diff churn.

If I recall correctly, on an Elm Town podcast, Aaron mentions minimizing diff churn as one of the factors in deciding how elm-format formats code. Thank you for making that choice!

AleXoundOS commented 4 years ago

Maybe the diff topic is more about diffing algorithm rather than line length enforcement. Though, this may require to analyze semantics for the diff tool?

Also, it's pretty ok when latex, orgmode, markdown text is unlimited in line length. Because they have no syntax sensitive whitespace and long line wrapping does not confuse cognitive ability to decide whether it is a wrapped line or a syntax block with specific indentation. Long unwrapped lines are also bad, since it requires to scroll both in vertical and horizontal directions.

MartinSStewart commented 3 years ago

Another advantage of automatically adding line breaks to long lines is code generation. For example, elm-review rules can automatically make fixes to the users code. After it does that, elm-review runs elm-format to make the generated code look nicer. Currently elm-review rule authors still have to make sure that some line breaks are inserted to avoid all the generated code from ending up on one very long line. Having elm-format handle this as well would improve the quality of generated code.