avh4 / elm-format

elm-format formats Elm source code according to a standard set of rules based on the official Elm Style Guide
BSD 3-Clause "New" or "Revised" License
1.31k stars 145 forks source link

Nesting |> andThen consumes too much indent #352

Open jinjor opened 7 years ago

jinjor commented 7 years ago

Today I first tried elm-format and shocked with this result. It consumes 12 spaces per one chain.

-  model.floor
-    |> Maybe.map EditingFloor.present
-    |> Maybe.andThen (\floor -> List.head (selectedObjects model)
-    |> Maybe.andThen (\primarySelected ->
-      ObjectsOperation.nearest direction primarySelected (Floor.objects floor)
-    |> Maybe.map (\object ->
-      { model |
-        selectedObjects =
-          List.map Object.idOf [object]
-      }
-      )))
-    |> Maybe.withDefault model
+    model.floor
+        |> Maybe.map EditingFloor.present
+        |> Maybe.andThen
+            (\floor ->
+                List.head (selectedObjects model)
+                    |> Maybe.andThen
+                        (\primarySelected ->
+                            ObjectsOperation.nearest direction primarySelected (Floor.objects floor)
+                                |> Maybe.map
+                                    (\object ->
+                                        { model
+                                            | selectedObjects =
+                                                List.map Object.idOf [ object ]
+                                        }
+                                    )
+                        )
+            )
+        |> Maybe.withDefault model

In Haskell, this may be flattened by do notation. So I don't think the former style is too strange.

Another alternative would be the following style.

+    model.floor
+        |> Maybe.map EditingFloor.present
+        |> Maybe.andThen (\floor ->
+            List.head (selectedObjects model)
+                |> Maybe.andThen (\primarySelected ->
+                    ObjectsOperation.nearest direction primarySelected (Floor.objects floor)
+                        |> Maybe.map (\object ->
+                            { model
+                                | selectedObjects =
+                                    List.map Object.idOf [ object ]
+                            }
+                        )
+                )
+        )
+        |> Maybe.withDefault model

This is 8 spaces per one chain.

And this is 4 spaces.

+    model.floor
+        |> Maybe.map EditingFloor.present
+        |> Maybe.andThen (\floor -> List.head (selectedObjects model)
+            |> Maybe.andThen (\primarySelected -> ObjectsOperation.nearest direction primarySelected (Floor.objects floor)
+                |> Maybe.map (\object ->
+                    { model
+                        | selectedObjects =
+                            List.map Object.idOf [ object ]
+                    }
+                )
+            )
+        )
+        |> Maybe.withDefault model

I found this was once discussed before. No progress on this issue since then? I don't have clear idea but wanted to bring it up again before public release. Is it possibly true that elm-lang/core will be formatted in the future? It has similar code too.

ghost commented 7 years ago

I'm facing a similar problem. It's hard to spot errors where Json.Decode.mapX has its arguments in the wrong order, so we are using the next best thing that Elm has to a do notation, which would look something like

decodeEngineResult : Decoder Rules.EngineResult
decodeEngineResult =
    Decode.field "consideredCount" Decode.int &> \consideredCount ->
    Decode.field "evaluationCount" Decode.int &> \evaluationCount ->
    Decode.field "matchedCount" Decode.int &> \matchedCount ->
    Decode.field "time" Decode.string &> \time ->
    Decode.succeed
      { consideredCount = consideredCount
      , evaluationCount = evaluationCount
      , matchedCount = matchedCount
      , time = time
      }

However, elm-format formats the above to:

decodeEngineResult : Decoder Rules.EngineResult
decodeEngineResult =
    Decode.field "consideredCount" Decode.int
        &> \consideredCount ->
            Decode.field "evaluationCount" Decode.int
                &> \evaluationCount ->
                    Decode.field "matchedCount" Decode.int
                        &> \matchedCount ->
                            Decode.field "time" Decode.string
                                &> \time ->
                                    Decode.succeed
                                        { consideredCount = consideredCount
                                        , evaluationCount = evaluationCount
                                        , matchedCount = matchedCount
                                        , time = time
                                        }

Which is a lot more difficult to read.

BTW, thank you for elm-format. Despite its little shortcomings it's a fantastic tool, I just don't write Elm without it. Please let me know if I can help.

avh4 commented 7 years ago

I agree that I think we should find a better way to format this.

However, in @xarvh-at-stax's example, it can be cleaned up using NoRedInk/elm-decode-pipeline:

import Json.Decode.Pipeline exposing (decode, required)

decodeEngineResult : Decoder EngineResult
decodeEngineResult =
    decode EngineResult
        |> required "consideredCount" Decode.int
        |> required "evaluationCount" Decode.int
        |> required "matchedCount" Decode.int
        |> required "time" Decode.string

Is there an advantage to preferring the lambdas in that case?

ghost commented 7 years ago

@avh4 the advantage is that it's harder to swap arguments by mistake.

The code below is perfectly valid, looks good, but it's wrong.

import Json.Decode.Pipeline exposing (decode, required)

decodeEngineResult : Decoder EngineResult
decodeEngineResult =
    decode EngineResult
        |> required "evaluationCount" Decode.int
        |> required "consideredCount" Decode.int
        |> required "matchedCount" Decode.int
        |> required "time" Decode.string

The first two fields are swapped, but it's really hard to tell.

Worse, the same problem will happen if we reorder the attributes in the definition of EngineResult.

avh4 commented 7 years ago

(for reference) Here's a version of the original example that actually compiles:

type alias Model =
    { floor : Maybe Floor
    , selectedObjects : List Id
    }

type Floor
    = Floor

type Object
    = Object

type Id
    = Id

type Direction
    = Direction

editingFloor_present : Floor -> Floor
editingFloor_present a =
    a

selectedObjects : Model -> List Object
selectedObjects _ =
    []

floor_objects : Floor -> List Object
floor_objects _ =
    []

object_idOf : Object -> Id
object_idOf _ =
    Id

objectsOperation_nearest : Direction -> Object -> List Object -> Maybe Object
objectsOperation_nearest _ _ _ =
    Nothing

x : Direction -> Model -> Model
x direction model =
    model.floor
        |> Maybe.map editingFloor_present
        |> Maybe.andThen
            (\floor ->
                List.head (selectedObjects model)
                    |> Maybe.andThen
                        (\primarySelected ->
                            objectsOperation_nearest direction primarySelected (floor_objects floor)
                                |> Maybe.map
                                    (\object ->
                                        { model
                                            | selectedObjects =
                                                List.map object_idOf [ object ]
                                        }
                                    )
                        )
            )
        |> Maybe.withDefault model
avh4 commented 7 years ago

Here's a possible alternative to the originally-posted code:

x : Direction -> Model -> Model
x direction model =
    let
        primarySelected =
            selectedObjects model
                |> List.head

        floorObjects =
            model.floor
                |> Maybe.map editingFloor_present
                |> Maybe.map floor_objects
                |> Maybe.withDefault []

        nearestFloorObject primarySelected =
            objectsOperation_nearest direction primarySelected floorObjects
    in
        { model
            | selectedObjects =
                primarySelected
                    |> Maybe.andThen nearestFloorObject
                    |> Maybe.map object_idOf
                    |> Maybe.map List.singleton
                    |> Maybe.withDefault model.selectedObjects
        }
jinjor commented 7 years ago

This refactoring makes sense a lot. Thank you :)

One point I care about is that the benefit of Html.Lazy.lazy decreases because re-assigning record property (selectedObjects) will lose the original record's reference.

jligeza commented 7 years ago

Wouldn't this be better?

a
  |> \x -> x
  |> \y -> y
  |> \z -> z

I have a pipe with a single small lamba in the middle of it, and it increases the indentation for no reason, what makes the pipe harder to read.

timjs commented 7 years ago

I just encountered that the experimental version (0.7.0) formats the following code

checkPattern Unpack preTy >>= \preEnv ->
checkPattern Unpack postTy >>= \postEnv ->
checkFlow (Env.merge preEnv env) body >>= \bodyEnv ->
checkSubset postEnv.values bodyEnv.values

like this

checkPattern Unpack preTy
    >>= (\preEnv ->
            checkPattern Unpack postTy
                >>= (\postEnv ->
                        checkFlow (Env.merge preEnv env) body
                            >>= (\bodyEnv ->
                                    checkSubset postEnv.values bodyEnv.values
                                )
                    )
        )

(Yes, I use bind sometimes :wink:)

I can live with the indentation, because I think it makes scoping explicit, but I'm not sure about two things:

  1. Why are lines after a bind indented two levels?
  2. Do we need to add parens for the lambda?

Btw: when using andThen the parens are needed and the formatting uses one indent level at a time.

After fixing the double indentation (1):

checkPattern Unpack preTy
    >>= (\preEnv ->
        checkPattern Unpack postTy
            >>= (\postEnv ->
                checkFlow (Env.merge preEnv env) body
                    >>= (\bodyEnv ->
                        checkSubset postEnv.values bodyEnv.values
                        )
                )
        )

and removing parens (2):

checkPattern Unpack preTy
    >>= \preEnv ->
        checkPattern Unpack postTy
            >>= \postEnv ->
                checkFlow (Env.merge preEnv env) body
                    >>= \bodyEnv ->
                        checkSubset postEnv.values bodyEnv.values

What do you think? I prefer the parens-less version, it looks less cluttered and they are actually redundant. But removing the double indentation already improves a lot!

Edit: the code looks even better when we remove the indentation before the bind operator. You can perfectly see where each lambda variable is bound! This formatting obviously has consequences for a lot of other operators, like (|>) though...

checkPattern Unpack preTy
>>= \preEnv ->
    checkPattern Unpack postTy
    >>= \postEnv ->
        checkFlow (Env.merge preEnv env) body
        >>= \bodyEnv ->
            checkSubset postEnv.values bodyEnv.values
xarvh commented 6 years ago

Has there been any more thought on this?

I think it's a major bug with elm-format.

This is ugly, but at least is readable and maintainable:

mapDecoder : Decoder Map
mapDecoder =
        (field "name" <| Decode.string) |> Decode.andThen (\name ->
        (field "author" <| Decode.string) |> Decode.andThen (\author ->
        (field "mainBases" <| listOfTilesDecoder) |> Decode.andThen (\mainBases ->
        (field "smallBases" <| listOfTilesDecoder) |> Decode.andThen (\smallBases ->
        (field "wallTiles" <| listOfTilesDecoder) |> Decode.andThen (\wallTiles ->
          Decode.succeed
            { name = name
            , author = author
            , mainBases = mainBases
            , smallBases = smallBases
            , wallTiles = wallTiles
            }
        )))))

But the elm-format output is:

    (field "name" <| Decode.string)
        |> Decode.andThen
            (\name ->
                (field "author" <| Decode.string)
                    |> Decode.andThen
                        (\author ->
                            (field "mainBases" <| listOfTilesDecoder)
                                |> Decode.andThen
                                    (\mainBases ->
                                        (field "smallBases" <| listOfTilesDecoder)
                                            |> Decode.andThen
                                                (\smallBases ->
                                                    (field "wallTiles" <| listOfTilesDecoder)
                                                        |> Decode.andThen
                                                            (\wallTiles ->
                                                                Decode.succeed
                                                                    { name = name
                                                                    , author = author
                                                                    , mainBases = mainBases
                                                                    , smallBases = smallBases
                                                                    , wallTiles = wallTiles
                                                                    }
                                                            )
                                                )
                                    )
                        )
            )

Which is not maintainable by a human.

avh4 commented 6 years ago

From what I'm currently seeing in the community, I think the current recommended practice for the last example would be:

mapDecoder : Decoder Map
mapDecoder =
    Decode.map5 Map
        (field "name" Decode.string)
        (field "author" Decode.string)
        (field "mainBases" listOfTilesDecoder)
        (field "smallBases" listOfTilesDecoder)
        (field "wallTiles" listOfTilesDecoder)

Though that still has the issue ghost mentioned https://github.com/avh4/elm-format/issues/352#issuecomment-298824192

xarvh commented 6 years ago

Yup. I'd really like to avoid argument swaps, especially since it can happen by just reordering the attributes in the record.

avh4 commented 6 years ago

For completeness, there are multiple ways to help avoid mixing up the attributes:

In what cases should nesting be preferred over other alternatives? I think defining more types is the traditional functional-programming solution to this problem--is it common for people to do that in Elm?

rtfeldman commented 6 years ago

Here are some points to add to the discussion:

  1. Since the library came out, I have heard about the concern that people can swap arguments in decode pipeline and cause bugs.
  2. Nearly everyone I talk to who does any JSON decoding uses that library to do it.
  3. Even with so many people using the library, in practice I do not hear people complaining that they have the pain point where they swapped the arguments and it caused bugs for them.

I understand that this can happen, but I don't think it's fair to say that it's a "major bug with elm-format" that elm-format doesn't accommodate a technique that would let you avoid using one of the most popular libraries in the Elm ecosystem, out of concern for a problem that isn't reported to be a significant problem in practice.

xarvh commented 6 years ago

Fair enough. =)

avh4 commented 6 years ago

I do not hear people complaining that they have the pain point concern for a problem that isn't reported to be a significant problem in practice

There are people expressing that pain point in this very thread. Please help keep the conversation here productive by not invalidating other people's direct experiences.

rtfeldman commented 6 years ago

Argh, I'm sorry! I definitely do not want to invalidate anyone's experience.

My response was based my understanding that this is a hypothetical concern rather than one based on direct experience. I read "the advantage is that it's harder to swap arguments by mistake" as distinct from "I have tried it that way and personally experienced the pain of swapping arguments by mistake."

In retrospect, a better way to convey my point might have been to ask for clarification, e.g. "have you tried using decode pipeline to see if this concern materializes for you in practice?"

xarvh commented 6 years ago

I should have stated this more clearly myself: it happened a couple of times in production.

rtfeldman commented 6 years ago

Wow, mea maxima culpa! Sorry for assuming incorrectly.

rtfeldman commented 6 years ago

I think I also missed an opportunity to ask a pertinent question to the thread:

Given that Elm's design encourages "one way to do it" (and one of elm-format's goals is to prevent team arguments over personal preferences), is there a case to be made that there is "one best way to do it" for JSON decoding, or is it better to provide good support for multiple alternative approaches?

Possibilities that occur to me:

  1. mapN and decode pipeline should not be used. It's better to use a nested andThen style because it is more resilient to ordering mistakes. If this is true, I see a clear case that elm-format should change for the sake of the andThen style.
  2. mapN and decode pipeline should be used, and they are better than using a nested andThen style because they are significantly more concise but insignificantly more error-prone. If this is true, I don't see a clear case that elm-format should change for the sake of the andThen style.
  3. All of these styles are fine. If this is true, is one style significantly better than the other for certain projects? (If so, which projects are better suited to which style?) If not, does which style to use come down to personal preference? (If it's mostly personal preference, what happens when some team members prefer one style and others on the team prefer the other style? Preventing team arguments over mutually exclusive personal preferences has been a longstanding design goal of elm-format.)
avh4 commented 6 years ago

I think it's beyond the scope of elm-format to decide what the "one way to do it" should be for the elm community. elm-format's goal is to be the best tool given the community's current practices. Debating which way people should do it should be done elsewhere (discourse or slack); For the purposes of this issue, this is the current summary as I understand it:

Open questions for me:

xarvh commented 6 years ago

I would like to make a small point: if a particular construct becomes easier to use, people will use it. If nested lambda are a pain because of how elm-format indents them, people will get used to the alternative instead.

xarvh commented 6 years ago

Elm has already a solution that I think is very good:

(&>) = flip Decode.andThen

mapDecoder : Decoder Map
mapDecoder =
    field "name" Decode.string &> \name ->
    field "author" Decode.string &> \author ->
    field "halfWidth" Decode.int &> \halfWidth ->
    field "halfHeight" Decode.int &> \halfHeight ->
    field "mainBases" setOfTilesDecoder &> \mainBases ->
    field "smallBases" setOfTilesDecoder &> \smallBases ->
    field "wallTiles" setOfTilesDecoder &> \wallTiles ->
      Decode.succeed
          { name = name
          , author = author
          , halfWidth = halfWidth
          , halfHeight = halfHeight
          , mainBases = mainBases
          , smallBases = smallBases
          , wallTiles = wallTiles
          }

It does not require any new syntax and IMHO is the best solution available short of automated decoders.

The two problems are that 1) this solution is currently not compatible with elm-format and I have no clue how difficult it would be to implement and that 2) in the future we'll need (&>) to be exposed by Json.Decode.

avh4 commented 3 years ago

A detailed write-up of the impact of the current formatting was added here: https://github.com/avh4/elm-format/issues/734#issuecomment-798966212

xarvh commented 3 years ago

This tool we developed at Webbhuset removes extra newlines and indents:

https://gist.github.com/xarvh/1b65cb00e7240f1ccfa0bdbf30f97c62

ie from

eval (App e1 e2) =
    State.do (eval e1) <|
        \f1 ->
            State.do (eval e2) <|
                \f2 ->
                    State.pure (App f1 f2)

to

eval (App e1 e2) =
  State.do (eval e1) <| \f1 ->
  State.do (eval e2) <| \f2 ->
  State.pure (App f1 f2)

It's battle tested and fairly reliable.

This was important for us because we have a very large code base that requires a lot of do-notation. While this is a hack and the clean solution would be to modify/fork elm-format, this does what we need and we can finally run elm-format on all our codebase rather than "only if the module does not use do-notation, and only if I remember, and only if it won't turn all diff history upside down"...