avh4 / elm-format

elm-format formats Elm source code according to a standard set of rules based on the official Elm Style Guide
BSD 3-Clause "New" or "Revised" License
1.31k stars 148 forks source link

Would it be possible to ignore blocks of code? #734

Closed 4z3 closed 3 years ago

4z3 commented 3 years ago

Would it be possible to support ignoring certain blocks of code from getting reformatted? In particular when using https://github.com/webbhuset/elm-json-decode it would be desirable to keep their recommended format:

blogpost : Decoder BlogPost
blogpost =
    Field.require "title" Decode.string <| \title ->
    Field.requireAt ["author", "name"] Decode.string <| \authorName ->
    Field.require "content" Decode.string <| \content ->

    Decode.succeed
        { title = title
        , author = authorName
        , content = content
        }

The above will be reformatted to following, which is unacceptable especially when decoders becomes more complex:

blogpost : Decoder BlogPost
blogpost =
    Field.require "title" Decode.string <|
        \title ->
            Field.requireAt [ "author", "name" ] Decode.string <|
                \authorName ->
                    Field.require "content" Decode.string <|
                        \content ->
                            Decode.succeed
                                { title = title
                                , author = authorName
                                , content = content
                                }

At the moment I'm considering exempting some files from getting elm-formatted but this is rather suboptimal as the majority of the files' contents should adhere to elm-format. There are other suboptimal possibilities like splitting the code of decoders, etc.

avh4 commented 3 years ago

Thanks for opening the discussion! This is something I'm hesitant about but am open to considering. The last time this was discussed, there was not enough community usage of the nested lambda pattern to merit a change on the scale of introducing an formatting configuration (and specifically, intra-file configuration).

As a side-note, there is discussion about changing the formatting for nested lambda patterns here: https://github.com/avh4/elm-format/issues/352

If you (OP) or anyone else has time to contribute, more details about the following would be helpful in weighing the value of this kind of feature against its costs:

You noted you're aware of this already, but for others who might read this thread, FYI here are the current workarounds to ignore formatting for some code by having certain files or folders of files that are unformatted.

4z3 commented 3 years ago

As a side-note, there is discussion about changing the formatting for nested lambda patterns here: #352

Thanks for mentioning, my cursory search of the issues hasn't brought it up :) I think with that, we can close this issue as duplicate. Either way, I'm going to answer the points here:

specific details about why the current behavior is unacceptable and the current workarounds suboptimal for your particular use

The current behavior of indenting the nested lambdas renders the code unreadable, especially when dealing with lots of fields (too deep indentation causes wrapping of code in the editor) or e.g. when fields get added (hard to spot changes in diffs in reviews because "everything changes" due to indentation). In fact, I have not seen any good alternative approach in dealing with large, handwritten decoders. Of course, generated decoders are better when available and we do generate decoders using openapi-generator when swagger/openapi specs are available. But for example when writing decoders for configuration, no swagger is available and fields get added rather frequently.

some quantification of the impact would also be helpful (how often are such decoders written/read/edited, and how much time is spent dealing with the formatting, and has it led to bugs?)

I can, of course, only talk for myself, but a conservative estimation is "on a weekly basis", both reading and writing. To put that into numbers:

$ echo $(find $(ls -d */src) -name \*.elm -exec grep -H Json.Decode {} +|cut -d: -f1 | sort -u|wc -l) / $(find $(ls -d */src) -name \*.elm|wc -l) | bc
.39285714

So about 40% of the files I'm working with have a decoder, which, albeit often implemented using Json.Decode.mapX, are candidates for nested lambdas. YMMV

details about any benefits you've seen from using nested lambdas here over other approaches that don't need that kind of nesting

The main benefit are:

The typical structure is:

decoder =
    attempt "a" |> \maybeA ->
    attempt "b" |> \maybeB ->
    attempt "c" |> \maybeC ->
    ...
    let
        a =
            Maybe.withDefault ... maybeA

        b =
            Maybe.withDefault ... maybeB

        ...
    in
    succeed
        { a = a
        , b = b
        , c = c
        ...
        }

There might be better approaches doing decoders, but I am not aware of them.

would you always prefer webbhuset/elm-json-decode over other options if starting future projects?

As long as there is no neater approach to write concise decoders, yes, either elm-json-decode or similar (like (&>)) will be preferred in the future.

FYI here are the current workarounds to ignore formatting for some code by having certain files or folders of files that are unformatted

FWIW, as we do like elm-format in general but really do want nested lambdas in some places, we're currently experimenting with this crude wrapper:

#! /bin/sh
# usage: my-elm-format PATH...

temp=$(mktemp -t my-elm-format.XXXXXXXX)
trap 'rm "$temp"' EXIT

yes n | elm-format "$@" | sed -rn '
  /^This will overwrite the following files to use Elm.s preferred style:$/,/^This cannot be undone!/{
    s/^\s*(\S+)$/\1/p
  }
' |
while IFS= read -r path; do
  cat "$path" |
  sed -r '
    /^--elm-format-ignore-begin$/,/^--elm-format-ignore-end$/{
      s/^/--@/
    }
  ' |
  elm-format --stdin |
  sed -r '
    /^--@--elm-format-ignore-begin$/,/^--@--elm-format-ignore-end$/{
      s/^--@//
    }
  ' > "$temp"

  cp "$temp" "$path"
done

(Current version of that script as Gist)

Of course "native support" for nested lambdas or even ignoring blocks of code would be preferable, with a slight preference for the possibility of ignoring blocks of code or otherwise configuring the formatter because you never know if there won't be a better way to lay out code in the future, right? ;)