tweag / ormolu

A formatter for Haskell source code
https://ormolu-live.tweag.io
Other
956 stars 83 forks source link

Layout is sometimes indented less than the line of the layout trigger #712

Closed cdsmith closed 3 years ago

cdsmith commented 3 years ago

Ormolu currently formats this way:

foo | x <-
        5 = case x of
  5 -> True
  _ -> False

The combination of the extra indent from wrapping the guard, and the case layout to the left of that indent, is very difficult to read. Lots of options would be better, including:

  1. Always Indenting a layout context at least as far as the indent on the line where the layout trigger occurs.
  2. Insisting on a newline after the =, so that this doesn't arise in the first place.

I don't have strong preference between the two, but the current behavior was a real WTF moment.

mrkkrp commented 3 years ago

What was the input that produced this output?

cdsmith commented 3 years ago

I don't recall the exact original input. Obviously x was a much longer pattern (hence the newline), and the case branches were indented further. Something like this:

foo | Some (Really Long (Pattern x) Goes Here) <-
      someOther really long expression goes here = case x of
        5 -> True
        _ -> False
mrkkrp commented 3 years ago

Thinking of what the output should be, I think I don't mind that the second line is shifted more than the rest, but maybe it should just be shifted by one indentation step compared to the body of case:

foo | x <-
    5 = case x of
  5 -> True
  _ -> False
tbagrel1 commented 3 years ago

I was also thinking of what could be the output

I thought of

foo
  | x <- 5 = case x of
    5 -> True
    _ -> False

(which I prefer)

or

foo | x <-
        5 =
  case x of
    5 -> True
    _ -> False

The first one gives more room to the long pattern and expression, and limits diffs if just the function name change.

mrkkrp commented 3 years ago

We can't really do arbitrary formatting (well, we can, but it is harder and more error-prone). If a construction has a multiline span it must preferably stay multiline, similarly single line must stay single line. So, since it is x <- y that is split in this case, the split should happen in the same place. Ignoring this rule usually leads to subtle problems as certain things start having the potential of switching from single-line to multi-line configurations (and vice versa) on subsequent runs of the tool, hence idempotence is compromised.

tbagrel1 commented 3 years ago

Given your layout @mrkkrp, should the 2nd line be indented with 2 levels, or indented to match the pipe (I'm not sure, because in the current example, both corresponds to 4 spaces)?

I'm not really convinced that we can provide a satisfying fix for @cdsmith here, because if a line break is introduced in the <- bind expression by the user, it means that the left side and/or the right side is very wide.

tbagrel1 commented 3 years ago

@mrkkrp, I managed to spot where this "issue" is located in the code, but now I have more questions :)

foo | x <-
    5 = case x of
  5 -> True
  _ -> False

or this (2):

foo | x <-
  5 = case x of
    5 -> True
    _ -> False

?

foo | x <-
    5,
      y <-
    6 = case x of
  5 -> True
  _ -> False
mrkkrp commented 3 years ago

It looks like currently the length of the function name determines indentation. This is not ideal. I think that the fact that the guard is multiline means that we could insert a newline (only when the guard is multiline) in front of | and this will be safe wrt idempotence:

foobar
  | x <-
      5,
    y <-
      6 = case x of
  5 -> True
  _ -> False

This doesn't look particularly good to me, but it is a strict improvement wtf to minimal diffs.

mrkkrp commented 3 years ago

Indentations above all should be done with the inci (increase indentation) combinator. The pipe is one level indented (2 spaces), then x and y are two-levels indented, 5 and 6 are three levels indented.

tbagrel1 commented 3 years ago

In that case, shouldn't we indent the 5 -> True and _ -> False one level more?

mrkkrp commented 3 years ago

Thinking about the case with several guards, I think it makes sense to intent that part one level more, good catch :+1:

tbagrel1 commented 3 years ago

Is this correct?

Should we break line before |?

Multiple guards -> yes (already implemented)

Single guard:

1 pattern bind 2+ pattern binds
All single line no yes
At least one multiline yes yes

We can note that when there are multiple guards, ormolu seems to do what we want already:

foobarbar :: Int -> Bool
foobarbar | x <-
    5,
    y <-
    6 = case x of
  5 -> True
  _ -> False
          | otherwise = True

becomes

foobarbar :: Int -> Bool
foobarbar
  | x <-
      5,
    y <-
      6 = case x of
    5 -> True
    _ -> False
  | otherwise = True

(However, if the 2nd pattern bind is single-line for example, ormolu currently produces this:

foobarbar :: Int -> Bool
foobarbar
  | x <-
      5,
    y <- 6 = case x of
    5 -> True
    _ -> False
  | otherwise = True

)

So in fact, we would like to treat 1 multi-line pattern bind, or multiple pattern binds (either single or multi-line) the same way we treat multiple guards, am I correct?

Should we also enforce that all pattern binds be multiline if one of them are (to improve the result of the last example)? Or maybe should we just change the indent of the case arms in this last example?

mrkkrp commented 3 years ago

So in fact, we would like to treat 1 multi-line pattern bind, or multiple pattern binds (either single or multi-line) the same way we treat multiple guards, am I correct?

Yes.

Should we also enforce that all pattern binds be multiline if one of them are (to improve the result of the last example)? Or maybe should we just change the indent of the case arms in this last example?

Indent should be increased one level for the body of case, but y <- 6 should stay single line and should not be artificially made multiline.

tbagrel1 commented 3 years ago

But if

Indent should be increased one level for the body of case

Then we will have a difference of indentation when y <- 6 is single line, and when it is multiline

mrkkrp commented 3 years ago

Body of case should be indented more only when we put | on its own line. Single vs multi line for y <- 6 should make no difference.