cdepillabout / pretty-simple

pretty-printer for Haskell data types that have a Show instance
https://hackage.haskell.org/package/pretty-simple
BSD 3-Clause "New" or "Revised" License
243 stars 29 forks source link

indentation in pretty-printed strings #24

Closed Wizek closed 6 years ago

Wizek commented 6 years ago

Current output:

( 1
, ( 2
  , Here is some show output
that has newlines
in it
  , 3
  )
)

Desired output:

( 1
, ( 2
  , Here is some custom show output
    that has newlines
    in it
  , 3
  )
)

Can be reproduced by e.g.

data Foo = Foo
instance Show Foo where show _ = "foo\nbar\nbaz"
main = pPrint (1, (2, Foo, 3))

Implementing this may be as simple as

unlines . ("  " <>) . lines

What do you say, @cdepillabout?

p.s. I love your library.

p.p.s I've just noticed that even strings with newlines seem to cause the same issue: pPrint (1, (2, "foo\nbar\nbaz", 3)). My proposed solution above would solve this case as well.

p.p.p.s In case you agree with my 'prognosis' and proposed 'treatment', would you prefer implementing the fix or would you like if I submitted a pull request?

cdepillabout commented 6 years ago

@Wizek

I agree that for short strings,

( 1
, ( 2
  , Here is some custom show output
    that has newlines
    in it
  , 3
  )
)

looks better than

( 1
, ( 2
  , Here is some show output
that has newlines
in it
  , 3
  )
)

However, for longer strings (for instance, strings that are over 120 characters and will wrap on a monitor), I'm not yet convinced that indenting after new lines will necessarily produce the easiest-to-read output.

Unfortunately, I don't have time to fix this, and no one else has posted about it, so if you want to attempt the solution described in your comment, please feel free. I would be able to review any PRs.

Please be warned that the code for doing the output is not very good. Ideally, it would use something like a state monad for keeping track of the output state (like done in hindent). Right now it is basically being done by hand.

Wizek commented 6 years ago

Glad to read that you are open to a PR. I will look into if I can implement this. I have a hunch that it will be simpler to do so than you imagine, e.g. no state tracking will be required. This is where the magic happens, isn't it?


As for longer lines: That's an interesting question that I hadn't considered. I've mainly ran into this with shorter lines that have newlines in them so far. Here are a few ideas that came to me how we could tackle those too:

1) Leave them be while indenting. Which will wrap like this:

```hs
( 1                                       |
, ( 2                                     |
  , Here is some custom show output ......|
...................                       |
    that has newlines ....................|
.........                                 |
    in it and long lines too              |
  , 3                                     |
  )                                       |
)                                         |
```

Some people may be able to set their terminals not to word-wrap in this case, and allowing them to scroll sideways. Maybe `tmux` can change this on the fly with a key combo. It's also possible with IIRC `less +S`, making it look like:

```hs
( 1                                       |
, ( 2                                     |
  , Here is some custom show output ......|
    that has newlines ....................|
    in it and long lines too              |
  , 3                                     |
  )                                       |
)                                         |
```

1) Set a character limit, and word wrap ourselves based on that. Perhaps using a special unicode char indicating that we did so:

```hs
( 1                                       |
, ( 2                                     |
  , Here is some custom show output ..⋱   |
    .......................               |
    that has newlines ................⋱   |
    .............                         |
    in it and long lines too              |
  , 3                                     |
  )                                       |
)                                         |
```

Info: https://www.fileformat.info/info/unicode/char/22F1/index.htm

1) For Strings we may also consider something like

```hs
( 1                                          |
, ( 2                                        |
  , ""                                       |
    <> "Here is some custom show output .."  |
    <> ".......................\n"           |
    <> "that has newlines ................"  |
    <> ".............\n"                     |
    <> "in it and long lines too             |
  , 3                                        |
  )                                          |
)                                            |
```

And whichever we choose from this, while having a sensible default, it may be configured with OutputOptions.

For a start I think it makes sense to implement nr. 1 as that seems to be the simplest.

cdepillabout commented 6 years ago

I think this case will have to be modified. However I don't know how easy it will be to figure out how much to indent each subsequent line.


I agree that your 1. is probably the easiest to implement, and the easiest to understand for the end user.

Wizek commented 6 years ago

Yes, since then I've found that case as well, attempting to modify it currently.

I'm also writing a new doctest for this, running it with stack test. Would you happen to know if I can somehow speed up the feedback cycle between file changed and tests ran? I am usually able to do that with ghcid and hspec to be sub-1 second. Is something similar possible with doctest?

Wizek commented 6 years ago

Some progress so far: https://github.com/Wizek/pretty-simple/commit/4a26a538f6

Currently failing with

### Failure in src/Text/Pretty/Simple.hs:209: expression `pPrintOpt cfg (1, (2, "foo\nbar\nbaz"), 3)'
expected: ( 1
          , ( 2
            , "foo
              bar
              baz"
            , 3
            )
          )
 but got: ( 1
          , 
              ( 2
              , "foo
            bar
            baz"
              ) 
          , 3
          ) 
Examples: 52  Tried: 52  Errors: 0  Failures: 1

I have a suspicion that this is due to Output (OutputIndent) being processed as a flat structure instead of as a nested one. So maybe we need to do this indenting earlier in the pipeline where Output is produced, where the nested structure is still present.

Wizek commented 6 years ago

I've found out about outputOptionsIndentAmount, which gets us very close: https://github.com/Wizek/pretty-simple/commit/85df411

### Failure in src/Text/Pretty/Simple.hs:209: expression `pPrintOpt cfg (1, (2, "foo\nbar\nbaz"), 3)'
expected: ( 1
          , ( 2
            , "foo
              bar
              baz"
            , 3
            )
          )
 but got: ( 1
          , 
              ( 2
              , "foo
                bar
                baz"
              ) 
          , 3
          ) 
Examples: 52  Tried: 52  Errors: 0  Failures: 1

I wonder where that extra rogue newline comes from.


Nevermind, it was just some mistaken assumption on my part how the current rendering looked.

I've pushed some new commits which I believe might just implement this feature entirely; sending a PR shortly.

cdepillabout commented 6 years ago

Is something similar possible with doctest?

I think you should be able to call doctest directly (instead of letting stack run it for you). If you end up doing more work on pretty-simple and want to use this, feel free to ask about it.