Closed georgefst closed 2 years ago
I think things like this are tough for pretty-simple
. It is hard to format absolutely everything in the best way possible.
In this particular case, it appears we try to make things slightly easier to read by putting spaces between thing that look like operators, but it does make the dates look different.
However, I don't particularly find things like 2020 - 08 - 22
and 2019 - 02 - 18 20 : 56 : 24.265489 UTC
hard to read. Also, I would assume most users of pretty-simple
would figure out what is going on here, and not be too surprised by this.
So I'm fine with how it works currently, but if you want to figure out some way to handle this case better, that would of course be welcome!
Yeah, I've thought about this a bit and actually it doesn't really bother me either. I can see the value of splitting on operators even without a space.
Although I am slightly curious about what logic the old printer was using in these cases.
A similar, though still very minor issue:
data A (s :: Symbol) = A Text
instance KnownSymbol s => Show (A s) where
show (A x) = "A @" <> show (typeRep @s) <> " " <> show x
λ> print $ A @"type" "val"
A @"type" "val"
λ> pPrint $ A @"type" "val"
A @ "type" "val"
This is admittedly an odd Show instance. But the extra space is still a little ugly.
It bothers me, because when searching for clues in pages of debug output involving many dates and times, anything that keeps it more compact and readable is appreciated.
I have been leaning back towards my opinion from the first comment i.e. we should change the parser to only ever split on spaces. I'd guess this actually simplifies the implementation, but I'm not familiar with that part of the codebase.
We'd do worse with stuff like 2+2
, treating it as a single unit and losing colour. But I think that's fine, especially since derived Show
outputs will always have spaces.
we should change the parser to only ever split on spaces
I suppose if we want to be able to work with compact JSON, for example, we may also want to special-case [
, {
, "
etc.
Another argument for changing this back somehow:
Prelude Text.Pretty.Simple Network.Socket> pPrint $ SockAddrInet 8000 $ tupleToHostAddress (192,168,0,1)
192.168 . 0.1 : 8000
Another case of too many spaces:
print uuid
pPrint uuid
Outputs:
a7ed86f7-7f2c-4be5-a760-46a3950c2abf
a7ed86f7- 7 f2c- 4 be5-a760- 46 a3950c2abf
7, 4 and 46 are surrounded with spaces and are in color.
Another example of this was brought up in #99: 1e-2
pretty-prints as 1.0 e- 2
and not 1.0e-2
.
And in #102: 0xbb
becomes 0 xbb
Looking at it, it's actually pretty hard to change things to only split on spaces without rewriting the parser from scratch in a way that would support backtracking. I've fixed it for hexadecimal numbers and scientific notation, but it still handles things like dates and IP addresses poorly.
The diff is here: https://github.com/cdepillabout/pretty-simple/compare/master...typedrat:fix-spaces
Looking at it, it's actually pretty hard to change things to only split on spaces without rewriting the parser from scratch in a way that would support backtracking. I've fixed it for hexadecimal numbers and scientific notation, but it still handles things like dates and IP addresses poorly.
Yes, I worried that may be the case. It seems that before #67, pretty-simple
was being pretty clever (albeit subtly so) to actually preserve spaces through the whole pipeline. And we all overlooked this fact during implementing and reviewing that PR.
In my view, the reason this issue has stalled for so long is that what we really need is a complete overhaul of the parser, perhaps rewriting it to use some established parsing library, in the same way that #67 rewrote the printer. And none of the current maintainers, myself included, have had the time. But perhaps I'm being too pessimistic.
I'm curious about whether @cdepillabout has the same view, or any estimate on how big a task this is? It was my impression that the parser isn't that complex. Although we need to be careful about laziness.
In my view, the reason this issue has stalled for so long is that what we really need is a complete overhaul of the parser, perhaps rewriting it to use some established parsing library, in the same way that https://github.com/cdepillabout/pretty-simple/pull/67 rewrote the printer. And none of the current maintainers, myself included, have had the time. But perhaps I'm being too pessimistic.
It was my impression that the parser isn't that complex. Although we need to be careful about laziness.
This is basically my same exact feeling. It'd be great if someone wanted to take a shot at this. It really shouldn't be that difficult (like you say, the parser isn't that complex), but it is not completely straightforward. You can't just use an off-the-shelf parser like megaparsec, since in pretty-simple parsing should never "fail", and the whole process needs to be lazy.
Also, like I said in https://github.com/cdepillabout/pretty-simple/issues/75#issuecomment-678574420, I feel like pretty-simple is really just a bunch of heuristics. It tries its best to print out things in a reasonable manner, but not everything is going to look 100% perfect.
Most of the things that have been brought up in this thread (dates, ip addresses, uuids) don't have "valid" Show
instances, so it seems like most users are forgiving of pretty-simple for not formatting them perfectly.
This should be fixed by #105.
If anyone here wants to try out that branch on their projects and see if there are any regressions, that'd be great. All of the examples from this thread have been added as test cases.
67 broke
pPrintString "2020-08-22"
, along with other "broken" Show output from thetime
package.Previous output:
Now:
The printer receives:
I'd say it actually does the right thing, since for most combinations of
Expr
we do want to insert spaces. So I'd class this as an error in the parser, which should instead just return [Other "2020-08-22"]. But this is perhaps a matter of opinion. We would lose syntax highlighting, which I'm fine with, since this isn't an actual numeric literal.PS. I kept the example simple for the sake of discussion. To demonstrate that this really is an issue consider the readability of
2019 - 02 - 18 20 : 56 : 24.265489 UTC
.