Open dterei opened 12 years ago
Isn't GHC using upstream pretty
by now?
No, when I started on this it highlighted a choice that you can make in pretty that is ambiguous by the laws pretty uses. GHC makes the choice one way and external pretty makes it the other. That blocked me at the time and now the internal GHC pretty has diverged further I believe. Not hard to resolve, but it isn't.
GHC moved away from using literate haskell recently, the module has moved here.
Is the only thing blocking GHC adoption of the external pretty library someone doing the work?
No also blocking is a slight difference in behavior. There is one situation where the laws for pretty are ambiguous and leave room for choice. GHC decided one way and pretty the other.
I may just close this ticket as the downside of ghc having its own copy is very small
On Mar 10, 2015, at 11:03 PM, Trevor Elliott notifications@github.com wrote:
GHC moved away from using literate haskell recently, the module has moved here.
— Reply to this email directly or view it on GitHub.
Ah, dang. I was hoping to extend the GHC pretty printer with the annotations, to get functionality like in the idris repl. Maybe extending their forked version with the annotations is the right path forward for that work :)
Well if you're going to be putting effort into the pretty fork in GHC, then it may make sense to try to unify the two. It's an easy attempt, just try using the pretty library instead of GHC's forked version and see if any GHC output breaks and why.
Especially as pretty now has support for annotations in HEAD.
Yes, I was reluctant to start until the package had been released, hence #21.
I am working on making GHC's compiler/utils/Pretty.hs
as similar as possible to src/Text/PrettyPrint/HughesPJ.hs
from pretty-1.1.2.1 (before the annotation changes). This is a first step towards reconcillation between the two.
Edit: see https://ghc.haskell.org/trac/ghc/ticket/10735
Some obvious differences:
FastString
, which is supposedly necessary for performanceEdit: more differences, GHC's version defines:
quotes p = char '`' <> p <> char '\''
rational n = text (show (fromRat n :: Double))
Great! Thanks for doing this thomie.
@thomie What's the status of your work? Is the only difference now with pretty
and ghc
the use of FastString
?
Status: I don't know how to proceed.
ghc
's version resembles pretty-1.1.2.0
. To see the differences, do this in a ghc git tree:
$ cd libraries/pretty
$ git checkout v1.1.2.0
$ cd ../../
$ vimdiff compiler/utils/Pretty.hs libraries/pretty/src/Text/PrettyPrint/HughesPJ.hs
Notable differences:
TextDetails
uses Faststring
Other difference are minor. Both copies define some extra functions and instances not defined in the other copy.
GHC performance is very sensitive to Pretty changes. That FastString
hack is absolutely necessary.
As mentioned in https://ghc.haskell.org/trac/ghc/ticket/10735#comment:23, for parity with pretty-1.1.2.1
the following two commits would also be needed. I did not land them, because they cause more allocation in the compiler (see discussion in #9):
After that come the Annotated/
changes, which is where I stopped.
OK, thanks for the update. Would supporting ByteString
in Pretty be sufficient you think to provide the performance GHC needs?
Given FastString
is just a ByteString
with a hash associated with it for faster Eq
, seems like that would be fine.
obvious side remark (but it's really a separate issue): is this "ByteString with a hash associated with it for faster Eq" worthy of becoming a stand-alone library?
I take it that Eq
is not used for "pretty" but I can think of several other use cases. But then, ByteString
has O(1) slicing (take, drop), and this would not work with the hashes.
Maybe, if it doesn't exist someone can just create it on Hackage, not really much point discussing it here. The alternative would be to pull FastString
out of GHC into it's own library, which is a discussion for ghc-dev mailing list. I think that wouldn't be worth while though.
Would supporting ByteString in Pretty be sufficient you think to provide the performance GHC needs?
I guess so.
@bgamari has been toying with the idea of using a different pretty printing library than Pretty (https://ghc.haskell.org/trac/ghc/ticket/8809#comment:21), you might want to discuss with him.
Maybe switching to a different library would give better performance also?
GHC performance is very sensitive to Pretty changes. That FastString hack is absolutely necessary.
@thomie: Do you understand why that is? My intuition fails me here, I'd expect pretty printing speed to not matter in practice.
@niteria: you can probably ask @simonmar more about it, I know he's aware of the problem.
Subtle performance issues I've encountered:
My guess: there is a simply a lot of code to prettyprint (cmm, assembly, llvm). Combine that with nobody really understanding the HughesPJ algorithm or the Pretty source code, and it is very easy to introduce an "accidentally quadratic".
The Pretty
library runs in two modes: the normal mode where it is nicely indenting things, and a fast mode where it doesn't indent anything. The fast mode is used for generating assembly and LLVM, and we really really care about it being fast.
The fast way should probably use a ByteString
builder instead of the current hack, actually.
Thank you @thomie and @simonmar for context! I profiled GHC while compiling one of the pretty
files and pprNativeCode
is 13% of allocations and reduceDoc
which pprNativeCode
uses through bufLeftRender
is 6.7%. It seems like we can do better here, especially since the output format doesn't have to be pretty. I've tried to get rid of reduceDoc
in this case, but different kinds of empty Doc
s make it difficult and I don't understand all the semantics yet to have anything correct.
I'm currently looking at modifying the Doc
type to take an extra parameter, so that TextDetails
can be swapped out for ByteString
or Text
. I've got the basics of the abstraction together, I just need to get the pragmas in the right pace to inline / specialize things, and then check the core that comes out of that and some benchmarks.
GHC has an internal copy of pretty. I've gone and merged any improvements from GHC into pretty, so we can be safe that the code in pretty is the best code. However GHC uses some different base types for performance. We need to offer better base types than String. Say ByteString and Text and also offer a builder approach to the render function for performance. After that improvement should be able to change GHC to use the pretty library.