haskell / pretty

Haskell Pretty-printer library
Other
69 stars 30 forks source link

Replace GHC copy #1

Open dterei opened 12 years ago

dterei commented 12 years ago

GHC has an internal copy of pretty. I've gone and merged any improvements from GHC into pretty, so we can be safe that the code in pretty is the best code. However GHC uses some different base types for performance. We need to offer better base types than String. Say ByteString and Text and also offer a builder approach to the render function for performance. After that improvement should be able to change GHC to use the pretty library.

hvr commented 10 years ago

Isn't GHC using upstream pretty by now?

dterei commented 10 years ago

No, when I started on this it highlighted a choice that you can make in pretty that is ambiguous by the laws pretty uses. GHC makes the choice one way and external pretty makes it the other. That blocked me at the time and now the internal GHC pretty has diverged further I believe. Not hard to resolve, but it isn't.

thomie commented 10 years ago

For future reference: GHC's internal copy of pretty.

elliottt commented 9 years ago

GHC moved away from using literate haskell recently, the module has moved here.

elliottt commented 9 years ago

Is the only thing blocking GHC adoption of the external pretty library someone doing the work?

dterei commented 9 years ago

No also blocking is a slight difference in behavior. There is one situation where the laws for pretty are ambiguous and leave room for choice. GHC decided one way and pretty the other.

I may just close this ticket as the downside of ghc having its own copy is very small

On Mar 10, 2015, at 11:03 PM, Trevor Elliott notifications@github.com wrote:

GHC moved away from using literate haskell recently, the module has moved here.

— Reply to this email directly or view it on GitHub.

elliottt commented 9 years ago

Ah, dang. I was hoping to extend the GHC pretty printer with the annotations, to get functionality like in the idris repl. Maybe extending their forked version with the annotations is the right path forward for that work :)

dterei commented 9 years ago

Well if you're going to be putting effort into the pretty fork in GHC, then it may make sense to try to unify the two. It's an easy attempt, just try using the pretty library instead of GHC's forked version and see if any GHC output breaks and why.

Especially as pretty now has support for annotations in HEAD.

elliottt commented 9 years ago

Yes, I was reluctant to start until the package had been released, hence #21.

thomie commented 9 years ago

I am working on making GHC's compiler/utils/Pretty.hs as similar as possible to src/Text/PrettyPrint/HughesPJ.hs from pretty-1.1.2.1 (before the annotation changes). This is a first step towards reconcillation between the two.

Edit: see https://ghc.haskell.org/trac/ghc/ticket/10735

Some obvious differences:

Edit: more differences, GHC's version defines:

quotes p       = char '`' <> p <> char '\''
rational n = text (show (fromRat n :: Double))
dterei commented 9 years ago

Great! Thanks for doing this thomie.

dterei commented 8 years ago

@thomie What's the status of your work? Is the only difference now with pretty and ghc the use of FastString?

thomie commented 8 years ago

Status: I don't know how to proceed.

ghc's version resembles pretty-1.1.2.0. To see the differences, do this in a ghc git tree:

$ cd libraries/pretty
$ git checkout v1.1.2.0
$ cd ../../
$ vimdiff compiler/utils/Pretty.hs libraries/pretty/src/Text/PrettyPrint/HughesPJ.hs

Notable differences:

Other difference are minor. Both copies define some extra functions and instances not defined in the other copy.

GHC performance is very sensitive to Pretty changes. That FastString hack is absolutely necessary.

As mentioned in https://ghc.haskell.org/trac/ghc/ticket/10735#comment:23, for parity with pretty-1.1.2.1 the following two commits would also be needed. I did not land them, because they cause more allocation in the compiler (see discussion in ​​#9):

After that come the Annotated/ changes, which is where I stopped.

dterei commented 8 years ago

OK, thanks for the update. Would supporting ByteString in Pretty be sufficient you think to provide the performance GHC needs?

dterei commented 8 years ago

Given FastString is just a ByteString with a hash associated with it for faster Eq, seems like that would be fine.

jwaldmann commented 8 years ago

obvious side remark (but it's really a separate issue): is this "ByteString with a hash associated with it for faster Eq" worthy of becoming a stand-alone library?

I take it that Eq is not used for "pretty" but I can think of several other use cases. But then, ByteString has O(1) slicing (take, drop), and this would not work with the hashes.

dterei commented 8 years ago

Maybe, if it doesn't exist someone can just create it on Hackage, not really much point discussing it here. The alternative would be to pull FastString out of GHC into it's own library, which is a discussion for ghc-dev mailing list. I think that wouldn't be worth while though.

thomie commented 8 years ago

Would supporting ByteString in Pretty be sufficient you think to provide the performance GHC needs?

I guess so.

@bgamari has been toying with the idea of using a different pretty printing library than Pretty (https://ghc.haskell.org/trac/ghc/ticket/8809#comment:21), you might want to discuss with him.

Maybe switching to a different library would give better performance also?

niteria commented 7 years ago

GHC performance is very sensitive to Pretty changes. That FastString hack is absolutely necessary.

@thomie: Do you understand why that is? My intuition fails me here, I'd expect pretty printing speed to not matter in practice.

thomie commented 7 years ago

@niteria: you can probably ask @simonmar more about it, I know he's aware of the problem.

Subtle performance issues I've encountered:

My guess: there is a simply a lot of code to prettyprint (cmm, assembly, llvm). Combine that with nobody really understanding the HughesPJ algorithm or the Pretty source code, and it is very easy to introduce an "accidentally quadratic".

simonmar commented 7 years ago

The Pretty library runs in two modes: the normal mode where it is nicely indenting things, and a fast mode where it doesn't indent anything. The fast mode is used for generating assembly and LLVM, and we really really care about it being fast.

The fast way should probably use a ByteString builder instead of the current hack, actually.

niteria commented 7 years ago

Thank you @thomie and @simonmar for context! I profiled GHC while compiling one of the pretty files and pprNativeCode is 13% of allocations and reduceDoc which pprNativeCode uses through bufLeftRender is 6.7%. It seems like we can do better here, especially since the output format doesn't have to be pretty. I've tried to get rid of reduceDoc in this case, but different kinds of empty Docs make it difficult and I don't understand all the semantics yet to have anything correct.

dalaing commented 7 years ago

I'm currently looking at modifying the Doc type to take an extra parameter, so that TextDetails can be swapped out for ByteString or Text. I've got the basics of the abstraction together, I just need to get the pragmas in the right pace to inline / specialize things, and then check the core that comes out of that and some benchmarks.