haskell / pretty

Haskell Pretty-printer library
Other
70 stars 31 forks source link

Support alternative textual representations #42

Open adinapoli opened 7 years ago

adinapoli commented 7 years ago

Hey @dterei !

As the title suggests, this is yet another attempt to generalise pretty to support alternative textual representations, such that it would pave the way for #1 .

Now, I know that a lot has been written in the past, but I'm giving it another go mainly as a means for me to write down an experience report of two days of tinkering which led nowhere.

Introduction

The bigger goal is to replace GHC's copy of pretty with this library, but to the best of my knowledge there are two main blockers:

  1. GHC uses some ad-hoc textual representations for performance reasons. This doesn't go along well with pretty's user-facing API, which exposes combinators like text (see below).

  2. There are a couple of commits by @thomie which have been reverted due to some allocation regression in the compiler.

I think 2. is orthogonal to the scope of this ticket (but certainly relevant to the bigger goal), so I'm going to focus on 1. only.

In this ticket, @bgamari summarise the issue quite neatly:

The other related issue is the inability to make use of FastString with the upstream pretty since the text combinator has type String -> Doc. This is actually a very poor interface as text needs the length of the string, which is an O(n) operation. Really it should at least allow usage of a more efficient string representation. niteria and I discussed this a few months ago and agreed that this could be resolved by simply generalizing text over its string type with a typeclass providing an unpack and length function.

Starting from this ticket, I have started exploring different options, but none of my designs seemed to lead me somewhere. Therefore I'm seeking help from you fine hackers about how to proceed. Hereby follows a couple of failed attempts with a streams of thoughts of what made me pick certain decisions over something else. I'm relying on memory here, so please do not hold anything against me if some of these ramblings are incomplete or imprecise 😉

Attempt 1: Start top-down from the combinators

I started by doing exactly what Ben suggested; I created a type class like this (naming is hard, I picked this name just to avoid confusion with other existing textual types):

class RuneSequence a where
    len :: a -> Int
    unpack :: a -> String

This of course allows us to generalise things like text like the following:

text :: RuneSequence r => r -> Doc a
text s = case len s of {sl -> textBeside_ (NoAnnot (Str s) sl) Empty}

This won't compile though as it calls internally textBeside_ which relies on TextDetails and the latter "leaks" its internals, by which I mean the Str constructor, which expects a String. One could argue I could then use unpack to write the following:

text :: RuneSequence r => r -> Doc a
text s = case len s of {sl -> textBeside_ (NoAnnot (Str $ unpack s) sl) Empty}

But this left a sour taste in my mouth:

  1. unpack can be costly, especially if we do lots of it. I really don't want to make the performance of the library any worse;

  2. It really doesn't solve the crux of the problem: If we compare GHC's TextDetails, we will see it's defined like this:

data TextDetails = Chr  {-# UNPACK #-} !Char -- ^ A single Char fragment
                 | Str  String -- ^ A whole String fragment
                 | PStr FastString                      -- a hashed string
                 | ZStr FastZString                     -- a z-encoded string
                 | LStr {-# UNPACK #-} !LitString {-#UNPACK #-} !Int
                   -- a '\0'-terminated array of bytes

In order to have a chance of unifying the two libraries, one possibility I saw was to abstract away TextDetails, which led me to the second bikeshed:

Attempt 2: Polymorphic AnnotDetails

I asked myself "Can I abstract away TextDetails altogether, so that user code can plug its own TextDetails"? In brief, write the following:

data AnnotDetails r a = AnnotStart
                    | NoAnnot !r {-# UNPACK #-} !Int
                    | AnnotEnd a
                      deriving (Show,Eq)

I have one big problem with this and with all its variations: it will make the r parameter to bubble up in the Doc type, to the point which it will become Doc r a. You might argue this could read like "This is a Doc node annotated with an annotation of type a and an inner textual representation of type r". This though introduces a massive breaking change in the API (no good) and I suspect we'll still need to be rely on type classes anyway (like RuneSequence or similar) as most of the combinators are using the constructors of TextDetails anyway. So, up to the next.

Attempt 3: Add a new constructor to TextDetails

In brief, write the following:

data TextDetails r = Chr  {-# UNPACK #-} !Char
                 | Str  String
                 | ...
                 | Runes r                     -- user provided

This might reconcile the two worlds, but it has several problems as well:

  1. It leaks a new type param r like solution 2, so it's a no-go
  2. Won't save us from using type classes anyway

I have tried to remove the need for the extra type param with a RankNType annotation like this:

data TextDetails r = Chr  {-# UNPACK #-} !Char
                 | Str  String
                 | ...
                 | Runes (forall r. RuneSequence r => r)                    -- user provided

Although this might work (but I doubt) I couldn't use it as TextDetails needs to derive Show, Eq and Generic. The first two can be derived via StandaloneDeriving, but the third cannot, and we cannot write it manually by hand due to the Safe constraint we have on the package. Argh!

At this point, I felt like I have reached an impasse. Thanks for reading up to this point, and hopefully somebody can chime in and tell me if there is a simple solution to this riddle 😉

Alfredo

bgamari commented 7 years ago

I can see two options here.

Quantify over the string type

That is,

data TextDetails = Chr  {-# UNPACK #-} !Char -- ^ A single Char fragment
                 | forall r. RuneSequence r => Str r  -- ^ A whole string fragment

Under this scheme Str encodes two pointers: one to the RuneSequence r dictionary and another to the r itself.

Use laziness to defer projecting to a String

That is,

data TextDetails = Chr  {-# UNPACK #-} !Char -- ^ A single Char fragment
                 | Str Int String -- ^ A whole String fragment

str :: RuneSequence r => r -> TextDetails
str x = Str (len x) (unpack x)

Under this scheme Str encodes two pointers: each to a closure to the respective RuneSequence method applied to x.

adinapoli commented 7 years ago

Hey Ben,

thanks for reading my rumination up until the end!

Quantify over the string type

I would have loved to do this, but this looks similar to the approach I have tried already (I have omitted this, but I have indeed tried both the RankNType variant Str (forall r. RuneSequence r => r) and the ExistentialQuantification one, like forall r. RuneSequence r => Str r. The problem are always the derived instances, as this version of pretty (unlike the GHC fork) derives Show, Eq and Generic. As I have said above, I can generate the first two via StandaloneDeriving, but the dealbreaker is the generic one 😢 Trying to do so will yield:

Can't make a derived instance of ‘Generic TextDetails’:
      Str must be a vanilla data constructor
    In the stand-alone deriving instance for ‘Generic TextDetails’

and I cannot generate this manually due to the Safe PRAGMA.

Use laziness to defer projecting to a String

Ah, this is interesting instead! If I've got you you are saying, in other terms, that we might still be able to get some speedup thanks to the deliberate use of a more efficient len implementation (like the O(1) provided by FastString) but push down any real "rendering" work at the very last time, namely in the guts of a Doc, when we perform the rendering itself and force the nodes of the AnnotDetails and thus TextDetails. Am I right?

I think that looks plausible! I would be interested to see the memory tradeoff we will have to make here (although Str is already lazy in the current version, so I wouldn't expect any regression in terms of thunks piling up). I guess the best way is to try and see, right? 😉

dterei commented 7 years ago

@adinapoli Great work! Thanks for such a nice write up :) Sorry for the delay in replying here. I'm currently on holiday for a few weeks and travelling so sporadic time for "work" stuff.

Let's step back a moment. I think there are two ways to think about this at a high-level before we bring in the mechanisms that Haskell gives us.

Approach 1: Abstract the base storage from

We need some way to store actual strings and characters. Right now, we use two concrete types for that, String and Char. The problem is that this is believed to be inefficient. Which leads us to a few questions to help guide us:

  1. Why are String and Char inefficient? It seems only String is inefficient, not Char and I like that you've potentially identified why. The call to length, which is O(n), in the functions text and ptext.
  2. How bad is the performance compared to say FastString?

I'd be curious to know once we had a benchmark, how well String could perform if we just added the following method: text_efficient :: String -> Int -> Doc a, where the Int was the length of the string to avoid needing to call length itself. Not saying this is a solution, but if it performs as well as a more efficient representation like FastString, at least we know where the fundamental issue is.

Then, the approach (which is what you've tried so far) is to move away from String to an abstracted base storage type that the client of the library can control. Can use TypeClasses to do this, or potentially just functions which define the right interface to capture all the information we need (which I believe is just the length of the string...).

I've in the past played with using a typeclass to abstract the storage, but didn't continue down this pathway after Duncan objected (and two Simon's agreed) on the grounds that we already have a way to somewhat abstract this through the fullRender function taking a function of type TextDetails -> a -> a. So even if String is inefficient, you can hopefully avoid this through lazy evaluation / streaming and using fullRender.

Typeclasses are obviously as you saw expensive since they push outwards to the clients of the pretty library and greatly complicate the interface.

TextDetails doesn't store the length of the string, but AnnotDetails does... So maybe the challenge is just providing a more efficient way for a client to construct an AnnotDetails than us calling O(n) length.

Approach 2: Single, efficient storage type

Can we keep a single concrete storage type but move to one that is more efficient than String? This is essentially what GHC does with FastString. Do users really want to control the storage type of pretty? I doubt it, especially as fullRender already provides a lot of flexibility.

So, what is the interface a fast string type needs to support for pretty to work well? And how do users of the pretty library construct these efficiently too? E.g., it's no good if we just push the performance problem one layer up! :)

adinapoli commented 7 years ago

Hey David, thanks for the great feedback(s)! Getting quite late over here so I beg you a pardon if the synapses won't work as they should. Let's start with the motivation, as you asked for it in my PR: The TL;DR is that I have embarked in an ambitious mission in trying to improve GHC compilation times. After fiddling a bit with a prof version of GHC, and running the stage-2 compiler with +RTS -p enabled, I have noticed how 10% of the time was spent emitting (or I should say, pretty printing) ASM code. One thing led to another and I have quickly learnt I'm not the first trying to unify the two "pretty", GHC internal copy and this package. Ben suggested (in that ticket I have linked above) that one blocker is the String-based interface, so this explains the context behind this ticket.

I think you raised very good points, so I'm going to split this reply in 2 sections; one comments benchmarking (and the need for it?), the other the two approaches you describe.

On benchmarking

Summary: I would like to create benchmarks, too, provided they can be faithful to the real world, and as close as possible in replicating what GHC does. I have no idea how to gather such examples.

I think by now it's cropping up quite the legitimate question "Why is String slow?" and "Why do we need to improve this?". The short answer is that I'm simply assuming (perhaps erroneously) that the benchmarking was done (or at least some kind of analysis) by people way smarter than me and that worked on GHC for a long time. I'm definitely on board in writing criterion benches for pretty, although my concern is to how replicate an effective use case (or ideally more than one). The risk of creating synthetic examples which are nowhere near the real world usage looks very high, and I'm asking for the help of the community here. I have seen around repos with test cases for the different pretty-printing libraries, but I have no idea how faithful they are, especially in terms of what GHC does, which is the main reason I'm doing this work. In general, just from a computational complexity point of view, the bottleneck being the O(n) length makes complete sense, although I have no evidence to prove it.

On the different abstraction approaches

Both have advantages and disadvantages (everything is a trade-off, right? hehe).