haskell-boxes / boxes

A pretty-printing library for laying out text in two dimensions, using a simple box model.

Data.Text conversion #1

Open rustydc opened 12 years ago

rustydc commented 12 years ago

Hey Eelis,

I just did a pretty brute-force conversion of String to Data.Text, and when the typechecker stopped complaining at me it seemed to still work for my use cases.

I'm not sure if it makes sense to convert the whole library like I did, or add a Text interface, etc., but here's the code in case it's useful to you.

-Rusty

treeowl commented 9 years ago

This obviously won't work as is; it will break any code using the package. I do, however, think we should refactor things to allow boxes to be used with Text.

hdgarrood commented 8 years ago

@treeowl How would you feel about something like this?

data BoxOf a = Box { ... , content :: ContentOf a }
data ContentOf a = Blank | Text a | Row [BoxOf a] | ...

type Box = BoxOf String
type Content = ContentOf String

We'd need to figure something out about String specific functions, such as length, take, words, unwords. Perhaps you could pass in some dictionary and store it in the Box? Or use a type class?
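For illustration, the type-class option might look something like the sketch below. The class name `TextLike`, its `tl`-prefixed methods, and the `rows`/`cols` fields standing in for the elided record fields are all hypothetical; none of this is part of boxes.

```haskell
{-# LANGUAGE FlexibleInstances #-}
import qualified Data.Text as T

-- Hypothetical class collecting the String-specific operations named above.
class TextLike a where
  tlLength  :: a -> Int
  tlTake    :: Int -> a -> a
  tlWords   :: a -> [a]
  tlUnwords :: [a] -> a

instance TextLike String where
  tlLength  = length
  tlTake    = take
  tlWords   = words
  tlUnwords = unwords

instance TextLike T.Text where
  tlLength  = T.length
  tlTake    = T.take
  tlWords   = T.words
  tlUnwords = T.unwords

-- Simplified stand-ins for the proposed types (other constructors omitted;
-- the rows/cols fields are an assumption).
data ContentOf a = Blank | Text a | Row [BoxOf a]
data BoxOf a = Box { rows :: Int, cols :: Int, content :: ContentOf a }

-- | Build a one-line box; the width comes from the class method.
textBox :: TextLike a => a -> BoxOf a
textBox t = Box { rows = 1, cols = tlLength t, content = Text t }
```

The dictionary-passing alternative would store the same four functions in a record inside the Box instead of resolving them through an instance.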

treeowl commented 8 years ago

I thought about this a bit a while ago, but didn't come to any clear conclusion. The trouble is that the constraint(s) end up being a bit ... weird. I'm not (at all) ruling that option out, however. I do love me a good Functor. I'm also not at all convinced that Text is a sensible choice of base type here. If I could, I'd probably use Kmett's Rope, but under the circumstances I think that would be an unacceptable dependency (I don't think Agda depends on trifecta, and I don't want to be the one to make it). That leaves open the option of Seq Char, which seems a really good fit in some ways, but has lousy constant factors. I'd love some realistic benchmarks to help judge.

hdgarrood commented 8 years ago

Just from my perspective, I would really love it if you could put any type inside a Box, as this would allow us to unify the two different pretty-printers inside the PureScript compiler (sometimes we want arbitrary metadata attached to parts of strings inside a Box).

treeowl commented 8 years ago

@hdgarrood, that's a very reasonable request, and I'll be happy to make that change or something similar. I have also thought some extra metadata could be good (e.g., color and style). The big question is what types the box-manipulation functions should have. For some, generalizing to Monoid a will do the trick just fine. For others, it will not. Another point in the design space:

data ContentOf f a = Blank | Text (f a) | ...

This lets you use, e.g., Traversable f => ... but makes Text, ByteString, etc., more annoying to handle. Another option could be to add another layer for that:

newtype ContentOf1 f a = ContentOf (f a)

treeowl commented 8 years ago

I suppose actually that your point in the design space is probably better; if someone wants to work at the character level, they can just fmap or whatever twice.
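A rough sketch of what "fmap twice" could mean with the single-parameter design. The `rows`/`cols` fields, the reduced constructor list, and the helper `mapChars` are assumptions made for the example, not the library's definitions.

```haskell
{-# LANGUAGE DeriveFunctor #-}
import Data.Char (toUpper)

-- Simplified stand-ins for the proposed types (other constructors omitted).
data ContentOf a = Blank | Text a | Row [BoxOf a]
  deriving (Eq, Show, Functor)

data BoxOf a = Box { rows :: Int, cols :: Int, content :: ContentOf a }
  deriving (Eq, Show, Functor)

-- | Character-level map over a String-filled box: the outer fmap reaches
-- each String payload, the inner fmap reaches each Char in it.
mapChars :: (Char -> Char) -> BoxOf String -> BoxOf String
mapChars f = fmap (fmap f)

example :: BoxOf String
example = Box 1 5 (Row [Box 1 2 (Text "ab"), Box 1 3 (Text "cde")])

upper :: BoxOf String
upper = mapChars toUpper example
```

Note that a map like this leaves the stored dimensions alone, so it is only safe for functions that do not change the width of the payload.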

treeowl commented 8 years ago

On the other hand, it may make sense to use something other than lists to represent rows and columns.

hdgarrood commented 8 years ago

Is that just from a performance perspective, or something else?

treeowl commented 8 years ago

Oh, pure performance for the row/column representation. I suppose in practice O(n^2) is probably fine, but it's kind of ugly.
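One common way lists go quadratic is repeated appending at the end. Whether that is the exact cost meant here is an assumption; the sketch below only contrasts that pattern with `Data.Sequence`, and the function names are made up for the example.

```haskell
import Data.Foldable (foldl', toList)
import Data.Sequence (Seq, (|>))
import qualified Data.Sequence as Seq

-- | Append one element to the end of a list: O(n) per call, so building
-- a row of n children this way costs O(n^2) overall.
snocList :: [a] -> a -> [a]
snocList xs x = xs ++ [x]

buildList :: [a] -> [a]
buildList = foldl' snocList []

-- | The same construction with Seq: (|>) is O(1), so the build is O(n).
buildSeq :: [a] -> Seq a
buildSeq = foldl' (|>) Seq.empty

-- | Both builds produce the same elements in the same order.
sameResult :: Bool
sameResult = toList (buildSeq "abc") == buildList "abc"
```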

treeowl commented 8 years ago

Did you have any particular thoughts about how to generalize the functions from String? A one-off class certainly could work, but it's kind of ugly.

hdgarrood commented 8 years ago

Yeah, on second thoughts, I'm not very keen on that either. I might try to implement this and see what happens.

JAnthelme commented 8 years ago

Hi. I am currently using rustydc's code in order to output boxes containing Unicode characters as Text. Is there a plan to convert the official library to Text type outputs (instead of String type ones)?

treeowl commented 8 years ago

Not a plan, exactly, but you've asked at a good time. Some seemingly unrelated work I've been doing may point to a principled way to abstract over the text type. I'll have to take a look and see if it really does.

treeowl commented 8 years ago

No, I don't think so actually. The problem is that different string representations have entirely different performance characteristics. Have you tried just unpacking your text, using boxes to make a string, and packing the result back up?
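A minimal sketch of that unpack/pack round trip, assuming the `boxes` and `text` packages are installed. `text`, `render`, and `(<+>)` are the existing `Text.PrettyPrint.Boxes` functions; the wrapper names `textBox` and `renderText` are invented for the example.

```haskell
import qualified Data.Text as T
import Text.PrettyPrint.Boxes (Box, render, text, (<+>))

-- | Unpack a Text value into the String that boxes expects.
textBox :: T.Text -> Box
textBox = text . T.unpack

-- | Render with boxes as usual, then pack the resulting String back up.
renderText :: Box -> T.Text
renderText = T.pack . render

-- | Two Text cells laid out side by side, separated by one column of space.
twoCells :: T.Text
twoCells = renderText (textBox (T.pack "ab") <+> textBox (T.pack "cd"))
```

The cost is one pass over the input and one over the output, which is likely small next to the layout work itself.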

JAnthelme commented 8 years ago

Thanks. The issue is that I need to represent non-Roman characters (e.g. Chinese characters) inside the boxes, and with Text this becomes straightforward. With Strings I haven't really thought about it, but I will try your suggestion anyway.

treeowl commented 8 years ago

I don't think it's likely to be much different. The biggest challenge, I imagine, may be dealing with mismatches between the number of codepoints and the number of glyphs. Neither String nor Text will help with that, if you encounter it. Boxes makes the fundamental assumption of a fixed-width representation of each Haskell Char, which is to say each Unicode codepoint. While this assumption is likely wrong in many cases, getting rid of it is pretty much impossible without greatly generalizing the output format.
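To make the mismatch concrete, here is a small sketch. The helper `assumedWidth` is a made-up name for the one-column-per-Char assumption described above, and the remarks about on-screen width describe typical terminal behaviour rather than anything the code measures.

```haskell
import qualified Data.Text as T

-- 'e' followed by U+0301 COMBINING ACUTE ACCENT:
-- two codepoints, usually drawn as a single glyph.
combining :: String
combining = "e\x0301"

-- U+6F22 U+5B57: two codepoints, each usually drawn two columns wide
-- in a terminal.
cjk :: String
cjk = "\x6F22\x5B57"

-- | The width under a one-column-per-Char assumption.
assumedWidth :: String -> Int
assumedWidth = length

-- | Text counts codepoints the same way, so switching types does not help.
assumedWidthText :: T.Text -> Int
assumedWidthText = T.length
```

Both strings report a width of 2 here, although the first would usually occupy one column on screen and the second four.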

wavewave commented 6 years ago

I've updated @rustydc's old branch to the up-to-date version here: https://github.com/wavewave/boxes/tree/data-text