haskell / text

Haskell library for space- and time-efficient operations over Unicode text.
http://hackage.haskell.org/package/text
BSD 2-Clause "Simplified" License
406 stars 158 forks source link

Add tshow function #183

Closed mightybyte closed 1 month ago

mightybyte commented 7 years ago

In multiple different projects that I have worked on I've encountered variations on the following function:

tshow :: Show a => a -> Text
tshow = T.pack . show

It would be great if the text package could add this to eliminate the redundancy and confusion.

RyanGlScott commented 7 years ago

A variant of this has been proposed before (and rejected) in https://github.com/bos/text/issues/106.

mightybyte commented 7 years ago

That's actually a different thing. The text-show package gives you a standalone type class specifically for the purpose of converting to a Builder. I'm specifically talking about the existing Show class. That package also has more dependencies than text, and is therefore not an acceptable answer for what I want.

ryantrinkle commented 7 years ago

Perhaps it would make sense as Data.Text.show? In the spirit of many other functions that are defined in Data.Text and generally imported qualified (e.g. null, map, reverse, etc.), this would be a direct analogy of Prelude.show.

mightybyte commented 7 years ago

I don't have much of an opinion what the name is. I'm just interested in squashing the unnecessary proliferation and fragmentation of pointless reimplementations of this function.

alexeyzab commented 7 years ago

Hi there! Is this issue still relevant? If so, would any of the maintainers be interested in providing some mentorship/general help on resolving this so that the newcomers and first-time contributors would have an easier time?

I would like to add this as one of the issues for the upcoming Haskell Weekly's Call for Participation section. See discussion in #75.

These are the guidelines we'd like to stick to in the future:

Thank you!

mightybyte commented 7 years ago

This is definitely still relevant, and it's a very easy issue to fix. The main question at this point is maintainer buy-in.

taktoa commented 7 years ago

Personally, I've written this function upwards of 20 times. +1

codygman commented 7 years ago

I'd prefer the minimum possible code for a performant text show be added to text and to implement this function with that.

On Jul 13, 2017 11:59 AM, "Remy Goldschmidt" notifications@github.com wrote:

Personally, I've written this function upwards of 20 times. +1

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/bos/text/issues/183#issuecomment-315139223, or mute the thread https://github.com/notifications/unsubscribe-auth/AANyNy8i4HiUZMy0dTDc5OvTSlRbINXfks5sNkzVgaJpZM4NmxR5 .

ryantrinkle commented 7 years ago

@codygman That makes sense, although I think there might actually be two different things people are talking about here:

  1. An alias for T.pack . show, just because it's super annoying to type that out all the time.
  2. A more performant way of producing Text representations of datastructures

To me, these both seem very useful, and possibly quite different. For (2), I'm not sure what kinds of implementations people have in mind, but it seems like it might be a lot more involved than (1). In particular, it's not clear to me whether or how we can enforce that (2)'s instances agree with "classic" Show instances.

If it makes sense to others here, I would personally really like to see a solution to (1) happen independently of (and sooner than) a solution to (2). However, I can't say I've thought about the matter deeply, and I certainly wouldn't want something introduced that makes things tougher down the line.

RyanGlScott commented 7 years ago

Indeed, I'd be fine with fixing (1) in isolation.

As for (2), there are a number of solutions currently available. There's @bos's own text-format, and there's also my own text-show library, which aims to be character-for-character compliant with Show for Strings.

mightybyte commented 7 years ago

This issue is specifically asking for (1), not (2).

RyanGlScott commented 7 years ago

Sure, I'm not proposing that we implement (2) in text. (I only mention it since some folks here appear to want (2), or are at least waggling their eyebrows furtively in that general direction.)

codygman commented 7 years ago

2 would address 1 wouldn't it? I think that adding the a show based version goes against the performant nature of the text library.

I envision hundreds of future newbie's recounting experiences along the lines of:

"I tried the string library and it was slow, I tried text (using Text.show) and it was slow. Haskell is slow"

This can be avoided by only providing 2 without any loss in functionality.

On Jul 13, 2017 3:52 PM, "Ryan Scott" notifications@github.com wrote:

Sure, I'm not proposing that we implement (2) in text. (I only mention it since some folks here appear to want (2), or are at least waggling their eyebrows furtively in that general direction.)

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/bos/text/issues/183#issuecomment-315198793, or mute the thread https://github.com/notifications/unsubscribe-auth/AANyN66Jpd1a5dCHNeU9DMYhHEVAKqGMks5sNoOQgaJpZM4NmxR5 .

mightybyte commented 7 years ago

@codygman No, 2 does not address 1. Sometimes you actually want the existing Show type class. If we could rewind the clock and make Show render to Text instead of String, then 2 could address 1. But we can't, and therefore there are times when you specifically want Show because that is what is used by other code. It also should not be called show because that conflicts with the function exported by Show. It should be called tshow or something similar.

eyeinsky commented 7 years ago

I'd like this too! I've had tshow and tlshow available in most code I've written. Since these functions would be going into Data.Text/Data.Text.Lazy modules directly then I'd call them both show -- there's plenty of name overlap already, so people are already accounting for that.

@godygman I don't think hurt to the reputation about being slow will be an issue since due to the same overlap issue newbies need to do some work to actually use these functions, either by hiding show from prelude or explicitly qualifying it. In both cases they would probably encounter the documentation which explains the implications.

bos commented 7 years ago

Sorry, folks, I'm not going to take this. I know this is a widely used function, but saving 5 or 6 keystrokes isn't worth expanding an already-big API.

mightybyte commented 7 years ago

Hey Bryan, it's really about a lot more than 5 or 6 keystrokes. It's about uniformity and standardization. In different places and projects I've seen it called tshow, showt, and present. This presents a much higher cognitive overhead for everyone because people have to remember which variant is being used in the context they're currently working in. On top of this, people have to figure out where to put the function, as it doesn't have an obvious home in those contexts. The payoff / effort ratio is really high here.

kindaro commented 6 years ago

I revisited this ticket a few times, and however I try to persuade myself in the argument of @bos, my opinion remains unwavered. I must respectfully express my disagreement. I am also going to support it by mathematics.

The strategic consideration that I do not see mentioned before is that Text is supposed to supersede String in most regards. It already offers a replacement for many Stringy Prelude functions, such as readFile. There is no reason to make Show an exception, and, while it is infeasible to change the type of Prelude.show at the moment, there is nothing to fetter the provision of a drop-in replacement that is automagically consistent with the usual show.

In JavaScript, there would already be a widely used single function package on npm solving this problem. Haskell does not have this micropackage culture, so the maintainer is the king. This one time, the king's judgement is unsound.

kindaro commented 1 year ago

It is Monday, 3 of October 2022. I am writing this function again.

@Bodigrim please revisit this situation.

Ideally maybe we can also add tread ∷ Read α ⇒ Text → Maybe α, such that tread ∘ tshow = Just for standard types.

If we cannot even add this small function, then how can we ever hope to replace String with Text in base?

Bodigrim commented 1 year ago

Fundamentally I'd like tshow = Data.Text.pack . Prelude.show to be clear about its performance, which is even worse than Prelude.show itself, because an entire String must be allocated to calculate its length, while Prelude.show can at least in theory fuse.

I have a question to the proponents of tshow. Why isn't text-show satisfactory enough for you?

phadej commented 1 year ago

I definitely had written code like toUrlPiece (UserId i) = tshow i, where i :: Int and toUrlPiece @UserId :: UserId -> Text, in fact http-api-data itself provides showTextData which by default uses Show. (I probably should change that to always use text-show, but I'm not in the mood doing major releases atm).

TL;DR existence of text-show is hard to remember (or know!) and such a nuisance to use if I only want to convert an Int to a Text.

EDIT: also Show is easy to derive for about anything and is derived for most types in other libraries, text-show is barely used. So if say a logger takes Text and you want just put something there, tshow is very tempting.

Bodigrim commented 1 year ago

Converting Int to Text is stupidly difficult indeed, I'm absolutely open to offering foo = Data.Text.Lazy.toStrict . Data.Text.Internal.Builder.toLazyText . Data.Text.Lazy.Builder.Int.decimal from Data.Text, as this often involves importing three otherwise redundant modules.

Wrt discoverability, we can certainly advertise text-show directly from class Show haddocks in base.

Bodigrim commented 1 year ago

Let me put it this way: the right thing to do is to use text-show. I'm worried that offering tshow from text will quash any incentive to do the right thing.

codygman commented 1 year ago

I think that a somewhat loud or at least hard to ignore message discouraging users from using Show and to default to text-show would strike a good balance.

phadej commented 1 year ago

@codygman ... Show isn't bad. It's bad if the end goal is to convert to Text, but that's not the only use-case for Show.

mightybyte commented 1 year ago

Let me put it this way: the right thing to do is to use text-show. I'm worried that offering tshow from text will quash any incentive to do the right thing.

I strongly disagree. You cannot impose a one-size-fits-all definition of right in this situation. They're two different type classes and they have two different purposes. As I mentioned above, if I'm using Show (and therefore thinking about tshow), I almost never want text-show. I want Show and precisely Show because there's a whole ecosystem of reasons to use Show. If I want something more suited to performance or whatever my purposes are, I usually make something specific to my needs rather than bringing in a new dep like text-show (which is also not all that light). If you feel that strongly about the performance ramifications of tshow just slap a big fat haddock warning and be done. I just want to stop rewriting this function every time I start a new Haskell project.

Bodigrim commented 1 year ago

They're two different type classes and they have two different purposes.

Could you please elaborate? TextShow is char-to-char compatible to Show.

mightybyte commented 1 year ago

In the vast majority of situations I don't care about char-to-char compatibility or performance. I care about avoiding an additional (heavy) dep way more.

Bodigrim commented 1 year ago

The dependency footprint for text-show is less than, say, for aeson. Which of dependencies are the most problematic from your perspective?

mightybyte commented 1 year ago

All of them. You shouldn't have to incur a single extra dependency just to get a composition of two functions you're already depending on. It would still be unacceptable if its only dependency was base.

Bodigrim commented 1 year ago

You shouldn't have to incur a single extra dependency just to get a composition of two functions you're already depending on.

And you don't have to; use ..

This is not an argument: clearly you are not asking for any composition to be included.

emilypi commented 1 year ago

I think the thrust of @mightybyte's frustration here is that tshow as described above and by others is useful enough to be ubiquitous in everyone's codebases for a variety of reasons that we don't care to police with something like text-show, which is useful for other reasons. Certainly in my use cases at least, we use tshow to quickly get to a Text value, when we know the Show instance is what we want, just as text. We don't care about the performance aspects of this conversion because the asymptotics are irrelevant: we are always dominated by IO, and the values are never very big. In these cases, I'd rather not incur the burden of another maintainer for yet another package from which I'll use perhaps 1-2 lines of code. It's very clear we're all writing the same code with the same thoughts in mind.

I propose the following:

  1. Add tshow with a caveat that it's a very "dumb" conversion
  2. Pipe people towards using text-show for performant text/string conversions in the cases where they need it.

As long as it's very clear what the morally blessed-way of doing things is, I don't think it will defeat any incentives to do the right thing. I think we're just diverging along the lines of what we find valuable. It's very clear that it's valuable simply in the number of times people have written the function independently and continued to want/ask/need it.

codygman commented 1 year ago

@codygman ... Show isn't bad. It's bad if the end goal is to convert to Text, but that's not the only use-case for Show.

In practice for me this has almost always been the case to be honest.

I want Show and precisely Show because there's a whole ecosystem of reasons to use Show

I agree with this and do sometimes use Show to one-off serialize/deserialize something while debugging. Show instances are used in hspec, quickcheck, etc.

My main annoyance is the conversion busywork created by Show being string in text-by-default codebases combined with a desire for Haskell base to move to text by default. It's precisely because of conventions and ecosystem that make that move hard.

I can personally avoid tangled webs of String -> Text conversion because I've internalized rules to avoid them, but there can be interesting puzzles of needless conversions chained with conversions you come across in the wild. I'd like to prevent those.

With that said, I think @emilypi's suggestion fixes the problem of people writing their own tshow and especially combined with nudging users towards text-show moves the needle in the direction of solving the problems I've outlined.

emilypi commented 1 year ago

Thanks for the support! Alternatively, if someone wants to raise the proposal to the CLC, the more general form of this is

sshow :: (Show a, IsString b) => a -> b
sshow = fromString . show

Which would have a more appropriate home in Data.String along with the IsString class. I would vote to approve such an addition, since the cascading benefits are immediate and code duplication is minimized.

eyeinsky commented 1 year ago

I was going to say that rather than naming it tshow to name it directly show instead. Although sounds like a bad idea initially then consider that Data.Text already massively clashes with functions on lists, Char and containers, and is thus always imported qualified, so getting to use T.show instead of T.tshow looks better IMO.

But now, the sshow above I like much more! Perhaps we could bikeshed the name, but having that implemented in Data.String would pretty much solve the issue.

kindaro commented 1 year ago

The problem with sshow is that it is polymorphic. I like me a monomorphic function, because it makes type inference easier. This is important: when working with fancy libraries like servant or opaleye, one needs to care about type inference!

I should like to have Text.show instead of Text.tshow too (indeed I always import Data.Text qualified), but I expect that conservatives will push against Text.show twice harder… I can live with Text.tshow, and this is a case where a sparrow in your hand is better than a crane up in the sky. If there will be a vote, my voice is for Text.show without the t.

Bodigrim commented 1 year ago

I don't see much technical merit, but I'm not fundamentally opposed and it seems there is a social merit in resolving this. However, could someone lead a community-wide discussion of a proper name (tshow, showt, or just T.show) and place (I'm sympathetic to Emily's idea to put sshow or something similar in base)?

L0neGamer commented 1 month ago

I've conducted a poll on the discourse, which can be seen here, and which I'll close soon: https://discourse.haskell.org/t/vote-on-naming-of-show-a-a-text-function-in-data-text/10106.

The poll allows for multiple choices so the below percentages don't quite add up to 100.

At time of typing, T.show has 75% of the vote, followed by tshow at 25%. showt has 11%, and only one person has suggested something other than these options (that being showText), and I assume their singular vote is "other". These results are from 62 voters and 72 votes.

I did not discuss the location of this function, but I can conduct another poll if wanted.

Bodigrim commented 1 month ago

Thanks @L0neGamer!

I did not discuss the location of this function, but I can conduct another poll if wanted.

Do you envisage any options other than Data.Text / Data.Text.Lazy?

L0neGamer commented 1 month ago

For the just Text variants I can't imagine anywhere else would make sense (unless someone were to take more drastic measures and bring Text into base). Especially since the vote seems to be for qualified variants, it makes a lot of sense to have one in each of Data.Text and Data.Text.Lazy.

Bodigrim commented 1 month ago

I'm content to follow vox populi and add Data.Text.show / Data.Text.Lazy.show. @Lysxia what do you think?

maxigit commented 1 month ago

Is it not a bit strange to shadow the method of a class you expect people to import ? You can't do import Data.Text anymore without having name shadowing issue.

People who want to use Prelude.show now have to do

import Data.Text hiding(show)

People who want to use Data.Text.show have to do either

import Prelude hiding(show)
import Data.Text

or to use T.show (vox populis)

import qualified Data.Text  as T
import Data.Text hiding (show)

Is it really what people voted for ????

Bodigrim commented 1 month ago

You are not supposed to import Data.Text unqualified, it will clash with a myriad of other functions and class methods from Prelude.

L0neGamer commented 1 month ago

See: head, last, tail, init, length, null, map, reverse, all the fold functions, concats, any, all, maximum, minimum, and the list goes on.

Over two days over 60 people replied to the poll, and 75% of them were in favour of the qualified name approach.

eyeinsky commented 1 month ago

Is it not a bit strange to shadow the method of a class you expect people to import ? You can't do import Data.Text anymore without having name shadowing issue.

People who want to use Prelude.show now have to do

import Data.Text hiding(show)

People who want to use Data.Text.show have to do either

import Prelude hiding(show)
import Data.Text

or to use T.show (vox populis)

import qualified Data.Text  as T
import Data.Text hiding (show)

Is it really what people voted for ????

This is all discussed above. You can't really import Data.Text unqualified, an qualified, show is much nicer name than tshow etc.

maxigit commented 1 month ago

@Bodigrim

You are not supposed to import Data.Text unqualified, it will clash with a myriad of other functions and class methods from Prelude.

Fair enough, but lots of people (including me) don't or only partially and use Text, pack, instead of T.Text, T.pack etc ... My argument falls indeed apart as I am alreadyg doing import Data.Text (Text,pack,unpack) ... so I don't need to hide show.

@eyeinsky

show is much nicer name than tshow

True, but as pointed you can't really import Data.Tex unqualified which mean you are not using show but T.show. I find tshow nicer than T.show (which look to me like a constructor and is longer to type).

@L0neGamer

Over two days over 60 people replied to the poll

Over 1 month over 600 people (discourse, reddit ...) would have had more weight in my opinion. Two days especially during summer holidays means that a few people didn't have the chance to even see the poll before it was closed. It seems a bit rushed to me, especially considering that the pull request has been opend 7 years ago ...

L0neGamer commented 1 month ago

My apologies, I left it open for a couple of days, and then let bodigrim know what the current results were. Things moved quickly after that, and I only closed the poll after the additions were merged. If I do this in future I'll try to leave it open for longer and be clear when reporting results that we're waiting for further feedback.

Bodigrim commented 1 month ago

@maxigit there was no release yet, so in principle if you bring up convincing arguments the decision can be re-evaluated. E. g., you are welcome to conduct a poll on another platform over a longer period and see if it shows drastically different results. Or you can measure user habits with regards to qualified / unqualified imports, if this is what advances your argument.

maxigit commented 1 month ago

@Bodigrim I have a look on Hackage Search and find out that indeed Text is overwelmingly imported qualified, so another poll would probably give similar result. However that doesn't mean that the majority should make the life of the minority miserable.

Even though Text should be import qualified, there is about 373 packages on hackage who don't (search for 'import Data.Text$'), not many I agree but they might now be forced to hide show which wasn't a problem before. 2273 who import some bits (search 'import Data.Text ('), most of it being some variation of import Data.Text (Text) and import Data.Text(Text, pack, unpack). The 2273 include some widely use packages as aeson, cassava, conduit, haddock, pandoc, persistent etc ... 812 packages import pack unqualified (Hackage Search: import Data.Text.*pack, I am pretty sure that most of them use somewhere pack . show and would be happy to import tshow or equivalent. They won't be able to import show in the same way.

Indeed, most people import Data.Text qualified but maybe not by choice. Adding show will encourage (strongarm?) more people to do so. You can see it as a good or bad thing.

Another alternative, as outrageous as it looks, miight be to add both show (T.show) and tshow a few people might still need to hide show somewhere, people would be able to use T.show and other (like) to import and use tshow.

Ultimately, I respect you and trust your judgement. I put my case forward and let you make an informed decision.