tmhedberg / here

Haskell here docs & interpolated strings via quasiquotation
BSD 3-Clause "New" or "Revised" License
57 stars 14 forks source link

Text interpolates with double quotes #16

Open 3noch opened 8 years ago

3noch commented 8 years ago
> text = "Test" :: Text
> [i|${text}|]
"\"text\""
3noch commented 8 years ago

I'm not sure of a good way to solve this without orphan instances or depending on text... Maybe it's worth depending on text since it might as well be in base after all. ;)

3noch commented 8 years ago

You could possibly take advantage of the fact that Text is an instance of IsString to determine it doesn't need to be quoted.

tmhedberg commented 8 years ago

This is because the type of your quotation is being defaulted to String, so the contained Text must be converted to String in order to be inserted into the surrounding, albeit empty, String value. This conversion is performed by calling show on the interpolated Text, and the Show instance for Text wraps its result in quotation marks.

If you want to interpolate Text values without quotes, you must ensure that the quoted expression as a whole has type Text as well.

> let t = "text" :: Text
> [i|${t}|] :: String
"\"text\""
> [i|${t}|] :: Text
"text"
3noch commented 8 years ago

Hmm...that's a great explanation. Yet I still wonder if this is a good thing. It's extremely surprising behavior to find that your Text values have quotes sometimes but not others!

tmhedberg commented 8 years ago

I must respectfully disagree. The current behavior is, to the best of my knowledge, the only sensible way to handle this situation. :) I could be wrong, and I welcome constructive criticism, so here's my explanation in case you care to read it.

Consider the problem we're facing: we wish users to be able to interpolate arbitrary values of type a into a surrounding context of type IsString b => b. As stated, that's actually impossible, because a could be something like a function type, which has no string-like representation, so what we really mean is that we want to interpolate values of type Show a => a into a surrounding context of type IsString b => b. Show is the class of all types which can be rendered to a string, which are the only types it makes sense to use in this context.

In the end, we need to produce a final value of type IsString b => b which includes the stringified versions of all interpolated values. This means that, unavoidably, we must convert each interpolated Show a => a value into an IsString b => b so that it can be combined with the rest of the quoted expression.

So we need a conversion function:

convert :: (Show a, IsString b) => a -> b

Given the constraints, there is only one way to write this function which will stand up to the type checker's scrutiny, and that is

convert = fromString . show

The only special case is when the interpolated value and the context into which it is being inserted both have the same type, e.g. they are both Text, or both String. We can detect this situation by adding an additional Typeable constraint, which I omitted above for the sake of simplicity. In this case, no conversion is necessary; the values can just be combined as-is. Splicing Text into Text or String into String doesn't require conversion, so it doesn't take a round trip through show. But this is truly the only special case; the desire to be as flexible as possible about the types involved, in order to make the quasiquoter usable in as many contexts as possible, means we can't make any other assumptions about what types we'll be given, or how their Show instances might mangle them during the conversion.

In theory, I could include in this library a class such as the following (using DefaultSignatures):

class Interpolate a where
  asString :: a -> String
  default asString :: Show a => a -> String
  asString = show

This is essentially a custom Show class, allowing users to define their own conversions, which could omit the quotes when converting Text values, etc. But this would require an Interpolate instance to be declared for every type that users want to use, and there's no way I could provide an instance for every imaginable type. This is a requirement which I don't really want to impose, though I'd reconsider that trade-off if a lot of people expressed a differing opinion.

The bottom line is that you must interpolate types into like types if you want to avoid the show conversion, and this requirement follows logically and unavoidably from the constraints of the problem and my desire to avoid requiring custom type classes. I consider this to be working as intended and as desired. After all, if you didn't wish to be so rigorous about types, you probably wouldn't be using Haskell in the first place. :)

In the specific example you raised, if you want to interpolate the Text into a String, without the quotes, you just need to do a little conversion yourself, either [i|${unpack text}|] or unpack [i|${text}|], either of which will give you a quote-free String result.

3noch commented 8 years ago

Thanks for the phenomenal explanation! Here are my 2 cents.

While you're absolutely right about how types line up, there is yet this lingering "annoyance" with the fact that the result of intermingling basic string types means your result will be hard to detect. Let's assume OverloadedStrings and say I write the naive (and very readable) expression

[i|Hello there ${name}, your account is now ${state}.|]

where name :: String and state :: Text. If the context of this expression demands String then the result will be Hello there William, your account is now "ready".. But if the context demands Text then it will be Hello there "William", your account is now ready.. I would argue that neither of those is truly what the author intended. But what's even worse is that it's not at all obvious from the context which of those two it will be. It would be significantly exacerbated if the immediate context actually only demanded IsString a => a, such that the value of this expression could freely polymorph into either of those possibilities based on some distant constraint. Or even worse, it ends up being both in different places in your code base because different use cases demanded different constraints.

What I'm talking about is not even a theoretical problem. It's already a well-known infelicity in Haskell codebases that you often have at least two string types floating around your code (not including ByteString and all the lazy variations). My code base has a logging system which is polymorphic on the string type, so knowing how the log message will display is hard to pin down. Also, you often use Text with databases and such, but String with Either String Value error handling.

Because of this current reality (which no-one enjoys, and why we all love Mr. Yang's recent Backback work!), I find the i quasiquoter to be far too unpredictable for practical use. I already had bug in my SQL because I didn't realize it was putting double quotes somewhere.

Now, the solution you've proposed with the Interpolate class seems very sensible. In fact, other interpolation quasiquoters do that as well. (Why I'm not using them is another topic.)

But this would require an Interpolate instance to be declared for every type that users want to use, and there's no way I could provide an instance for every imaginable type.

But why would something like this not be sufficient?

instance Interpolate Text where
  asString = T.unpack

instance Interpolate LazyText where
  asString = LazyText.unpack

instance Interpolate String where
  asString = id

-- this might be a bad idea, but ByteString has a Show instance so this isn't too crazy.
instance Interpolate ByteString where
  asString = T.unpack . T.decodeUtf8

instance Show a => Interpolate a where
  asString = show
tmhedberg commented 8 years ago

I see what you're saying, and it's a pragmatic viewpoint. I want to argue that if you're regularly mixing Text and String throughout a code base, you're in for a world of pain anyway, but realistically, it's often unavoidable, and I acknowledge that.

I can't change the existing behavior of the library without breaking backwards compatibility, but I could introduce separate modules which would provide an alternative to the existing functionality via the proposed Interpolate class.

With regards to your suggested instances, the first 3 look fine. The ByteString instance I wouldn't include because it necessarily assumes a particular character encoding. The final instance overlaps with all of the others, and I don't want to require OverlappingInstances. But the default implementation in my proposed class declaration obviates that, anyway. If you want to just use the Show behavior for some type Foo, you can just write

instance Interpolate Foo

without any methods.

The main downside, as I mentioned, is that this requires writing the above (brief) instance for every type you want to interpolate, even if you don't want to override the default behavior.

Another option, which some libraries employ in this situation, is to use a newtype:

newtype InterpolateShow a = InterpolateShow a

instance Show a => Interpolate (InterpolateShow a) where
  asString (InterpolateShow a) = show a

This would mean you don't have to write an instance for every type, nor do you have to use OverlappingInstances, but it does mean you have to wrap everything in the newtype constructor, which is probably even more onerous.

Any thoughts?

3noch commented 8 years ago

I want to think more, but what about something really crazy like:

import Data.Type.Equality
-- instances for Text, String, etc.
instance (Show a, a == Text ~ False, a == String ~ False) => InterpolateShow a where
  asString = show
tmhedberg commented 8 years ago

Does that actually avoid overlap with other instances? Instance resolution is done without respect to the context, so no matter what constraints you add there, it still says InterpolateShow a, which will overlap with any other, more specific instance of the class that you define.

3noch commented 8 years ago

Oh snap you're right. I forgot about that. Perhaps we could make a type family that could group things between needing Show or not?

tmhedberg commented 8 years ago

Can you give an example of what you have in mind?

3noch commented 7 years ago

Here are some cool techniques to avoid Overlapping instances: https://kseo.github.io/posts/2017-02-05-avoid-overlapping-instances-with-closed-type-families.html

domenkozar commented 6 years ago

Was just bitten by this - is there anything I could help with getting it forward?

3noch commented 6 years ago

I am personally of the opinion that this is a pragmatic-enough need to justify OverlappingInstances. The only alternative is to require someone to use orphan instances somewhere (either in this library, or in another). I'd take either TBH.

Oh, of course here could depend on Text and there would be no need for orphans but I think @tmhedberg has made it clear that won't happen. @domenkozar I'd be willing to use/make a fork of here that takes a more pragmatic approach.

tmhedberg commented 6 years ago

Personally, I continue to hold the opinion that this is working as intended. If you're mixing different string-like types throughout your code and not keeping track of what's what (e.g. with explicit type signatures), you're inevitably going to have problems that go beyond surprise double quotes. None of the problems discussed in this thread will happen if one pays attention to one's types and adds explicit conversions where necessary. But I do acknowledge that there's room for reasonable disagreement here.

I'd consider a patch to add the Interpolate class as discussed above, along with some instances for common types, and quasiquoters that use Interpolate instead of Show for converting to String. It would need to go into a separate module so that existing uses of the library would continue working as before without any changes.

I'm not interested in anything that requires overlapping or orphan instances, but I don't think either should be necessary to support common use cases. For common types, we can just provide an instance out of the box. For the long tail of uncommon types, we can provide a DefaultSignatures implementation specialized to use Show, as well as the newtype trick I mentioned for people who want to use this with someone else's type while avoiding defining an orphan instance. I think this would cover all of the bases, but I haven't tried it out, so I'm not completely sure.

3noch commented 6 years ago

@tmhedberg That would be great! I don't think anyone disagrees that the current behavior is "wrong". It's more that the current behavior exacerbates an already-wrong issue in that Text is not in base and there is no good class providing a -> String other than Show. And/or that String is ever used at all. (I usually encounter String because of other libraries, not my own code. And they tend to exist in error messages which is where I tend to use logging which is where I tend to use here.)

I'm only a very light user of here at the moment so I may not be able to get a PR up soon but if I end up using here in more earnest I'll definitely get that going. Unless of course anyone else beats me to it.

3noch commented 5 years ago

I had another idea: Introduce a new interpolation syntax that is not polymorphic. For example: [i|My name is $$name and I'm $age years old.|]

Or possibly provide a different quasiquoter that is never polymorphic and simply forces you to write show on your own.