goldfirere / singletons

Fake dependent types in Haskell using singletons
286 stars 36 forks source link

Improve documentation of Data.Singletons #599

Open tomjaguarpaw opened 4 months ago

tomjaguarpaw commented 4 months ago

Currently I would say that singletons is daunting. This should not be! The underlying principle is straightforward. It took me a long time to understand it, but now that I understand it, it's simple 😂 Now, I did have to pretty much reinvent it before I could understand it. I lay the blame at the feet of the documentation.

There are substantial changes that could make the documentation more approachable. This PR is a small effort in that direction. It contains two minor Haddock fixes, and a pervasive change which I hope will improve the situation a lot:

Don't think of Sing/SingI as indexed by a type of a given kind, think of it as indexed by a value of a given type.

This PR changes the variable k to t in many places to aid this change in thinking.

I really don't think it's helpful to look at the below case statement and interpret b as a variable standing for something of kind Bool, either type True or type False. That may or may not be technically correct (I don't know) but I think it is far more helpful for the intuition if I think of it as something of type Bool, either value True or value False, just used at the type level.

case sing @b of
  STrue -> ...
  SFalse -> ...

I think this way of thinking, encouraged by the notation in this PR, would go a long way to resolving the kinds of difficulties we see in, e.g. https://github.com/goldfirere/singletons/issues/260#issuecomment-1353244384

What do you think? I have a number of other suggestions for documentation improvements if you like this one.

RyanGlScott commented 4 months ago

Thanks for the PR!

Don't think of Sing/SingI as indexed by a type of a given kind, think of it as indexed by a value of a given type.

Where does this quote come from, out of curiosity? Speaking for myself, I only partially agree with it. Sing is absolutely indexed by a type of a given kind in the sense that when you define a Sing instance, you define it for a type, not a value of the type:

type instance Sing @Bool = SBool

That is, you never define a Sing instance for True or False, only for Bool. I suppose you could make the argument that when you write forall (x :: Bool). Sing x -> ..., then Sing is indexed by a value of type Bool (x), but one could just as well make the argument that it's also indexed by Bool, since that type is shorthand for forall (x :: Bool). Sing @Bool x -> ....

On the other hand, the conventions surrounding SingI are a bit different from the conventions for Sing. This is because you don't define a catch-all SingI instance for Bool, but rather individual instances for the values of type Bool:

instance SingI True
instance SingI False

That being said, I don't think it's accurate to characterize this difference as a matter of indexing (the kinds for both Sing and SingI begin with forall k. k -> ..., so one can claim that they're both indexed by a type and a value), but rather a difference in how they're meant to be used.

This point is a bit unrelated to the choice of whether to use k or t as the default variable name throughout the documentation. I don't know if I feel too strongly about k versus t (the use of k in the docs isn't really an intentional choice), but I don't think the choice of t does much to clarify the intended conventions for SingI and Sing. My feeling is that that will only become clear if you write them out in the documentation.

Relatedly, I'm curious to know what these "substantial changes" to the documentation you have in mind are. If I have a better sense of the direction of travel you want to take, it might help me contextualize some of these changes better.


Hair-splitting about indexing aside, the Correct Haddock link: link to Sing pattern, not Sing type family and Fix Haddock commits are clear improvements. If nothing else, we should merge these.

tomjaguarpaw commented 4 months ago

Where does this quote come from, out of curiosity?

Oh, it's not a quote. I Just wanted to highlight what I consider as the "main principle" underlying how I think of this library.

Hair-splitting about indexing aside, the Correct Haddock link: link to Sing pattern, not Sing type family and Fix Haddock commits are clear improvements. If nothing else, we should merge these.

OK, I submitted them separately here: https://github.com/goldfirere/singletons/pull/600

tomjaguarpaw commented 4 months ago

I only partially agree with it. Sing is absolutely indexed by a type of a given kind

Yes, OK, there are invisible type applications floating around, which muddy the water, and in some sense the argument I'm talking about isn't an argument of Sing at all, it's an argument to the result of Sing. Let me try approaching from a different direction.


When I see Sing var/SingI var/Sing True/SingI True I don't think of var and True as respectively unknown and known types. I think of them as unknown and known values. In fact, I think the latter interpretation is essential to onboard onto this library quickly. The former interpretation is also valid, of course (perhaps even more correct, technically -- I don't know) but I think that the associated intuition must only be available to those who are already experts with the library or already have some understanding of circuitous route it is taking in order to encode (a form of) dependent types in Haskell.

If we accept v :: u and True :: Bool as values then u and Bool there are types, not kinds. Therefore it would be clearer to use t to stand in for them than it would be to use k. That is, whereas currently we have

type family Sing @k :: k -> Type
type SingI :: forall {k}. k -> Constraint

it would be clearer to have

type family Sing @t :: t -> Type
type SingI :: forall {t}. t -> Constraint

I don't think it's ever useful to think of the (visible) parameter of Sing or SingI as a type of a particular kind, only as a value of a particular type. I'm open to persuasion that it is useful for experts, in which case I'd weaken my claim: I don't think it's ever useful for new users to think of the (visible) parameter as a type. To summarize my argument:

  1. We want to help new users onboard onto this library
  2. New users will develop intuition quicker if they think of v/Bool in SingI v/SingI True as values
  3. Accordingly, in SingI (v :: u)/SingI (True :: Bool), they should think of u and Bool as types
  4. When we're being generic over such things we should use a variable name that indicates they are types, t (in particular, not k)

I'm curious to know what these "substantial changes" to the documentation you have in mind are

It depends a great deal on the outcome of discussions like this one on earlier changes, but the general principle would be to relatively deprioritize the documentation of less-used pieces. I would say there is a some sort of exponential response to confusing documentation, so documentation that is "one unit" more difficult to understand is ten times more likely to make the reader close the tab, "two units" more difficult one hundred times more likely, etc.. Thus I would move the documentation of SLamba, SingI1 and SingI2 below Sing, SingI and SingKind, and instead give some concrete examples of usage of the latter three.

tomjaguarpaw commented 4 months ago

Oh, and if you want to know what I think very clear documentation looks like for this sort of thing, I refer you to my "reinvention" of singletons. However, that presentation also benefits from being able to freely choose type and class names, which singletons doesn't have the liberty to.

RyanGlScott commented 4 months ago

OK, I think we are largely in agreement then. If you have Sing @t var, then I have no qualms with referring to var as a value and t as a type. As you've noted, an expert would notice that it is also equally valid to refer to var as a type and t as a kind due to the encoding being used, but this observation requires more intimate knowledge about advanced language extensions. I haven't thought deeply about whether one view or the other is more intuitive to a newcomer, but I am certainly willing to believe that the former can be less daunting. (It's difficult for me to make such judgments myself, as I am afflicted with the curse of knowledge.)

In that case, I would welcome this PR as a step in the direction of making the documentation more accessible to newcomers. Perhaps we could even have an "expert section" of the Haddocks that talks more about the details of singletons' encoding for those who want to know more about that specific aspect.

I will note that some of the existing naming conventions reflect an "expert" understanding of how singletons' encoding works, such as SingKind, KindOf, and SameKind. I'm not sure if you have a plan for how to reconcile this—if so, I'd be interested to hear it.

Thus I would move the documentation of SLamba, SingI1 and SingI2 below Sing, SingI and SingKind, and instead give some concrete examples of usage of the latter three.

That sounds entirely reasonable to me. The current order is not an intentional choice by any means, and I would welcome efforts to better organize the Haddocks.

tomjaguarpaw commented 4 months ago

I will note that some of the existing naming conventions reflect an "expert" understanding of how singletons' encoding works, such as SingKind, KindOf, and SameKind. I'm not sure if you have a plan for how to reconcile this—if so, I'd be interested to hear it.

Yes, that's rather awkward. I don't have a good idea about what to do about that! If it was my library I think I would start a slow transition process to rename them, but I realise that's a hard sell.

tomjaguarpaw commented 4 months ago

Oh, and I think it's instructive to note that my version of KindOf is called TypeOf and I'm pretty sure I gave it that name before I saw singleton's KindOf.

RyanGlScott commented 4 months ago

KindOf and SameKind are just type synonyms, so we could just rename them to TypeOf and SameType, making KindOf and SameKind aliases for those who really want the old names. It also helps that most users don't use these type synonyms directly—they mostly act in service of other API functions (e.g., demote).

SingKind is trickier because it's a class, and it's one that is a prominent part of the singletons API. Renaming it to SingType will necessarily break all existing SingKind instances, and you can't define type SingKind = SingType and then define instances for a type synonym SingKind. That being said, I'm not opposed to the idea of just making this name change (we've made breaking changes to singletons of similar caliber in the past), but we should make sure we're absolutely committed to this idea before we do so.

tomjaguarpaw commented 4 months ago

In the second paragraph did you mean SingKind/SingType rather than SameKind/SameType?

RyanGlScott commented 4 months ago

Oops, well spotted. I've edited my comment to fix this.

int-index commented 4 months ago

Sadly, the terminology with regards to terms, types, expressions, kinds, values, etc, is all over the place in Haskell.

The fundamental concepts at play do not cleanly map one-to-one with the words that are used to describe them. There are at least two meanings for "type", the meaning of "kind" changed in 8.0 (and to this day not everyone has caught up), the meaning of "value" is different in colloquial usage and literature, and so on and so forth.

I don't see much sense in fiddling with this: it's going to be subtly incorrect or ambiguous no matter what labels we pick. With regards to the specific change proposed here, I find myself leaning slightly against. Reason: it is too early to move away from the "kind" terminology as long as Demote is part of SingKind. Given some Sing @k t or SingKind k, we still use k in kind positions only (i.e. as a type-of-types) and apply Demote wherever we want a type (i.e. a type-of-terms).