fantasyland / static-land

Specification for common algebraic structures in JavaScript based on Fantasy Land
MIT License
772 stars 41 forks source link

Better term for what we currently call Type or Type Dictionary #32

Closed rpominov closed 7 years ago

rpominov commented 7 years ago

"Type" is the central artefact in the spec, but the name is confusing. The word "type" already has a different meaning in JavaScript, we have types like Array, number, Promise etc.

Fantasy-land recently introduced term "Type Representative". And if we use it in static-land that will be an improvement, but still not good enough. The word "representative" is good, this thing certainly represents something. But does it represent a type? For example what if we have two representatives:

const Addition = {
  empty() {
    return 0
  },
  concat(a, b) {
    return a + b
  },
}

const Multiplication = {
  empty() {
    return 1
  },
  concat(a, b) {
    return a * b
  },
}

What types they represent exactly? Do they both represent number? That doesn't sound right. What they actually represent are collections of operations on type number. Addition represent one such collection and Multiplication another one. So a correct name would be "Collection of Operations Representative", but that is a terrible name of course.

As far as I understand in OCaml similar things are called "modules", that would be a good name if that word didn't already have a different meaning in JS community.

Any suggestions on a better name?

This was previously discussed here https://github.com/rpominov/static-land/pull/30#issuecomment-255787085 .

/cc @tel @isiahmeadows

gcanti commented 7 years ago

Algebras

Monoid is an "algebra" and Addition: Monoid<number> is an "instance of the monoid algebra on numbers" (or "a monoid on numbers" for short)

// "algebra"
interface Monoid<A> {
  empty(): A;
  concat(x: A, y: A): A;
}

// "instance"
const Addition: Monoid<number> = {
  empty() {
    return 0
  },
  concat(a, b) {
    return a + b
  }
}
rpominov commented 7 years ago

Yeah, except what we call "Type" will be "Instance of Algebra" in that case. We actually introduce term "Algebra" already (that definition also has issues, btw).

We can go that way. One problem though is that we would have to introduce "Algebra" before "Instance of Algebra". I think if we introduce concepts in that order it will be harder to understand for newcomers.

Would be better if we found a term that doesn't depend on "Algebra", something that captures just a collection of some operations, not necessarily with some laws.

rpominov commented 7 years ago

Also I want to find a term that will work with https://github.com/fantasyland/fantasy-land/issues/199 . Not sure "Instance of Algebra" will work well there.

rpominov commented 7 years ago

One more thought: this thing not even an instance of an Algebra, but collection of instances of Algebras. For example this:

const Addition = {
  empty() {
    return 0
  },
  concat(a, b) {
    return a + b
  }
}

contains an instance of Monoid and an instance of Semigroup.

So at first level it's just a collection of operations. Then we can find instances of some algebras in that collection of operations.

gcanti commented 7 years ago

An algebra (or algebraic structure) is some set A along with some number of functions closed over the set. For example a monoid algebra with the carrier set A is the following tuple

(A, *, u)

where

*: A x A -> A
u ∈ A

such that * is associative and u is a unit, that is u * a = a * u = a for all a ∈ A.

An example of monoid is (N, +, 0) where N is the set of all numbers, +: N x N -> N is the usual addition and u = 0.

Now if we rename * to concat and u to empty, a generic monoid over A is represented by the following interface

// this encodes the (A, *, u) monoid algebra
interface Monoid<A> {
  empty(): A;
  concat(x: A, y: A): A;
}

And a particular instance of monoid over numbers is the following instance

// this precisely encodes the `(N, +, 0)` monoid
const Addition: Monoid<number> = {
  empty() {
    return 0
  },
  concat(a, b) {
    return a + b
  }
}

IMO it's not just a collection of operations, otherwise we should also say that (A, *, u) is just a tuple and not a monoid algebra. It's our interpretation and the presence of laws (that unfortunately we can't encode easily) that makes the difference.

For what concerns monoids and semigroups, a monoid is a semigroup with a unit element, then we can write

interface Semigroup<A> {
  concat(x: A, y: A): A;
}

interface Monoid<A> extends Semigroup<A> {
  empty(): A;
}

Groups are monoids with inverses

interface Group<A> extends Monoid<A> {
  inverse(a: A): A;
}

such that etc...

We can say that (N, +, 0) is a semigroup even if we know that is also a monoid. Likewise we can say Addition is a semigroup even if we know that is also a monoid.

rpominov commented 7 years ago

I totally understand what you are saying. We just can have more that one perspective on this, and we should choose a perspective that suits the best for communicating meaning of specification to readers.

For example let's consider this object:

const Addition = {
  empty() {
    return 0
  },
  concat(a, b) {
    return a + b
  },
  subtract(a, b) {
    return a - b
  },
}

First of all it's just a collection of operations, but on the other hand some operations in it form instances of algebras:

image

Now we need to decide what perspective is more useful in spec. What we should try to capture in the name? The name should work well when we reference these objects in specification. We should be able to say "value must have a property that points to corresponding %our_name_for_these_things%", so we should also focus on what these things are in JS (objects interpreted as dictionaries). Also there is a concern about order in which we introduce concepts that I've mentioned before.

Having all this in mind, I think we should capture in name a lower level essence of these things — just a collection of operations. Then in further section of specification we introduce algebras, that are requirements for these objects (like "it must have that method" and "this must equal that").

dead-claudia commented 7 years ago

Maybe "Type instance"? In Fantasy Land's spec, it would unambiguously from context represent the actual instance, not the type itself. In Static Land's spec, it would unambiguously from context represent what is currently known as the "type object".

I will point out that if you want to unite the two specifications, it's much easier to work with something a little more like Static Land than Fantasy Land, because from a library writer's standpoint, it's equally easy to implement either version, and in practice this is hardly useful at the API surface when implementing either spec.

Also, many libraries already somewhat conform to Static Land in its current form by pure coincidence, because static methods are frequently easier to use than instance methods (no this binding, etc.), and a good example of this is most.js.

As for wrapping a Static Land type to an FL type, it's also much easier than vice versa (meaning, it's more flexible), and currying is much easier than with FL methods:

function toFLType(T) {
  function F(value) { this._ = T.of(value) }
  function poly(ms, f) { ms.split(" ").filter(m => T[m] != null).forEach(f) }
  poly("empty of chainRec", m => F[m] = T[m].bind(T))
  poly(
    "equals concat map bimap promap ap chain reduce extend extract traverse",
    m => F.prototype[m] = function (...xs) { return T[m](this._, ...xs) }
  )
  return F
}
tel commented 7 years ago

I think "dictionary" is an appropriate term here. It's worth noting that for generality it's a good idea to drop the notion that these things are tied to a single type. For instance, here's a signature (psuedocode) describing a perfectly reasonable "set dictionary" that uses two types

signature
  type S
  type E

  insert : (S, E) -> S
  find : (S, E) -> Boolean
  count : S -> Int
end

Multiple types (to follow gcanti's examples) are covered by algebras via "indexed algebras" but that starts to get pretty hairy. Dictionary is used a little in Js contexts, but I don't think it sufferd as much from semantic collision as "module" does.

It also lets you say "this dictionary relates types S and E" which is a little clearer than "this type relates types S and E".

As a final note, it's worth keeping "runtime types" away from "compile-time types" to steer away from eventual confusions around coercions and type reflection.

tel commented 7 years ago

Also, you'll probably want to start considering the notion of "signatures", too. You use these terms like this

etc etc

dead-claudia commented 7 years ago

@tel SGTM. Might seem a little foreign initially to those less familiar with ML terminology, though.


I'm fairly neutral on this, so as long as it ends up something more descriptive than "Type Representative" and less confusing than "Typeclass" (vs JS classes), I'm good. :smile:

tel commented 7 years ago

@isiahmeadows I agree, but this whole concept will take a little time to digest anyway, I feel. It turns OO/prototyping inside out anyway.

Also, I want to go on record and say that "type representative" is outright wrong. These "dictionaries" or whatever aren't representing types. A type representative is a runtime value with the same structure as (a subset of) the static types of your program. They can be used for typesafe coercions and many of the use cases for runtime reflection without actually having reflection.

rpominov commented 7 years ago

Maybe "Type instance"?

I think we should remove word "type" from the name. This thing is not a type. Consider Addition/Multiplication example from the first comment. We have two almost identical objects they have same methods and work with same type, the only difference is the logic inside these methods. It's more like a strategy pattern from OOP.

I will point out that if you want to unite the two specifications, it's much easier to work with something a little more like Static Land than Fantasy Land

💯 See https://github.com/fantasyland/fantasy-land/issues/199#issuecomment-257142130 . I think the new specification should be structured more like static-land, with additional section about spec compatible values closer to the end. Although that's just my opinion.

I think "dictionary" is an appropriate term here.

"Dictionary" sounds like a too broad term. Although maybe I just need a little getting used to.

As a final note, it's worth keeping "runtime types" away from "compile-time types" to steer away from eventual confusions around coercions and type reflection.

Yea, I was thinking that maybe we should introduce our own definition of "type". Which should be more like Flow/TypeScript types. In these systems a type can be for example "a set of all numbers and string 'five'" (type Foo = number | "five"). Or we can have a parametric type in it that looks like this type Maybe<T> = {tag: 'just', value: T} | {tag: 'nothing'}, which in common JavaScript notion of types would be just "objects". The definition of type from Flow/TypeScript looks like something we can work with.

Also, you'll probably want to start considering the notion of "signatures", too. You use these terms like this

Could you define "signatures"? Not sure that I understand what you mean. Btw, are these terms "signatures" and "indexed algebras" coming from OCaml?

Also, I want to go on record and say that "type representative" is outright wrong.

I think it works fine in current fantasy-land, because there it always tied to a single type as halve of the methods still live in prototype. But when we can have more than one dictionary for one type this doesn't work as well.


One more thing I want to mention. Whatever name we choose we should keep in mind that these dictionaries may contain not only functions in the future. For example empty() / zero() may become just values, also we may need to add meta algebras property because detecting supported algebras based on presence of methods may not always work. So a future dictionary may look something like this:

const Addition = {
  empty: 0,
  concat(a, b) {
    return a + b
  },
  algebras: ['Semigroup', 'Monoid'],
}

Looking at this I realize that the name "module" fits very well, maybe we should use it after all even though it already has another meaning.

rpominov commented 7 years ago

Here is a draft that I have in mind of how concepts can be introduced:

Static type

Here goes definition of type similar to what used in systems like Flow/TypeScript. It should allow types like type Foo = number | "five" or type Maybe<T> = {tag: 'just', value: T} | {tag: 'nothing'} and also have some notion of compile-time known types. So a "static type" is a compile-time known type that can be something like number | "five" etc.

Module (temporal name)

Module is an immutable dictionary (JavaScript object interpreted as dictionary and never mutated) that contains some functions and values. A module usually relates to some static type, which means that its functions work with values of a particular static type.

We shouldn't complicate thing with notion of modules that relate to more than one type ("set dictionary" example). We probably wont need something like this for the spec.

Algebra

Here we define an algebra as a set of restrictions for a Module. For example in order to support Semigroup algebra a module must contain concat method and equation S.concat(S.concat(a, b), c) ≡ S.concat(a, S.concat(b, c)) must hold.

tel commented 7 years ago

This is starting to sound really good. I'll reply on important quotes then summarize.

Yea, I was thinking that maybe we should introduce our own definition of "type". Which should be more like Flow/TypeScript types. In these systems a type can be for example...

This is more or less necessary, though it can be done implicitly for a long time. "Static type" is, more or less, "a formal description of what we know about a variable/name at compile time". It's probably a good idea to keep these vague, but also simple. TypeScript's singleton and union types such as number | "five" have intuitive basic behavior, but in terms of the logical system that they support ("what we know" implies logic, and, or, implication, all that jazz) they can get complicated.

Of course, that's why an informal specification of the formal language of "what we know about a name at compile time" is probably best. Let people argue over informal semantics to make it stronger over time.

That all said, the introduction of universal quantification and type functions is super, super, super useful. That's how we write a type like forall a . List a -> Maybe a or forall a . List a -> Int. It's worth taking a lot of care in restricting how rich a thing like Maybe a can be. If it's a completely general type function then you might be in for some difficulties as often we want inference to flow "backward" and completely general type functions break that.

Could you define "signatures"? Not sure that I understand what you mean.

Signature is nothing more than "type of module" in OCaml. For technical reasons, OCaml is made of two static languages. The first is "values and types" the second is "modules and signatures". Those pairings are completely analogous. So, for instance

signature COMPARE
  type T
  compare : T -> T -> Ordering
end

module IntOperations
  type T = Int
  compare : Int -> Int -> Ordering
  compare x y =  if (x <= y) then LT else (if (x == y) then EQ else GT)
  inc : Int -> Int
  inc x = x + 1
end

module IntCompare = IntOperations : COMPARE

defines a signature describing what modules implement an idea of comparison and a module which defines some operations on Int. Then, finally, we "restrict" our operations to a more specific module which only knows that IntOperations describes at least what COMPARE asks for.

These are all OCaml, SML, ML ideas. Indexed algebras are not. They're just an abstract algebra mathematical idea.

I think [using type representatives] works fine in current fantasy-land, because there it always tied to a single type as halve of the methods still live in prototype.

Yep! In that way, they really are serving a totally different purpose from these "dictionaries"/"modules" described here. They're useful as they bring the idea of passing in static types to runtime functions, but basically orthogonal.

Whatever name we choose we should keep in mind that these dictionaries may contain not only functions in the future.

This is also very true. What's perhaps more interesting is whether you believe they must contain values (functions or otherwise) which relate to the type at hand. For instance, consider this signature

signature
  type T
  inc : T -> T
  step : Int
end

Is step allowed? What does it mean?


A module usually relates to some static type

I've been using a trick this whole time where I actually introduce a type variable within the module. This means that signatures "bind" anonymous types and types of values together and then modules "inform" how those relations exist. Consider MONOID

signature MONOID
  type T
  zero : T
  mult : T -> T -> T
end

This let's me talk about the abstract idea of a Monoid in the same language for which I might write a module specific to addition

module Addition
  type Int
  zero = 0
  mult x y = x + y
end

module AdditionIsMONOID = Addition : MONOID

it also generalizes neatly to the modules which involve multiple types scenario.

Algebra

While this is true, you're going to quickly run into an issue as Algebra means both the concrete algebra (0, +) over integers as well as the general algebra (zero, mult) over some set implying a monoid structure. It's a really popular "pun" in mathematics to talk about those two things with the same word, casually ignoring the fact that they operate at two very distinct levels of generality.

Module/Signature resolves this in the same way that type systems always do.


Generally, I think this is all heading in a very positive direction. I think the idea for the project itself is great—many of the challenges of applying mathematical thinking to OO cannot really be resolved, so these external "modules"/"dictionaries"/whatever move to a richer way of talking about programming.

I've obviously been a big proponent of stealing from OCaml. Perhaps less obviously, I'm actually very happy with striking out in a new direction somewhat as Javascript is not OCaml and doesn't need to solve all of the same problems. Furthermore, it has new opportunities arising from not actually being type checked. We're not bound to strict inference algorithm restrictions as humans do our inference and are often dogged and ingenious.

But it's very worth studying the practical elements of OCaml's system since they've solved a lot of problems and landed in a very nice place. The more that experience can be absorbed the further this project can go.

gcanti commented 7 years ago

functions vs values

For example empty() / zero() may become just values

Note that in PureScript

class Semigroup m <= Monoid m where
  mempty :: m

mempty is polymorphic.

With functions, implementers may choose to be very strict in terms of type checking. Here's the empty definition of Maybe in flow-static-land

// Monoid.js
export function empty<A>(): Maybe<A> {
  return Nothing
}

then you get an error if you try to mess with different monoid units

import type { Monoid } from 'flow-static-land/lib/Monoid'
import { stringMonoid } from 'flow-static-land/lib/Monoid'
import type { Maybe } from 'flow-static-land/lib/Maybe'
import * as maybe from 'flow-static-land/lib/Maybe'

function f(m: Monoid<*>) {
  return m.empty()
}

const u1 = f(maybe.getMonoid(stringMonoid))
const u2: Maybe<number> = u1 // <= error: string. This type is incompatible with number

or you can choose to be more loose

// Monoid.js
export function empty(): Maybe<any> {
  return Nothing
}

If empty is a value the only choice is using any

// Monoid.js
export const empty: Maybe<any> = Nothing

meta field algebra

also we may need to add meta algebras

Not sure why, for runtime type checking? It seems an implementation detail, AFAIK flow-static-land doesn't need that.


concepts

Types are just sets (Flow/TypeScript syntax would be nice since we are targeting JavaScript)

An algebra Alg is a bunch of sets (A, B, ...) + a bunch of operations (op1, op2, ...) + a bunch of laws

Alg = (A, B, ..., op1, op2, ...) + laws

Note that in general an algebra is related to one or more sets (e.g. a vector space).

An algebra (OCaml signature, PureScript type class) is naturally represented and implemented in Flow/TypeScript (the blessed syntax?) as an interface (though there's a problem with higher kinded types)

interface Alg<A, B, ...> { // <= sets go here
  op1(); // <= operations go here
  op2();
  ...
}

An intance (OCaml module, PureScript instance) of the algebra Alg is an object implementing the algebra interface

tel commented 7 years ago

@gcanti Interfaces don't quite work the same as signatures since they cannot have type members. Type parameters do get a lot of the same work done, though.

Type's aren't really just sets. They cannot be for foundational reason and shouldn't be for logical reasons. But that said, informally it's not too bad to unify the two.

rpominov commented 7 years ago

@tel

[Definition of static types] is more or less necessary, though it can be done implicitly for a long time.

That makes sense, but still we may need more elaborated definitions to explain some concepts. But to try to avoid it and to use a vague definition for now sounds like a good idea.

Is step allowed? What does it mean?

This have to be allowed. Even now we require something like this for a lot of methods. Our methods may accept other types as arguments and produce other types. For example Setoid's equals produces booleans.

Maybe we should just omit notion of relation to a type in the "Module" section, and say that module is just a collection of values and functions. We can then introduce relation to a type in algebras. For example "module S supports Semigroup related to numbers if it has concat method that accepts numbers and the law S.concat(S.concat(a, b), c) ≡ S.concat(a, S.concat(b, c)) stands."

While this is true, you're going to quickly run into an issue as Algebra means both the concrete algebra (0, +) over integers as well as the general algebra (zero, mult) over some set implying a monoid structure. [...] Module/Signature resolves this

I still like "algebra" more. It's more familiar to people. Also signatures can be confused with method signatures, although we can call them "module signatures". And what also good about using "algebras" is that we can unite requirements for presence of methods and for laws in a single concept. But we also can define signatures in a way that includes laws. So yeah, worth considering!

@gcanti

If empty is a value the only choice is using any

Yeah, I'll remember that if we ever consider that change. Not considering it now, just wanted to mention as an example.

also we may need to add meta algebras

Not sure why, for runtime type checking? It seems an implementation detail, AFAIK flow-static-land doesn't need that.

This may be needed in libraries like Ramda. Also for automatic generation of derived methods. So yeah, basically runtime detection. Can be useful sometimes.

Types are just sets

One more issue with defining types as just sets is that we might need notion of types with type arguments, so we might need a more elaborated definition.

tel commented 7 years ago

@rpominov

This have to be allowed.

I agree, but it's an interesting case. In particular, I think mines more interesting than your setoid example—it has to do with values which are associated with a "module" that is associate with a type but these values have types which do not reference the associated type at all. Weird!

I agree that "module signature" and "function signature" can be a little confusing. That said, I typically don't think of function signatures as real things since the deeper idea is that "all names have types" and "functions are represented by names" so functions have types just the same as 1, "foo", and true do.

rpominov commented 7 years ago

So getting back to the original issue. We have "Dictionary" and "Module" as suggested names. Maybe someone has any other ideas?

We could also go with "Algebra" as @gcanti suggests, but this would require turning structure of specification inside out, and I just don't have a good picture of how it will look like.

rpominov commented 7 years ago

I've created a PR https://github.com/rpominov/static-land/pull/41 . Decided to use "module" after all, I think there shouldn't be a lot of confusion.