Attach routines to concepts

metagn commented 1 year ago

This was sitting in my Obsidian notes for a while so might have some issues.

Abstract

Allow declaring routines with an attached (atomic) concept/interface type that they implement for certain types.

Motivation

This is much the same problem as #380: a type can be in scope without the accompanying implementations of common interfaces like $ or hash. This breaks generic procs that need those implementations in scope in order to call them. Instead of treating this as a scoping problem, we could use the type system to link the implementations to atomic descriptions of the interfaces. The language can already express such descriptions with the "atomic" concept types added in 1.6.

type Hashable = concept
  proc hash(x: Self): Hash

In addition to this, there is the problem of concepts depending on specific routine names. This can cause problems when different concepts use the same name for different intended behavior. It would be nice to include disambiguating information at the declaration site for situations like these.

Description

Say we have a Hashable concept as above. Then, allow writing something like (syntax for this whole post is temporary):

proc hash(x: T): Hash for Hashable =
  ...

# or

proc Hashable.hash(x: T): Hash =
  ...

# the dot version might be misleading because we should still be able to use the proc without qualifying
# but the idea is that it's "the `hash` proc from the `Hashable` concept"

The conditions for this to compile are:

- this is an implementation of a component routine of the attached concept type (in this case hash) for some type T
- at least one of the following holds: the concept type is declared in the current scope/package; the type T implementing the concept is a nominal type declared in the current scope/package; or T is a type "containing" such a nominal type (e.g. ref T, seq[T], Atomic[T], (int, T))
  - if the proc is generic, this also extends to generic constraints, i.e. proc hash[T: Nominal](x: seq[T])
  - type classes like Table[int | float, T], seq[T and Comparable], seq[T | U] where U is another such nominal type are fine, but Table[int | T, string], seq[not T] are not
- this is not a redefinition in the "namespace" of the concept type

This proc can be used like any other proc, with a few additional behaviors:

- this proc is added to the "namespace" of the attached concept, i.e. stored in an internal list of procs attached to the concept type
- procs with the same signature but different implementations can be defined for different concept types

Then, when we do something like Hashable.hash(x), the hash overloads in the "namespace" of Hashable are considered as well as the procs in scope to find a matching overload of hash for x, with the procs in scope receiving priority (the syntax might be misleading for this though).

proc `[]`[K, V](t: Table[K, V], k: K): V =
  let h = Hashable.hash(k)
  ...

How this is different from #380:

- Both attaching and using procs with this is explicit rather than automatic. This is a minor productivity hit but helps with clarity.
- By using concepts, we don't need as many special rules for which type the proc is implemented for. So it doesn't have to be the first parameter, can be a nested complex type/typeclass etc.
  - We can also have "default" implementations at the concept declaration.
- Pretty compatible with the existing overload mechanism and symbol resolution in generics, shouldn't be difficult to implement or impact compiler performance. Also dead easy to cache.

The compiler can even make use of this to simplify and expose certain builtin overloading mechanisms if we declare special concepts in system that the compiler recognizes. Use cases might be:

- Implicit items and pairs iterators
- Lifetime hooks (=destroy, =copy, =sink...), without the "scope" behavior
- Converters in general, maybe saving a keyword
- default if we needed it

Yes this is like traits/typeclasses in other languages. But the meat of the feature is still Nim's overloading. We don't need it for every place that we use overloads.

Examples

# system.nim
type Stringable* = concept
  proc `$`(x: Self): string

proc echo*(args: varargs[typed, Stringable.`$`]) {.magic: Echo.}

# a.nim
type Foo* = ref object

proc `$`*(x: Foo): string for Stringable =
  "Foo"

# b.nim
import a # note `Foo` is not exported

proc getFoo*(): Foo =
  Foo()

# c.nim
import b # note `Foo` is not imported

echo getFoo() # Foo

arnetheduck commented 1 year ago

fwiw, we quite consistently use a similar style already to avoid polluting the global namespace, ie:

type MyType = object

proc someHelper(T: type MyType, ...) = ...

The canonical example here is Option and its some, which takes up valuable global namespace real estate. In results, this becomes Opt.some. A similar syntax for concepts could be considered as well:

proc hash(_: concept Hashable, ...): ...

that said, the above syntaxes are fine too, writing this merely to highlight the similarity / option ;)
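
The typedesc-parameter style described above can be sketched in current Nim roughly as follows. This is only an illustration: `Pretty`, `Debug`, `Foo` and `render` are hypothetical names, and plain object types stand in for the concepts, since passing a concept this way is not something the thread confirms works.

```nim
type
  # Hypothetical marker types used only as namespaces.
  Pretty = object
  Debug = object

  Foo = object   # hypothetical example type
    id: int

# Same name and same "real" parameter, disambiguated by the typedesc parameter.
proc render(T: type Pretty, x: Foo): string =
  "Foo #" & $x.id

proc render(T: type Debug, x: Foo): string =
  "Foo(id: " & $x.id & ")"

# Method call syntax makes each call read like a qualified name.
echo Pretty.render(Foo(id: 1))
echo Debug.render(Foo(id: 1))
```

The two overloads never compete with each other or with an unqualified `render(x)`, which is the namespace benefit this style is used for.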

metagn commented 1 year ago

Well the way the compiler would interpret it is more like (Hashable.hash)(x) (basically at some point it would behave the same as an nkOpenSymChoice). It would be fine for the declaration, but I don't think it would be possible to support calling like hash(Hashable, x) since we also want procs without the concept attachment to work. Something like this also should work:

var s = @[3, 2, 1]
s.sort(Comparable.cmp)

I didn't consider that this could break the method call syntax for calls that take a concept type for the first parameter as above, at least for the cases where we can find such a hash in the concept (like how obj.closureField() always calls the object field). Forcing (Hashable.hash)(x) would still break Hashable.hash if it had a meaning before. Using an operator other than . would break custom definitions of that operator. I believe :: is reserved, but it would be weird to introduce it here with no other use cases.

So I'm not sure, either we break existing code that contains Hashable.hash where Hashable is a new style concept and hash is the name of a routine in its declaration, or there's some less breaking syntax for "symbols related to the hash in Hashable" that I can't think of. In any case the workaround is hash(Hashable) anyway.
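
For comparison with the s.sort(Comparable.cmp) example above, this is roughly what the same call looks like in current Nim, where the comparison proc has to be named and instantiated explicitly. A small sketch, using only `std/algorithm` and `system.cmp`:

```nim
import std/algorithm

var s = @[3, 2, 1]
# `system.cmp[int]` explicitly instantiates the generic `cmp` for `int`.
s.sort(system.cmp[int])
echo s
```

Under the proposal, Comparable.cmp would presumably stand for the whole set of attached cmp overloads, so the element type would not need to be spelled out at the call site.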

Araq commented 1 year ago

> So it doesn't have to be the first parameter, ...

In the most recent version of my proposal I also removed the restriction to the first parameter as it works better without it.

Araq commented 1 year ago

> This is a minor productivity hit but helps with clarity.

I don't agree. I think the "clarity" is already there in the existing code when you write proc hash(x: Foo): Hash and it's just that the language design misinterprets it slightly. ;-)

Also, when you do attach procs to types my way then you can also clean up the whole scope override story that happens in generics but shouldn't. But that's a story for another day...

metagn commented 1 year ago

What I meant was that it helps with clarity under the current scoping rules. When you write proc hash, it's not always clear that a call to some generic proc like tables.[] only works because of the locally defined hash, or that it uses hash at all. $ might be a better example of this than hash.

That being said, this isn't necessarily mutually exclusive with #380 or dependent on the scoping rules. If it still has additional benefits, like binding to complex types, it might be useful.
