Araq commented 5 years ago

Concept redesign

Goals:

Empower concepts to make generic code type-checkable at declaration time, not only at instantiation time.
Make concepts easier to understand and easier to implement.
Provide an escape hatch in generic code so that e.g. adding a debug echo or log statement does not require a far reaching type constraint addition.
Old code with unconstrainted generic parameters keeps to work.
Do not base concept's implementation on system.compiles which is under-specified in Nim, very tied to the current implementation, and finally, slow.

Non goals:

Support every little detail that the old concept design supported. For this we have the escape hatch plus the fact that you can leave your T underspecified.
Support accidentical features. ("Look! With this hack I can specify that the proc needs a cdecl calling convention!")
Support "cross parameter" constraints like "sizeof(a) == sizeof(b)". I have yet to see convincing examples of these cases. In the worst case, we could add enableif for this without bloating the concept's design.
Turning concepts into "interfaces". This can already be accomplished with macros. Having said that, since these concepts are declarative, they are easier to process with tooling and arguably easier to process with macros as well.

Atoms and containers

Concepts come in two forms: Atoms and containers. A container is a generic concept like Iterable[T], an atom always lacks any kind of generic parameter (as in Comparable).

Syntactically a concept consists of a list of proc and iterator declarations. There are 3 syntatic additions:

Self is a builtin type within the concept's body stands for the current concept.
each is used to introduce a generic parameter T within the concept's body that is not listed within the concept's generic parameter list.
either orelse is used to provide basic support for optional procs within a concept.

We will see how these are used in the examples.

Atoms


  type
    Comparable = concept # no T, an atom
      proc cmp(a, b: Self): int

    ToStringable = concept
      proc `$`(a: Self): string

    Hashable = concept
      proc hash(x: Self): int
      proc `==`(x, y: Self): bool

    Swapable = concept
      proc swap(x, y: var Self)

Self stands for the currently defined concept itself. It is used to avoid a recursion, proc cmp(a, b: Comparable): int is invalid.

Containers

A container has at least one generic parameter (most often called T). The first syntactic usage of the generic parameter specifies how to infer and bind T. Other usages of T are then checked to match what it was bound to.


  type
    Indexable[T] = concept # has a T, a collection
      proc `[]`(a: Self; index: int): T # we need to describe how to infer 'T'
      # and then we can use the 'T' and it must match:
      proc `[]=`(a: var Self; index: int; value: T)
      proc len(a: Self): int

Nothing interesting happens when we use multiple generic parameters:


  type
    Dictionary[K, V] = concept
      proc `[]`(a: Self; key: K): V
      proc `[]=`(a: var Self; key: K; value: V)

The usual ": Constraint" syntax can be used to add generic constraints to the involved generic parameters:


  type
    Dictionary[K: Hashable; V] = concept
      proc `[]`(a: Self; key: K): V
      proc `[]=`(a: var Self; key: K; value: V)

each T

Note: each T is currently not implemented.

each T allows to introduce generic parameters that are not part of a concept's generic parameter list. It is furthermore a special case to allow for the common "every field has to fulfill property P" scenario:


  type
    Serializable = concept
      iterator fieldPairs(x: Self): (string, each T)
      proc write(x: T)

  proc writeStuff[T: Serializable](x: T) =
    for name, field in fieldPairs(x):
      write name
      write field

either orelse

Note: either orelse is currently not implemented.

In generic code it's often desirable to specialize the code in an ad-hoc manner. system.addQuoted is an example of this:


  proc addQuoted[T](dest: var string; x: T) =
    when compiles(dest.add(x)):
      dest.add(x)
    else:
      dest.add($x)

If we want to describe T with a concept we need some way to describe optional aspects. either orelse can be used:


  type
    Quotable = concept
      either:
        proc `$`(x: Self): string
      orelse:
        proc add(s: var string; elem: self)

  proc addQuoted[T: Quotable](s: var string; x: T) =
    when compiles(s.add(x)):
      s.add(x)
    else:
      s.add($x)

More examples

system.find

It's straight-forward:


  type
    Findable[T] = concept
      iterator items(x: Self): T
      proc `==`(a, b: T): bool

  proc find(x: Findable[T]; elem: T): int =
    var i = 0
    for a in x:
      if a == elem: return i
      inc i
    return -1

Sortable

Note that a declaration like


  type
    Sortable[T] = Indexable[T] and T is Comparable and T is Swapable

is possible but unwise. The reason is that Indexable either contains too many procs we don't need or accessors that are slightly off as they don't offer the right kind of mutability access.

Here is the proper definition:


  type
    Sortable[T] = concept
      proc `[]`(a: var Self; b: int): var T
      proc len(a: Self): int
      proc swap(x, y: var T)
      proc cmp(a, b: T): int

Concept matching

A type T matches a concept C if every proc and iterator header H of C matches an entity E in the current scope.

The matching process is forgiving:

If H is a proc, E can be a proc, a func, a method, a template, a converter or a macro. E can have more parameters than H as long as these parameters have default values. The parameter names do not have to match.
If H has the form proc p(x: Self): T then E can be a public object field of name p and of type T.
If H is an iterator, E must be an iterator too, but E's parameter names do not have to match and it can have additional default parameters.

Escape hatch

Generic routines that have at least one concept parameter are type-checked at declaration time. To disable type-checking in certain code sections an untyped block can be used:


proc sort(x: var Sortable) = 
   ... 
   # damn this sort doesn't work, let's find out why:
   untyped:
      # no need to change 'Sortable' so that it mentions '$' for the involved
      # element type!
      echo x[i], " ", x[j]

EDITED 2021/03/09 self was renamed to Self and is what the experimental implementation uses.

andreaferretti commented 5 years ago

I would like to understand how to write the concepts in Emmy under this proposal. Currently they look like

type
  AdditiveMonoid* = concept x, y, type T
    x + y is T
    zero(T) is T
  AdditiveGroup* = concept x, y, type T
    T is AdditiveMonoid
    -x is T
    x - y is T
  MultiplicativeMonoid* = concept x, y, type T
    x * y is T
    id(T) is T
  MultiplicativeGroup* = concept x, y, type T
    T is MultiplicativeMonoid
    x / y is T
  Ring* = concept type T
    T is AdditiveGroup
    T is MultiplicativeMonoid
  EuclideanRing* = concept x, y, type T
    T is Ring
    x div y is T
    x mod y is T
  Field* = concept type T
    T is Ring
    T is MultiplicativeGroup

I guess they would become like this?

type
  AdditiveMonoid* = concept
    proc `+`(x, y: self): self
    proc zero(T: typedesc[self]): self
  AdditiveGroup* = concept
    self is AdditiveMonoid # will concept refinement be allowed?
    proc `-`(x: self): self
    proc `-`(x, y: self): self
  MultiplicativeMonoid* = concept
    proc `*`(x, y: self): self
    proc id(T: typedesc[self]): self
  MultiplicativeGroup* = concept
    self is MultiplicativeMonoid
    proc `/`(x, y: self): self
  Ring* = AdditiveGroup and MultiplicativeMonoid
  EuclideanRing* = concept
    self is Ring
    proc `div`(x, y: self): self
    proc `mod`(x, y: self): self
  Field* = Ring and MultiplicativeGroup

Araq commented 5 years ago

I guess they would become like this?

Yes.

self is AdditiveMonoid # will concept refinement be allowed?

I thought about it and we could support it via type AdditiveGroup = concept of AdditiveMonoid. I don't know if it is important enough that we really need it.

mratsim commented 5 years ago

I like the proposal. Can it deal with recursivity? That's important for my needs (https://github.com/mratsim/Arraymancer/issues/299)

In Arraymancer I define a deep learning trainable layer like this but I don't use them at the moment as it's too brittle:

https://github.com/mratsim/Arraymancer/blob/bde79d2f73b71ece719526a7b39f03bb100784b0/src/nn_dsl/dsl_types.nim#L63-L81

type
  Variable[T] = object
    value: Tensor[T]
    gradient: Tensor[T]

  Tensor[T] = object
    data: seq[T]

  TrainableLayer*[TT] = concept layer
    block:
      var trainable: false
      for field in fields(layer):
        trainable = trainable or (field is Variable[TT])
      trainable

  Conv2DLayer*[TT] = object
    weight*: Variable[TT]
    bias*: Variable[TT]
  LinearLayer*[TT] = object
    weight*: Variable[TT]
    bias*: Variable[TT]
  GRULayer*[TT] = object
    W3s0*, W3sN*: Variable[TT]
    U3s*: Variable[TT]
    bW3s*, bU3s*: Variable[TT]
  EmbeddingLayer*[TT] = object
    weight*: Variable[TT]

#########
# sanity check

var convolution: Conv2DLayer[float32]
echo convolution is TrainableLayer[float32] # Why false?

#########
# user-defined

type
  MyNonLayer = object
    foo: int
    bar: float32
  MyLayer[TT] = object
    metadata: MyNonLayer
    weight: Variable[TT]

var x: MyNonLayer
var y: MyLayer[float32]

echo x is TrainableLayer[float32]
echo y is TrainableLayer[float32] # Why false

Araq commented 5 years ago

What does that even mean though? "Somewhere in the object type there must be a field of type Variable[T]"? How do you know which one? What if you have more than one?

zah commented 5 years ago

A) How come I must repeat in every discussion about concepts that any proposal must cover the requirements of associated types and constants? I guess it would have to be:

type Foo = concept
  proc ElemType(_: typedesc[self]): typedesc
  proc compressedSize(_: typedesc[self]): static[int]

I find it a bit funny how these result in syntax that is outlawed in regular Nim. And hey, why was proc chosen here? Isn't using func the recommended way for new Nim code?

B) A "container concept" must be able to infer static parameter values as well. For example

type FixedSizeMatrix[M, N: static[int]; T] = concept
  # How do I infer M and N here from a non-generic type?
  # In the current design it's done with:
  x.Rows == M
  x.Cols == N

Please not that Rows and Cols can be templates defined over a non-generic type in order to adapt it to the concept requirements:

template Rows(T: type Matrix4x4): int = 4

Having the FixedSizeMatrix concept above, how do I define a different concept called SquareMatrix (where the rows and columns must be the same)? There are some algorithms that work only on square matrices.

C) The VTable types are a very important feature in the current concepts design, but I would agree that they can be delivered as a library.

D) I'd really like to have the converter type classes though and these require custom code in semcall/sigmatch.

E) Regarding the main idea here, mainly the type checking of generic functions, I would approach this with a lot of caution. The internal compiler structures can be defined and derived from the current concept syntax. I think it makes a lot of sense to try to experiment with the approach before making any major decisions about removing the existing concepts-related code in the compiler.

In particular, a lot of attention has to be put into code like the following:

https://github.com/status-im/nim-beacon-chain/blob/master/beacon_chain/ssz/types.nim#L85-L101

func isFixedSize*(T0: type): bool {.compileTime.} =
  mixin toSszType, enumAllSerializedFields

  when T0 is openarray|Option|ref|ptr:
    return false
  else:
    type T = type toSszType(default T0)

    when T is BasicType:
      return true
    elif T is array:
      return isFixedSize(ElemType(T))
    elif T is object|tuple:
      enumAllSerializedFields(T):
        when not isFixedSize(FieldType):
          return false
      return true

Notice how the function accepts an underspecified type, but it discovers a more detailed type using the is operator. Then the function is able to use operations defined over this more detailed type. This is a very convenient construct enabled by the current lax rules of Nim and we should not lose it.

mratsim commented 5 years ago

What does that even mean though? "Somewhere in the object type there must be a field of type Variable[T]"? How do you know which one? What if you have more than one?

I don't care which field, but as long as there is a "Variable[T]" field, the user-defined type constitutes a TrainableLayer indeed. All deep-learning libraries work like that, except that they use dynamic dispatch and inheritance from a Layer class.

Araq commented 5 years ago

How come I must repeat in every discussion about concepts that any proposal must cover the requirements of associated types and constants?

Because it's covered by the proposal as you noticed yourself.

Isn't using func the recommended way for new Nim code?

func could be allowed but then require func'ness in the matching process. Though I think it doesn't matter. ;-)

Static T could be done like so:


type FixedSizeMatrix[M, N: static[int]; T] = concept
  proc rows(x: self): M
  proc cols(x: self): N

I'd really like to have the converter type classes though and these require custom code in semcall/sigmatch.

Well the proposal implies that we have dedicated, custom code for concept matching, moreso than currently.

I think it makes a lot of sense to try to experiment with the approach before making any major decisions about removing the existing concepts-related code in the compiler.

Of course.

In particular, a lot of attention has to be put into code like the following:

That code is not affected, under-constrained generics continue to exist and your isFixedSize does not use a concept. I should stress that only real concepts that use the concept keyword trigger the type-checking for generics. Of course, we could also come up with a different rule when exactly this checking is triggered, we need to tinker with it -- a lot.

zah commented 5 years ago

  proc rows(x: self): M
  proc cols(x: self): N

Hrm, you are using a static value as a return type. Seems like an irregularity to me.

My underspecified code shouldn't be ruled out as "non using concepts". I could have easily constrained the input type with a concept such as SszSeriazable (or any other type class for that matter). It's just a lucky coincidence that this particular example is not doing this. My general remark is that the pattern of using when X is Y inside generics is a very nice alternative to overloading in many situations and the type checking engine should accommodate it.

Araq commented 5 years ago

Well the eternal question is "is static[int] a type or a value?". ;-) My RFC assumes it's actually a type.

Araq commented 5 years ago

My underspecified code shouldn't be ruled out as "non using concepts". I could have easily required a concept such as SszSeriazable in place of the auto type there (or any other type class for that matter).

Well no, under this RFC it couldn't "easily require" such a thing. If you cannot be precise about the generic type constraints, don't fake it with an imprecise concept, use an unconstrained generic instead. If you don't like this, fine, but that it is what this RFC says.

zah commented 5 years ago

Well the eternal question is "is static[int] a type or a value?". ;-) My RFC assumes it's actually a type.

I've never understood why you insist that values bound to static parameters are "types". Can they be passed to functions accepting regular values - yes. When you initialize a constant from a static[int] parameter, what's the type of the constant - it's int. Where should be the compilation error in the following code:

proc foo(N: static int) =
  const N2 = N
  proc bar(a: N) = discard
  proc baz(a: N2) = discard

For me, the "eternal" question has a very definitive answer. The static parameters are values having tyStatic as a modifier (just like the var parameters are values having tyVar as a modifier).

zah commented 5 years ago

Here is some more fun. If we continue with the "static parameters are types" thinking, we reach the following:

type 3DTransform = concept
  self is FixedSizeMatrix 
  proc cols(x: self): 4
  proc rows(x: self): 4

Araq commented 5 years ago

Here is some more fun...

I'm not sure you're asking the right question. You are thinking about whether these alternative concepts can support what the current concepts can. But the real question is why does it matter to infer the value 4 in order to type-check a generic? And how often does it matter to be able to do this?

Araq commented 5 years ago

I don't care which field, but as long as there is a "Variable[T]" field, the user-defined type constitutes a TrainableLayer indeed. All deep-learning libraries work like that, except that they use dynamic dispatch and inheritance from a Layer class.

Why not use this:


type
  TrainableLayer[T] = concept
    proc variableT(x: self): Variable[T]

# implementation

type
  MyLayer = object
     f: Variable[int]

template variableT(x: MyLayer): Variable[int] = x.f

andreaferretti commented 5 years ago

Because there are many variables, not just one. One actually wants to express a (fixed shape) tree whose leaves are variables

Araq commented 5 years ago

Ok, how about:


type
  TrainableLayer = concept
    iterator variables(x: self): Variable[each T]

# implementation

type
  MyLayer = object
     f: Variable[int]

iterator variables(x: MyLayer): Variable[int] = yield x.f

andreaferretti commented 5 years ago

I guess that could work, but I leave the final word to @mratsim , who has actually tried to put the idea into practice

zah commented 5 years ago

But the real question is why does it matter to infer the value 4 in order to type-check a generic? And how often does it matter to be able to do this?

This example was not inferring 4, it was requiring it. A 3D transform is an affine transformation matrix with dimensions 4 by 4. There are various algorithms that work only on such matrices, so I would say it's a perfect example for a requirement that a generic 3G graphics library might have.

mratsim commented 5 years ago

It needs to support a variadic number of layers like

type
  Conv2DLayer*[TT] = object
    weight*: Variable[TT]
    bias*: Variable[TT]
  LinearLayer*[TT] = object
    weight*: Variable[TT]
    bias*: Variable[TT]
  GRULayer*[TT] = object
    W3s0*, W3sN*: Variable[TT]
    U3s*: Variable[TT]
    bW3s*, bU3s*: Variable[TT]
  EmbeddingLayer*[TT] = object
    weight*: Variable[TT]

User-defined

Residual NN layers: https://www.quora.com/How-does-deep-residual-learning-work

type
  ResidualBlock*[TT] = object
    conv1*: Conv2DLayer[TT]
    conv2*: Conv2DLayer[TT]

The concept should be able to match with TrainableLayer[TT] provided with this type definition. Anything that requires creating a proc or an additional level of indirection at the type level is a lot of friction compared to Python.

At most we can have this:

type
  ResidualBlock*[TT] = object
    conv1*{.trainable.}: Conv2DLayer[TT]
    conv2*{.trainable.}: Conv2DLayer[TT]

but I'm not sure how to make a concept look for a pragma.

Araq commented 5 years ago

It doesn't sound like we're gonna be able to type-check "somewhere in there Variable[TT] fields will be processed somehow" anytime soon. That doesn't mean the idea of type-checking generics with concepts is pointless... ;-)

dom96 commented 5 years ago

Nice to see that this is basically what I proposed in #167 + each, self, either orelse :). Of course, you've obviously thought far more about the semantics.

Personally I would still do away with the self, mainly because it's yet another built-in type and reusing the concept name makes more sense to me:

  type
    Comparable = concept # no T, an atom
      proc cmp(a, b: Comparable): int

    ToStringable = concept
      proc `$`(a: ToStringable): string

    Hashable = concept
      proc hash(x: Hashable): int
      proc `==`(x, y: Hashable): bool

    Swapable = concept
      proc swap(x, y: var Swapable)

To be honest I'm really after simplicity here, to me introducing any new custom syntax is going too far. Because of this I would simply omit each as well (it's a very specific case and I wonder how many other languages actually support it, Rust at least doesn't), why don't we start with the basics first?

For the elseor case I can see this being possible and quite intuitive too:

  type
    Appendable = concept
      proc add(s: var Appendable, elem: Appendable)

    ToStringable = concept
      proc `$`(x: ToStringable): string
    Quotable = Appendable or ToStringable

But all in all, I still prefer this over the current concept syntax even if it does include these additional elements.

krux02 commented 5 years ago

First of all, nice to see this work being done. At first sight it looks clean. Regarding the either/orelse, I don't think it should be supported, at least not because of the gives use case. Generally it is much better to just require one operation, in this case it would be add. Then, when a concept doesn't match because of a missing add, it is trivial to add this overload to the type. It is just for historic reasons that this is not done in the standard library.

type
  Quotable = concept
      proc add(s: var string; elem: self)
proc addQuoted[T: Quotable](s: var string; x: T) =
  s.add(x)

type
  MyType = object
var str: string
var mt: MyType
proc add(s: var string; elem: MyType) =
  s.add $elem
str.addQuoted(mt)  # without the previous declaration: error

zah commented 5 years ago

If we zoom out a bit, a concept can represents two distinct set of requirements:

1) Supported syntax requirements This is a set of required expressions that the compiler must be able to resolve in the context of a generic function after an overload has been selected. If we know this set, we can type check the definition of the generic function before it's instantiated.

2) Type predicates These are arbitrary facts that must be true for the input types before we consider them eligible for overload selection.

This proposal focuses entirely on the first type of requirements while ignoring the second type. By doing this, you only lose expressivity and nothing is gained in return. Gathering the set of syntax requirements is all you need to type check generics and it's a false assumption to believe that this is somehow hindered by the current approach to concepts (the set of requirements is well defined - it's exactly what would be needed in order to compile a generic function with the same body as the concept).

If one really insists on using more declarative syntax, I'd suggest at least the following additions:

Support of as a mechanism for refining concepts
Support == as a mechanism for inferring static parameter values
Support assert-like statements as a way to specify arbitrary predicates for the types
Support return-like statements as a way to define converter type classes (e.g. StringConvertible)

Some optional points to consider:

Consider using a different keyword such as callable to reduce the proc/template confusion
Think about how custom error messages can be delivered
Think about conditional compilation (when defined(foo): ...)

Araq commented 5 years ago

By doing this, you only lose expressivity and nothing is gained in return.

Er, it allows us to avoid the guesswork we do in the "generic prepass" with its ideas of "bind" vs "mixin" symbols. That is a gain. Also, proper type-checking for generics is done by C#, Scala, Rust, etc -- the C++'s template solution so far only has been copied by Nim and D, both languages which lacked an expert on type theory for their initial designs.

These are arbitrary facts that must be true for the input types before we consider them eligible for overload selection.

Yeah, well, "arbitrary facts" are dangerous. And as I said, if the concept cannot describe the type precisely, there is always the under-specified generic T that can be used.

If one really insists on using more declarative syntax, I'd suggest at least the following additions: ...

Yeah, agreed, especially inferring static values needs to get special support.

zah commented 5 years ago

By "facts", I meant the predicates that the existing concepts support which doesn't necessarily concern the availability of certain overloads. Requiring a square matrix is an example for this. I meant that by not supporting such predicates you only lose expressivity.

Er, it allows us to avoid the guesswork we do in the "generic prepass" with its ideas of "bind" vs "mixin" symbols. That is a gain. Also, proper type-checking for generics is done by C#, Scala, Rust, etc -- the C++'s template solution so far only has been copied by Nim and D, both languages which lacked an expert on type theory for their initial designs.

I also tried to explain that the desired type checking is possible with or without such predicates. It's also possible with the current concepts, so this message is a bit of a straw man argument.

andreaferretti commented 5 years ago

Requiring a square matrix is an example for this. I meant that by not supporting such predicates you only lose expressivity.

My standard techniques to do this is to create a distinct type with private fields, and ensure that the only way to construct it is via some function that checks the predicate. Some sugar for this would go a long way

sighoya commented 4 years ago

As someone who isn't familiar with Nim, what does each T mean? Why is each T marked in one signature and in the other not?

Further, is there any chance of nominal concepts ala type implements concept? I could imagine that it is easier for the user to know when a type implements some concept and also for the compiler to know that the type is compatible to the concept after typechecking the implementation once instead to do it every time at the call site.

gemath commented 4 years ago

I like the proposal except for these issues:

either orelse is yet another conditional syntax beside if and when.
It seems to limit itself to defining API requirements and looks like an interface, but doesn't want to be mistaken for one.
Not differentiating between a publicly accessible object field and a unary proc with the same name is not always what's intended.

These could be solved with some modifications:

New rule: a concept is a list of expressions which must all evaluate to true at compile time for a type to match.
Introduction of an implemented macro which takes the block of proc and iterator declarations which is the concept body in the original proposal as an argument. It returns true if a type matches the argument as specified in the OP, with the following exception:
A public object field no longer matches a unary proc declaration of the same name.

Code example:

  type
    Dictionary[K, V] = concept
      implemented:
        proc `[]`(a: self; key: K): V
        proc `[]=`(a: var self; key: K; value: V)

      # either orelse is no longer needed
      implemented( proc capacity(d: self): int ) or
        implemented( proc size(d: self): int ) or self.hasField(size, int)

      # any other compile-time test, if necessary even
      compiles(someExpression)

      # ... or some other awesome test that doesn't even exist now.

Less compact than the OP, but with less non-obvious rules and much more flexible.

Araq commented 4 years ago

either orelse is yet another conditional syntax beside if and when.

Well it's not the same as when. Using the same keywords for everything is faking simplicity.

It seems to limit itself to defining API requirements and looks like an interface, but doesn't want to be mistaken for one.

No idea what that means. How else would you limit generic types?

Not differentiating between a publicly accessible object field and a unary proc with the same name is not always what's intended.

Why not?

Less compact than the OP, but with less non-obvious rules and much more flexible.

So effectively you want the old concepts that we already have. But I gave you a thumbs up, it's growing on me.

gemath commented 4 years ago

Well it's not the same as when. Using the same keywords for everything is faking simplicity.

Valid point. What I meant was: it would be better to avoid a situation where another conditional construct is even needed.

From the OP:

Syntactically a concept consists of a list of proc and iterator declarations.

As I understand it, this would limit concepts to type matching by API (independently of genericity). I would like to be able to specify criteria like "name of the type starts with an 'X'" or something wild I cannot even think of now.

Not differentiating between a publicly accessible object field and a unary proc with the same name is not always what's intended.

Why not?

Imagine we want to use procs from the API of a type as callbacks, e.g. Field access is not what we want then. Plus, it would be a non-obvious thing people "just have to know".

So effectively you want the old concepts that we already have.

Flexibility-wise, yes. Syntactically, I like your proposal better. I'm sure there were good reasons for making concept syntax as it is, but using types for parameter values like in someProc(string) is int just feels wrong to me now.

Another difference between existing concepts and my proposal is the test for the statements in a concept body: right now it is "it compiles and if it has a bool value, that value is true", so I never really know whether a statement doesn't compile because I screwed it up or because the test works as intended. My proposal drops the "it compiles" part, where needed I want to use compiles explicitly.

Araq commented 4 years ago

Imagine we want to use procs from the API of a type as callbacks, e.g. Field access is not what we want then. Plus, it would be a non-obvious thing people "just have to know".

That's a valid point.

My proposal drops the "it compiles" part, where needed I want to use compiles explicitly.

Yes, it's nice.

zah commented 4 years ago

Another difference between existing concepts and my proposal is the test for the statements in a concept body: right now it is "it compiles and if it has a bool value, that value is true", so I never really know whether a statement doesn't compile because I screwed it up or because the test works as intended. My proposal drops the "it compiles" part, where needed I want to use compiles explicitly.

Any concepts design needs to consider how the user is going to get diagnostics for non-matching concepts. The current implementation hasn't done enough to make this easy for the user. My long term vision is that the IDE itself will offer an interactive version of the explain feature that will be able to tell you precisely why a concept wasn't matched (you place your cursor over a particular call-site and you ask nimsuggest to provide an explanation).

I don't see how introducing an explicit compiles section helps. Please describe the error message that the user will be seeing for some specific non-matching concepts and then try to answer why the same message cannot be delivered with the current syntax.

But I like the fact that we hear more voices supporting the importance of the predicates within the concepts. This was one my main points in the discussions above.

gemath commented 4 years ago

My long term vision is that the IDE itself will offer an interactive version of the explain feature that will be able to tell you precisely why a concept wasn't matched

No doubt explain can deliver any amount of information about why a match failed. My problem is that the way how this info is provided breaks basic compiler behavior: if I misspell a proc name, I want the compiler to abort with an error, no matter whether the typo occurred in a concept body or elsewhere.

As for IDE support: I'm all for it, but don't make it a requirement for handling basic compiler behavior like error reporting. I wouldn't want to deal with special cases in build tools / CI runs because the compiler doesn't error out on a simple typo sometimes (i.e. when the typo occurs in a concept body).

I don't see how introducing an explicit compiles section helps.

It doesn't. Using compiles explicitly is not a goal in my proposal, it's a consequence of the relaxed matching rule for types against concept body expressions, which is what I really want.

Hell, if I'm honest I want a concept to be nothing more than a compile-time filter proc the compiler uses for type matching. The downside would be that we would need and operators between the body statements, the upside would be that your concept-local types and constants could occur naturally in the body of that proc.

But I like the fact that we hear more voices supporting the importance of the predicates within the concepts.

Yes, a narrower concept definition could close the door on powerful stuff we don't even think of now.

zah commented 4 years ago

There have been proposals for defining the concepts as plain old predicate functions applied to the matched type, but this misses another important point. In generic concepts such Container[T] or Enumerable[T], the concept body is used to infer the T parameter:

proc max[T](it: Enumerable[T]): T =
  var max = low(T)
  for val in items(it):
    ...

This form of inference can introduce relations and dependencies between multiple required procs in the interface. It's hard and awkward to express this with a simple predicate function.

In any concept proposal, typos are hard to report automatically (unless you go for heuristics such as detecting the typos with metrics such as Levenshtein distance). The simple reason for this is that it's appropriate to report errors only when there wasn't any match at the call site. Since it's perfectly normal for the concepts to not match, all failures have to be ignored when we have a match. The concepts are often used to describe more specific overloads and thus a more general matching overload can hide the typos in a more specific concept. I claim that there is no difference between the more declarative style suggested in this proposal and the current syntax when it comes to producing error messages.

The IDE can help here by offering mouse-over tooltips revealing which overload is selected at a particular call site (or though the go-to-definition feature). Some heuristic warnings are also possible - e.g. warn about a concept that was nearly matched, or an overload that was never matched.

gemath commented 4 years ago

@zah Read up on some of the other concept related issues. In addition to your comment that gives:

in a concept predicate, the difference between "typo in a proc name which should make the compiler error out" and "proc which doesn't exist in a specific calling context, but might well in another" exists only in my head and is undetectable to the compiler.
these discussions seem to regularly evolve into you having to defend the requirements of a wider concept definition (WC) against those who favor a more narrow concept definition (NC) and people like me who have a wish list of concept features but no deep understanding of Nims type resolution algorithm (sry for that).
we're in a vicious circle right now: fully functional concepts would benefit the source code of the compiler itself, I imagine by making it more concise and therefore more maintainable / extensible, but it is already too complex to fix the existing WC implementation. We have a number of smart people here, and the fact that none of them really touched the many concept related issues makes me doubt that this is merely a matter of throwing dev time at it. A bit scary.

So, could an NC like the one proposed in the OP, using a different name than "concept",

substantially benefit the compiler source code
later be deprecated, automatically translating NCs to WCs ("real" concepts), when WCs are (EDIT: fully) implemented in the compiler ?

zah commented 4 years ago

There is another way to deal with typos that I forgot to mention. For a lack of better name, I call it "explicit membership check". After you define your type and its operations, you can say the equivalent of:

assert MyType is Enumerable[int] # actual syntax may be slightly different

This will force the compiler into the error-producing mode and any typos will be properly reported.

The existing implementation of concepts is not complex to fix. It only requires someone to put the hours into working on it. Ideally, this someone would be me, but some external factors beyond my control currently prevent me from doing this. I'm deeply sorry for this.

disruptek commented 4 years ago

 type
    Quotable = concept
      either:
        proc `$`(x: self): string
      orelse:
        proc add(s: var string; elem: self)

  proc addQuoted[T: Quotable](s: var string; x: T) =
    when compiles(s.add(x)):
      s.add(x)
    else:
      s.add($x)

I'm not a fan of either/orelse. Five lines of code turned into eleven. What did we gain from this?

Clyybber commented 4 years ago

FWIW here is a prototype of this implemented macro outlined here https://github.com/nim-lang/RFCs/issues/168#issuecomment-593410686:

macro implemented(d: untyped): untyped =
  proc implementedImpl(d: NimNode): NimNode =
    result = nnkCall.newNimNode()
    result.add d[0]
    for i in 1..<d[3].len:
      for j in 0..<d[3][i].len - 2:
        result.add d[3][i][^2]
    if d[3][0].kind != nnkEmpty:
      result = nnkInfix.newTree(
        ident"is",
        result,
        d[3][0]
      )
  case d.kind
  of nnkStmtList, nnkStmtListExpr:
    result = d
    for i, p in d:
      result[i] = implementedImpl(p)
  of nnkProcDef:
    result = implementedImpl(d)
  else: assert false

expandMacros:
  type C = concept self
    implemented:
      proc cmp(a, b: self): int
      proc something(a: self)

https://play.nim-lang.org/#ix=2vOs

codehz commented 3 years ago

I think self should be avoid, or at least can be renamed in syntax... or it will cause confuse when people use a tuple[self: X] or proc(self: X) (consider self is not a keyword or reserved, so it can be freely used in anywhere that expected a nnkIdent) (and I don't think add more and more words into keyword list is a good idea)

ghost commented 3 years ago

@codehz FWIW https://github.com/nim-lang/Nim/pull/15251/ uses Self instead of self so your issue won't really be a problem. (the compiler also checks that the Self is the one it expects to see - https://github.com/nim-lang/Nim/blob/devel/compiler/concepts.nim#L36)

ornamentist commented 3 years ago

I guess they would become like this?

Yes.

self is AdditiveMonoid # will concept refinement be allowed?

I thought about it and we could support it via type AdditiveGroup = concept of AdditiveMonoid. I don't know if it is important enough that we really need it.

Would Emmy style concepts be possible without some kind of concept refinement?

Araq commented 3 years ago

Is there more to concept refinement than duplicating the concepts constraints (aka the concept's body) in a different concept?

ornamentist commented 3 years ago

I don't understand Nim well enough yet to answer that question unfortunately.

If refinement is not available though I'd like to avoid the situation where a "chain" of refined concepts has the same block of constraints copy-pasted in at every member of the chain.

For example, in a library like Emmy I could have concepts for Magma, Semigroup, Monoid, Group, AbelianGroup, Ring, CommutativeRing and Field, each of which is refinement of the previous concept. The Magma constraints would be repeated seven times, the Semigroup constraints six times etc across the chain, with a corresponding increase in maintenance and readability issues.

I'm not sure if that's how the new concepts proposal would work in practice though?

Araq commented 3 years ago

Copy and paste can always be avoided via Nim's template mechanism anyway.

sighoya commented 3 years ago

Why do copy & pasting at all, doesn't it suffice to point just to the constraints of the concepts that need to be satisfied/implemented too?

Question:

Is a concept a compiled time only abstraction or could we use them for boxing too, i.e. we know there is a type satisfying a concept, but we don't know what it is, and we may want to change this type at runtime. Also, concept refinements would imply a subtype relationship.
Does Self refer to the implementing type of concept or to the concept type itself. I find the former better in builder patterns and operations requiring the same underlying type as it is the case for type safe equality.

andreaferretti commented 3 years ago

I agree that refinement is not strictly needed, and copy/paste can be avoided via templates. I guess one could also implement refinement via a type macro, that just substitutes the body of the refined concept in the right spot. Still, conceptually is not bad to have it, if it can be supported with low cost.

konsumlamm commented 3 years ago

This is just a collection of my thoughts about concepts so far:

I find the argument that refinement is not needed, because it can be done with templates a bit weird. Yes, you could copy/paste all the declarations by defining them inside a template, but that would look something like

template common() = ...

type
  A = concept
    common()
  B = concept
    common()
    ...

which isn't very ergonomic imo and it should be really easy to implement in the compiler (just leave the copy/paste to the compiler, or point to the parent concept).

Another thing that I find a bit weird is that proc/func means "any routine" in the context of a concept. I'd prefer some special syntax (routine test(a: int)? test(a: int)?) or that any routine declaration (i.e. func, proc, template, macro etc.) is allowed and they each mean "this or something less powerful", so that func only accepts funcs, proc accepts procs and funcs, template accepts templates, procs and funcs, and so on.

Regarding associated types and constants (as they exist in Rust, for example), I'd love to see them, but I'm not sure how that's supposed to work, since items aren't bound to types in Nim (refs #380). I tried to simulate them using a proc that takes a typedesc[Self] parameter, but that doesn't seem to work currently:

type
  Associated = concept
    proc T(self: typedesc[Self]): typedesc
    proc A(self: typedesc[Self]): static[int]

proc T(self: typedesc[int]): typedesc = bool
proc A(self: typedesc[int]): static[int] = 42

static: assert int is Associated # Error

Finally, I think when should be supported in a concept body, like it is possible in other type definitions (objects, enums, ...).

DanielSherlock commented 2 years ago

@dom96 on 10 Sep 2019:

For the elseor case I can see this being possible and quite intuitive too:

  type
    Appendable = concept
      proc add(s: var Appendable, elem: Appendable)

    ToStringable = concept
      proc `$`(x: ToStringable): string
    Quotable = Appendable or ToStringable

I like this option, where the syntax is kept as unexceptional as possible, a lot. On refining concepts, would a similar approach do the trick?

(modifying the example given by @andreaferretti)

  type
    AdditiveMonoid* = concept
      proc `+`(x, y: Self): Self
      proc zero(T: typedesc[Self]): Self
    AdditiveGroup* = AdditiveMonoid and concept
      proc `-`(x: Self): Self
      proc `-`(x, y: Self): Self

konsumlamm commented 2 years ago

I like this option, where the syntax is kept as unexceptional as possible, a lot. On refining concepts, would a similar approach do the trick?
  type
    AdditiveMonoid* = concept
      proc `+`(x, y: Self): Self
      proc zero(T: typedesc[Self]): Self
    AdditiveGroup* = AdditiveMonoid and concept
      proc `-`(x: Self): Self
      proc `-`(x, y: Self): Self

What's wrong with just using of? It is already used for inheritance (which is similar), so it would be even less exceptional:


type
  AdditiveMonoid* = concept
    proc `+`(x, y: Self): Self
    proc zero(T: typedesc[Self]): Self
  AdditiveGroup* = concept of AdditiveMonoid
    proc `-`(x: Self): Self
    proc `-`(x, y: Self): Self

DanielSherlock commented 2 years ago

What's wrong with just using of? It is already used for inheritance (which is similar), so it would be even less exceptional:

I find nothing wrong with of at all; as you say it does echo inheritance for objects, which is nice and consistent. It just seemed like not everyone above liked the idea of adding an inheritance-like feature for concepts "just" to save some repeated declarations (which Nim's templates would be happy to help with).

My thinking with and was that, like or, I'd expect it to be useful in its own right. Also, as soon as you have the ability to check that an instantiated type matches both concepts, we might want to do something like:

  type
    AdditiveMonoid* = concept
      proc `+`(x, y: Self): Self
      proc zero(T: typedesc[Self]): Self
    AdditiveGroupMonoidDiff = concept
      proc `-`(x: Self): Self
      proc `-`(x, y: Self): Self
    AdditiveGroup* = AdditiveMonoid and AdditiveGroupMonoidDiff

...but then why not skip separately declaring the awkwardly-named ...Diff concept, and just write it directly in the and expression? It feels very natural (to me!) to be able to just substitute the name for the definition like that.

It's perhaps also worth mentioning refining more than one concept at once. This is natural (and inevitable) with and. In fact, using again part of @andreaferretti's example above (this time quoting directly), we have:

    Field* = Ring and MultiplicativeGroup

Here, Field refines MultiplicativeMonoid twice: once via Ring and once via MultiplicativeGroup. Of course, the concepts are just checking that the suggested instantiated type satisfies certain conditions - we don't even have default implementations of the given procs - so if it satisfies MultiplicativeGroup once it'll satisfy it again! Perhaps the ease with which we can refine multiple concepts is enough to choose a different syntax than of... though I don't expect that will always be fully apparent immediately?

nim-lang / RFCs

Concepts and type-checking generics #168

Concept redesign

Atoms and containers

Atoms

Containers

each T

either orelse

More examples

system.find

Sortable

Concept matching

Escape hatch

User-defined