Araq opened this issue 5 years ago
I would like to understand how to write the concepts in Emmy under this proposal. Currently they look like this:
type
  AdditiveMonoid* = concept x, y, type T
    x + y is T
    zero(T) is T

  AdditiveGroup* = concept x, y, type T
    T is AdditiveMonoid
    -x is T
    x - y is T

  MultiplicativeMonoid* = concept x, y, type T
    x * y is T
    id(T) is T

  MultiplicativeGroup* = concept x, y, type T
    T is MultiplicativeMonoid
    x / y is T

  Ring* = concept type T
    T is AdditiveGroup
    T is MultiplicativeMonoid

  EuclideanRing* = concept x, y, type T
    T is Ring
    x div y is T
    x mod y is T

  Field* = concept type T
    T is Ring
    T is MultiplicativeGroup
I guess they would become like this?
type
  AdditiveMonoid* = concept
    proc `+`(x, y: self): self
    proc zero(T: typedesc[self]): self

  AdditiveGroup* = concept
    self is AdditiveMonoid # will concept refinement be allowed?
    proc `-`(x: self): self
    proc `-`(x, y: self): self

  MultiplicativeMonoid* = concept
    proc `*`(x, y: self): self
    proc id(T: typedesc[self]): self

  MultiplicativeGroup* = concept
    self is MultiplicativeMonoid
    proc `/`(x, y: self): self

  Ring* = AdditiveGroup and MultiplicativeMonoid

  EuclideanRing* = concept
    self is Ring
    proc `div`(x, y: self): self
    proc `mod`(x, y: self): self

  Field* = Ring and MultiplicativeGroup
I guess they would become like this?
Yes.
self is AdditiveMonoid # will concept refinement be allowed?
I thought about it and we could support it via `type AdditiveGroup = concept of AdditiveMonoid`. I don't know if it is important enough that we really need it.
I like the proposal. Can it deal with recursion? That's important for my needs (https://github.com/mratsim/Arraymancer/issues/299).

In Arraymancer I define a deep learning trainable layer like this, but I don't use them at the moment as it's too brittle:
type
  Variable[T] = object
    value: Tensor[T]
    gradient: Tensor[T]

  Tensor[T] = object
    data: seq[T]

  TrainableLayer*[TT] = concept layer
    block:
      var trainable = false # was `var trainable: false`, which doesn't compile
      for field in fields(layer):
        trainable = trainable or (field is Variable[TT])
      trainable

  Conv2DLayer*[TT] = object
    weight*: Variable[TT]
    bias*: Variable[TT]

  LinearLayer*[TT] = object
    weight*: Variable[TT]
    bias*: Variable[TT]

  GRULayer*[TT] = object
    W3s0*, W3sN*: Variable[TT]
    U3s*: Variable[TT]
    bW3s*, bU3s*: Variable[TT]

  EmbeddingLayer*[TT] = object
    weight*: Variable[TT]
#########
# sanity check
var convolution: Conv2DLayer[float32]
echo convolution is TrainableLayer[float32] # Why false?

#########
# user-defined
type
  MyNonLayer = object
    foo: int
    bar: float32

  MyLayer[TT] = object
    metadata: MyNonLayer
    weight: Variable[TT]

var x: MyNonLayer
var y: MyLayer[float32]
echo x is TrainableLayer[float32]
echo y is TrainableLayer[float32] # Why false
What does that even mean though? "Somewhere in the object type there must be a field of type Variable[T]"? How do you know which one? What if you have more than one?
A) How come I must repeat in every discussion about concepts that any proposal must cover the requirements of associated types and constants? I guess it would have to be:
type Foo = concept
  proc ElemType(_: typedesc[self]): typedesc
  proc compressedSize(_: typedesc[self]): static[int]
I find it a bit funny how these result in syntax that is outlawed in regular Nim. And hey, why was `proc` chosen here? Isn't using `func` the recommended way for new Nim code?
B) A "container concept" must be able to infer static parameter values as well. For example:

type FixedSizeMatrix[M, N: static[int]; T] = concept
  # How do I infer M and N here from a non-generic type?
  # In the current design it's done with:
  x.Rows == M
  x.Cols == N
Please note that `Rows` and `Cols` can be templates defined over a non-generic type in order to adapt it to the concept requirements:

template Rows(T: type Matrix4x4): int = 4
Having the `FixedSizeMatrix` concept above, how do I define a different concept called `SquareMatrix` (where the rows and columns must be the same)? There are some algorithms that work only on square matrices.
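(For reference, a minimal sketch of how such a square-matrix requirement reads as a predicate in the current concept style; the identifiers are illustrative, not from an actual library:)

type
  SquareMatrix* = concept x
    x.Rows == x.Cols # a type predicate, not an overload requirement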
C) The VTable types are a very important feature in the current concepts design, but I would agree that they can be delivered as a library.
D) I'd really like to have the converter type classes though and these require custom code in semcall/sigmatch.
E) Regarding the main idea here, namely the type checking of generic functions, I would approach this with a lot of caution. The internal compiler structures can be defined and derived from the current concept syntax. I think it makes a lot of sense to try to experiment with the approach before making any major decisions about removing the existing concepts-related code in the compiler.
In particular, a lot of attention has to be put into code like the following:
https://github.com/status-im/nim-beacon-chain/blob/master/beacon_chain/ssz/types.nim#L85-L101
func isFixedSize*(T0: type): bool {.compileTime.} =
  mixin toSszType, enumAllSerializedFields

  when T0 is openarray|Option|ref|ptr:
    return false
  else:
    type T = type toSszType(default T0)

    when T is BasicType:
      return true
    elif T is array:
      return isFixedSize(ElemType(T))
    elif T is object|tuple:
      enumAllSerializedFields(T):
        when not isFixedSize(FieldType):
          return false
      return true
Notice how the function accepts an underspecified type, but it discovers a more detailed type using the `is` operator. Then the function is able to use operations defined over this more detailed type. This is a very convenient construct enabled by the current lax rules of Nim and we should not lose it.
What does that even mean though? "Somewhere in the object type there must be a field of type Variable[T]"? How do you know which one? What if you have more than one?
I don't care which field, but as long as there is a "Variable[T]" field, the user-defined type constitutes a TrainableLayer indeed. All deep-learning libraries work like that, except that they use dynamic dispatch and inheritance from a Layer class.
How come I must repeat in every discussion about concepts that any proposal must cover the requirements of associated types and constants?
Because it's covered by the proposal as you noticed yourself.
Isn't using func the recommended way for new Nim code?
`func` could be allowed but then require `func`'ness in the matching process. Though I think it doesn't matter. ;-)
Static T could be done like so:
type FixedSizeMatrix[M, N: static[int]; T] = concept
  proc rows(x: self): M
  proc cols(x: self): N
I'd really like to have the converter type classes though and these require custom code in semcall/sigmatch.
Well, the proposal implies that we have dedicated, custom code for concept matching, more so than currently.
I think it makes a lot of sense to try to experiment with the approach before making any major decisions about removing the existing concepts-related code in the compiler.
Of course.
In particular, a lot of attention has to be put into code like the following:
That code is not affected: under-constrained generics continue to exist and your isFixedSize does not use a `concept`. I should stress that only real concepts that use the `concept` keyword trigger the type-checking for generics. Of course, we could also come up with a different rule for when exactly this checking is triggered; we need to tinker with it -- a lot.
proc rows(x: self): M
proc cols(x: self): N
Hrm, you are using a static value as a return type. Seems like an irregularity to me.
My underspecified code shouldn't be ruled out as "not using concepts". I could have easily constrained the input type with a concept such as SszSerializable (or any other type class for that matter). It's just a lucky coincidence that this particular example is not doing this. My general remark is that the pattern of using `when X is Y` inside generics is a very nice alternative to overloading in many situations and the type checking engine should accommodate it.
Well, the eternal question is "is `static[int]` a type or a value?" ;-) My RFC assumes it's actually a type.
My underspecified code shouldn't be ruled out as "not using concepts". I could have easily required a concept such as SszSerializable in place of the auto type there (or any other type class for that matter).
Well no, under this RFC it couldn't "easily require" such a thing. If you cannot be precise about the generic type constraints, don't fake it with an imprecise concept; use an unconstrained generic instead. If you don't like this, fine, but that is what this RFC says.
Well the eternal question is "is static[int] a type or a value?". ;-) My RFC assumes it's actually a type.
I've never understood why you insist that values bound to static parameters are "types". Can they be passed to functions accepting regular values - yes. When you initialize a constant from a `static[int]` parameter, what's the type of the constant - it's `int`. Where should the compilation error be in the following code:

proc foo(N: static int) =
  const N2 = N
  proc bar(a: N) = discard
  proc baz(a: N2) = discard
For me, the "eternal" question has a very definitive answer. The static parameters are values having `tyStatic` as a modifier (just like the `var` parameters are values having `tyVar` as a modifier).
Here is some more fun. If we continue with the "static parameters are types" thinking, we reach the following:
type Transform3D = concept
  self is FixedSizeMatrix
  proc cols(x: self): 4
  proc rows(x: self): 4
Here is some more fun...
I'm not sure you're asking the right question. You are thinking about whether these alternative concepts can support what the current concepts can. But the real question is: why does it matter to infer the value 4 in order to type-check a generic? And how often does it matter to be able to do this?
I don't care which field, but as long as there is a "Variable[T]" field, the user-defined type constitutes a TrainableLayer indeed. All deep-learning libraries work like that, except that they use dynamic dispatch and inheritance from a Layer class.
Why not use this:
type
  TrainableLayer[T] = concept
    proc variableT(x: self): Variable[T]

# implementation
type
  MyLayer = object
    f: Variable[int]

template variableT(x: MyLayer): Variable[int] = x.f
Because there are many variables, not just one. One actually wants to express a (fixed shape) tree whose leaves are variables.
Ok, how about:
type
  TrainableLayer = concept
    iterator variables(x: self): Variable[each T]

# implementation
type
  MyLayer = object
    f: Variable[int]

iterator variables(x: MyLayer): Variable[int] = yield x.f
I guess that could work, but I leave the final word to @mratsim, who has actually tried to put the idea into practice.
But the real question is why does it matter to infer the value 4 in order to type-check a generic? And how often does it matter to be able to do this?
This example was not inferring 4, it was requiring it. A 3D transform is an affine transformation matrix with dimensions 4 by 4. There are various algorithms that work only on such matrices, so I would say it's a perfect example of a requirement that a generic 3D graphics library might have.
It needs to support a variadic number of layers like:

type
  Conv2DLayer*[TT] = object
    weight*: Variable[TT]
    bias*: Variable[TT]

  LinearLayer*[TT] = object
    weight*: Variable[TT]
    bias*: Variable[TT]

  GRULayer*[TT] = object
    W3s0*, W3sN*: Variable[TT]
    U3s*: Variable[TT]
    bW3s*, bU3s*: Variable[TT]

  EmbeddingLayer*[TT] = object
    weight*: Variable[TT]
Residual NN layers: https://www.quora.com/How-does-deep-residual-learning-work
type
  ResidualBlock*[TT] = object
    conv1*: Conv2DLayer[TT]
    conv2*: Conv2DLayer[TT]
The concept should be able to match TrainableLayer[TT] given this type definition. Anything that requires creating a proc or an additional level of indirection at the type level is a lot of friction compared to Python.
At most we can have this:
type
  ResidualBlock*[TT] = object
    conv1* {.trainable.}: Conv2DLayer[TT]
    conv2* {.trainable.}: Conv2DLayer[TT]
but I'm not sure how to make a concept look for a pragma.
It doesn't sound like we're gonna be able to type-check "somewhere in there Variable[TT] fields will be processed somehow" anytime soon. That doesn't mean the idea of type-checking generics with concepts is pointless... ;-)
Nice to see that this is basically what I proposed in #167 + `each`, `self`, `either orelse` :). Of course, you've obviously thought far more about the semantics.

Personally I would still do away with the `self`, mainly because it's yet another built-in type and reusing the concept name makes more sense to me:
type
  Comparable = concept # no T, an atom
    proc cmp(a, b: Comparable): int

  ToStringable = concept
    proc `$`(a: ToStringable): string

  Hashable = concept
    proc hash(x: Hashable): int
    proc `==`(x, y: Hashable): bool

  Swapable = concept
    proc swap(x, y: var Swapable)
To be honest I'm really after simplicity here; to me, introducing any new custom syntax is going too far. Because of this I would simply omit `each` as well (it's a very specific case and I wonder how many other languages actually support it; Rust at least doesn't). Why don't we start with the basics first?

For the `either orelse` case I can see this being possible and quite intuitive too:
type
  Appendable = concept
    proc add(s: var Appendable, elem: Appendable)

  ToStringable = concept
    proc `$`(x: ToStringable): string

  Quotable = Appendable or ToStringable
But all in all, I still prefer this over the current concept syntax even if it does include these additional elements.
First of all, nice to see this work being done. At first sight it looks clean. Regarding either/orelse, I don't think it should be supported, at least not because of the given use case. Generally it is much better to just require one operation; in this case it would be `add`. Then, when a concept doesn't match because of a missing `add`, it is trivial to add this overload to the type. It is just for historic reasons that this is not done in the standard library.
type
  Quotable = concept
    proc add(s: var string; elem: self)

proc addQuoted[T: Quotable](s: var string; x: T) =
  s.add(x)

type
  MyType = object

var str: string
var mt: MyType

proc add(s: var string; elem: MyType) =
  s.add $elem

str.addQuoted(mt) # without the previous declaration: error
If we zoom out a bit, a concept can represent two distinct sets of requirements:

1) Supported syntax requirements. This is a set of required expressions that the compiler must be able to resolve in the context of a generic function after an overload has been selected. If we know this set, we can type check the definition of the generic function before it's instantiated.

2) Type predicates. These are arbitrary facts that must be true for the input types before we consider them eligible for overload selection.
This proposal focuses entirely on the first type of requirements while ignoring the second type. By doing this, you only lose expressivity and nothing is gained in return. Gathering the set of syntax requirements is all you need to type check generics and it's a false assumption to believe that this is somehow hindered by the current approach to concepts (the set of requirements is well defined - it's exactly what would be needed in order to compile a generic function with the same body as the concept).
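To make the two kinds of requirements concrete, here is a minimal sketch in the current concept syntax (the identifiers are illustrative, not taken from the proposal):

type
  # 1) a syntax requirement: this expression must compile with the right type
  Stringable = concept x
    $x is string

  # 2) a type predicate: an arbitrary compile-time fact must hold
  SmallType = concept x
    sizeof(x) <= 16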
If one really insists on using more declarative syntax, I'd suggest at least the following additions:
- `of` as a mechanism for refining concepts
- `==` as a mechanism for inferring static parameter values

Some optional points to consider:
- `callable` to reduce the proc/template confusion
- support for conditional requirements (e.g. `when defined(foo): ...`)
)By doing this, you only lose expressivity and nothing is gained in return.
Er, it allows us to avoid the guesswork we do in the "generic prepass" with its ideas of "bind" vs "mixin" symbols. That is a gain. Also, proper type-checking for generics is done by C#, Scala, Rust, etc. -- C++'s template solution has so far only been copied by Nim and D, both languages which lacked an expert on type theory for their initial designs.
These are arbitrary facts that must be true for the input types before we consider them eligible for overload selection.
Yeah, well, "arbitrary facts" are dangerous. And as I said, if the concept cannot describe the type precisely, there is always the under-specified generic `T` that can be used.
If one really insists on using more declarative syntax, I'd suggest at least the following additions: ...
Yeah, agreed, especially inferring static values needs to get special support.
By "facts", I meant the predicates that the existing concepts support which doesn't necessarily concern the availability of certain overloads. Requiring a square matrix is an example for this. I meant that by not supporting such predicates you only lose expressivity.
Er, it allows us to avoid the guesswork we do in the "generic prepass" with its ideas of "bind" vs "mixin" symbols. That is a gain. Also, proper type-checking for generics is done by C#, Scala, Rust, etc. -- C++'s template solution has so far only been copied by Nim and D, both languages which lacked an expert on type theory for their initial designs.
I also tried to explain that the desired type checking is possible with or without such predicates. It's also possible with the current concepts, so this message is a bit of a straw man argument.
Requiring a square matrix is an example for this. I meant that by not supporting such predicates you only lose expressivity.
My standard technique for this is to create a distinct type with private fields, and to ensure that the only way to construct it is via some function that checks the predicate. Some sugar for this would go a long way.
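A minimal sketch of that technique, with illustrative names (not from any existing library):

type
  Matrix* = object
    rows, cols: int # fields are not exported, so they stay private to this module
    data: seq[float]

  SquareMatrix* = distinct Matrix # obtainable only via toSquare below

proc toSquare*(m: Matrix): SquareMatrix =
  ## The only constructor: the squareness predicate is checked once, here.
  assert m.rows == m.cols, "matrix is not square"
  SquareMatrix(m)

proc trace*(m: SquareMatrix): float =
  ## Algorithms that only work on square matrices accept SquareMatrix.
  let b = Matrix(m)
  for i in 0 ..< b.rows:
    result += b.data[i * b.cols + i]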
As someone who isn't familiar with Nim, what does `each T` mean? Why is `each T` marked in one signature and in the other not?

Further, is there any chance of nominal concepts à la `type implements concept`? I could imagine that it is easier for the user to know when a type implements some concept, and also for the compiler to know that the type is compatible with the concept after type-checking the implementation once instead of doing it every time at the call site.
I like the proposal except for these issues:
- `either orelse` is yet another conditional syntax beside `if` and `when`.
- It seems to limit itself to defining API requirements and looks like an interface, but doesn't want to be mistaken for one.
- Not differentiating between a publicly accessible object field and a unary `proc` with the same name is not always what's intended.

These could be solved with some modifications:
- Every statement in the concept body has to evaluate to `true` at compile time for a type to match.
- An `implemented` macro which takes the block of `proc` and `iterator` declarations (the concept body in the original proposal) as an argument. It returns true if a type matches the argument as specified in the OP, with the following exception: a public object field does not match a `proc` declaration of the same name.

Code example:
type
  Dictionary[K, V] = concept
    implemented:
      proc `[]`(a: self; key: K): V
      proc `[]=`(a: var self; key: K; value: V)

    # either orelse is no longer needed
    implemented( proc capacity(d: self): int ) or
      implemented( proc size(d: self): int ) or self.hasField(size, int)

    # any other compile-time test, if necessary even
    compiles(someExpression)
    # ... or some other awesome test that doesn't even exist now.
Less compact than the OP, but with less non-obvious rules and much more flexible.
either orelse is yet another conditional syntax beside if and when.
Well, it's not the same as `when`. Using the same keywords for everything is faking simplicity.
It seems to limit itself to defining API requirements and looks like an interface, but doesn't want to be mistaken for one.
No idea what that means. How else would you limit generic types?
Not differentiating between a publicly accessible object field and a unary proc with the same name is not always what's intended.
Why not?
Less compact than the OP, but with less non-obvious rules and much more flexible.
So effectively you want the old concepts that we already have. But I gave you a thumbs up, it's growing on me.
Well it's not the same as when. Using the same keywords for everything is faking simplicity.
Valid point. What I meant was: it would be better to avoid a situation where another conditional construct is even needed.
From the OP:
Syntactically a concept consists of a list of proc and iterator declarations.
As I understand it, this would limit concepts to type matching by API (independently of genericity). I would like to be able to specify criteria like "name of the type starts with an 'X'" or something wild I cannot even think of now.
Not differentiating between a publicly accessible object field and a unary proc with the same name is not always what's intended.
Why not?
Imagine we want to use procs from the API of a type as callbacks, for example. Field access is not what we want then. Plus, it would be a non-obvious thing people "just have to know".
So effectively you want the old concepts that we already have.
Flexibility-wise, yes. Syntactically, I like your proposal better. I'm sure there were good reasons for making concept syntax as it is, but using types for parameter values like in someProc(string) is int
just feels wrong to me now.
Another difference between existing concepts and my proposal is the test for the statements in a concept body: right now it is "it compiles and if it has a `bool` value, that value is `true`", so I never really know whether a statement doesn't compile because I screwed it up or because the test works as intended. My proposal drops the "it compiles" part; where needed I want to use `compiles` explicitly.
Imagine we want to use procs from the API of a type as callbacks, for example. Field access is not what we want then. Plus, it would be a non-obvious thing people "just have to know".
That's a valid point.
My proposal drops the "it compiles" part; where needed I want to use compiles explicitly.
Yes, it's nice.
Another difference between existing concepts and my proposal is the test for the statements in a concept body: right now it is "it compiles and if it has a bool value, that value is true", so I never really know whether a statement doesn't compile because I screwed it up or because the test works as intended. My proposal drops the "it compiles" part; where needed I want to use compiles explicitly.
Any concepts design needs to consider how the user is going to get diagnostics for non-matching concepts. The current implementation hasn't done enough to make this easy for the user. My long term vision is that the IDE itself will offer an interactive version of the `explain` feature that will be able to tell you precisely why a concept wasn't matched (you place your cursor over a particular call-site and you ask nimsuggest to provide an explanation).
I don't see how introducing an explicit `compiles` section helps. Please describe the error message that the user will be seeing for some specific non-matching concepts and then try to answer why the same message cannot be delivered with the current syntax.
But I like the fact that we hear more voices supporting the importance of the predicates within the concepts. This was one of my main points in the discussions above.
My long term vision is that the IDE itself will offer an interactive version of the explain feature that will be able to tell you precisely why a concept wasn't matched
No doubt `explain` can deliver any amount of information about why a match failed. My problem is that the way this info is provided breaks basic compiler behavior: if I misspell a `proc` name, I want the compiler to abort with an error, no matter whether the typo occurred in a concept body or elsewhere.

As for IDE support: I'm all for it, but don't make it a requirement for handling basic compiler behavior like error reporting. I wouldn't want to deal with special cases in build tools / CI runs because the compiler doesn't error out on a simple typo sometimes (i.e. when the typo occurs in a concept body).
I don't see how introducing an explicit compiles section helps.
It doesn't. Using `compiles` explicitly is not a goal in my proposal; it's a consequence of the relaxed matching rule for types against concept body expressions, which is what I really want.

Hell, if I'm honest I want a concept to be nothing more than a compile-time filter `proc` the compiler uses for type matching. The downside would be that we would need `and` operators between the body statements; the upside would be that your concept-local types and constants could occur naturally in the body of that `proc`.
But I like the fact that we hear more voices supporting the importance of the predicates within the concepts.
Yes, a narrower concept definition could close the door on powerful stuff we don't even think of now.
There have been proposals for defining the concepts as plain old predicate functions applied to the matched type, but this misses another important point. In generic concepts such as `Container[T]` or `Enumerable[T]`, the concept body is used to infer the `T` parameter:
proc max[T](it: Enumerable[T]): T =
  var max = low(T)
  for val in items(it):
    ...
This form of inference can introduce relations and dependencies between multiple required procs in the interface. It's hard and awkward to express this with a simple predicate function.
In any concept proposal, typos are hard to report automatically (unless you go for heuristics such as detecting the typos with metrics such as Levenshtein distance). The simple reason for this is that it's appropriate to report errors only when there wasn't any match at the call site. Since it's perfectly normal for the concepts to not match, all failures have to be ignored when we have a match. The concepts are often used to describe more specific overloads and thus a more general matching overload can hide the typos in a more specific concept. I claim that there is no difference between the more declarative style suggested in this proposal and the current syntax when it comes to producing error messages.
The IDE can help here by offering mouse-over tooltips revealing which overload is selected at a particular call site (or though the go-to-definition feature). Some heuristic warnings are also possible - e.g. warn about a concept that was nearly matched, or an overload that was never matched.
@zah Read up on some of the other concept related issues. Together with your comment, that gives:
- in a concept predicate, the difference between "typo in a proc name which should make the compiler error out" and "proc which doesn't exist in a specific calling context, but might well in another" exists only in my head and is undetectable to the compiler.
- these discussions seem to regularly evolve into you having to defend the requirements of a wider concept definition (WC) against those who favor a more narrow concept definition (NC) and people like me who have a wish list of concept features but no deep understanding of Nim's type resolution algorithm (sorry for that).
- we're in a vicious circle right now: fully functional concepts would benefit the source code of the compiler itself, I imagine by making it more concise and therefore more maintainable / extensible, but it is already too complex to fix the existing WC implementation. We have a number of smart people here, and the fact that none of them has really touched the many concept related issues makes me doubt that this is merely a matter of throwing dev time at it. A bit scary.
So, could an NC like the one proposed in the OP, using a different name than "concept",
There is another way to deal with typos that I forgot to mention. For lack of a better name, I call it "explicit membership check". After you define your type and its operations, you can say the equivalent of:
assert MyType is Enumerable[int] # actual syntax may be slightly different
This will force the compiler into the error-producing mode and any typos will be properly reported.
The existing implementation of concepts is not complex to fix. It only requires someone to put the hours into working on it. Ideally, this someone would be me, but some external factors beyond my control currently prevent me from doing this. I'm deeply sorry for this.
type
  Quotable = concept
    either:
      proc `$`(x: self): string
    orelse:
      proc add(s: var string; elem: self)

proc addQuoted[T: Quotable](s: var string; x: T) =
  when compiles(s.add(x)):
    s.add(x)
  else:
    s.add($x)
I'm not a fan of `either/orelse`.
Five lines of code turned into eleven. What did we gain from this?
FWIW here is a prototype of the `implemented` macro outlined in https://github.com/nim-lang/RFCs/issues/168#issuecomment-593410686:
import std/macros # for NimNode, nnkCall, expandMacros, ...

macro implemented(d: untyped): untyped =
  proc implementedImpl(d: NimNode): NimNode =
    # build a call from the proc head: name(paramType, ...)
    result = nnkCall.newNimNode()
    result.add d[0]
    for i in 1..<d[3].len:
      for j in 0..<d[3][i].len - 2:
        result.add d[3][i][^2]
    # if there is a return type, wrap the call in an `is` check
    if d[3][0].kind != nnkEmpty:
      result = nnkInfix.newTree(
        ident"is",
        result,
        d[3][0]
      )

  case d.kind
  of nnkStmtList, nnkStmtListExpr:
    result = d
    for i, p in d:
      result[i] = implementedImpl(p)
  of nnkProcDef:
    result = implementedImpl(d)
  else: assert false

expandMacros:
  type C = concept self
    implemented:
      proc cmp(a, b: self): int
      proc something(a: self)
I think `self` should be avoided, or at least it should be possible to rename it in the syntax... otherwise it will cause confusion when people use a `tuple[self: X]` or `proc (self: X)` (consider that `self` is not a keyword or reserved word, so it can be freely used anywhere an nnkIdent is expected). (And I don't think adding more and more words to the keyword list is a good idea.)
@codehz FWIW https://github.com/nim-lang/Nim/pull/15251/ uses `Self` instead of `self`, so your issue won't really be a problem. (The compiler also checks that the `Self` is the one it expects to see - https://github.com/nim-lang/Nim/blob/devel/compiler/concepts.nim#L36)
I guess they would become like this?
Yes.
self is AdditiveMonoid # will concept refinement be allowed?
I thought about it and we could support it via `type AdditiveGroup = concept of AdditiveMonoid`. I don't know if it is important enough that we really need it.
Would Emmy-style concepts be possible without some kind of concept refinement? Is there more to concept refinement than duplicating the concept's constraints (aka the concept's body) in a different concept? I don't understand Nim well enough yet to answer that question, unfortunately.

If refinement is not available, though, I'd like to avoid the situation where a "chain" of refined concepts has the same block of constraints copy-pasted in at every member of the chain.
For example, in a library like Emmy I could have concepts for Magma, Semigroup, Monoid, Group, AbelianGroup, Ring, CommutativeRing and Field, each of which is refinement of the previous concept. The Magma constraints would be repeated seven times, the Semigroup constraints six times etc across the chain, with a corresponding increase in maintenance and readability issues.
I'm not sure if that's how the new concepts proposal would work in practice though?
Copy and paste can always be avoided via Nim's template mechanism anyway.
Why do any copy & pasting at all? Doesn't it suffice to just point to the constraints of the concepts that need to be satisfied/implemented too?
Question: Is a concept a compile-time-only abstraction, or could we use them for boxing too? I.e., we know there is a type satisfying a concept, but we don't know what it is, and we may want to change this type at runtime. Also, concept refinements would imply a subtype relationship.

Does `Self` refer to the implementing type of the concept or to the concept type itself? I find the former better in builder patterns and in operations requiring the same underlying type, as is the case for type-safe equality.
I agree that refinement is not strictly needed, and copy/paste can be avoided via templates. I guess one could also implement refinement via a type macro that just substitutes the body of the refined concept in the right spot. Still, conceptually it is not bad to have it, if it can be supported at low cost.
This is just a collection of my thoughts about concepts so far:
I find the argument that refinement is not needed, because it can be done with templates a bit weird. Yes, you could copy/paste all the declarations by defining them inside a template, but that would look something like
template common() = ...

type
  A = concept
    common()
  B = concept
    common()
  ...
which isn't very ergonomic imo and it should be really easy to implement in the compiler (just leave the copy/paste to the compiler, or point to the parent concept).
Another thing that I find a bit weird is that `proc`/`func` means "any routine" in the context of a concept. I'd prefer some special syntax (`routine test(a: int)`? `test(a: int)`?) or that any routine declaration (i.e. `func`, `proc`, `template`, `macro` etc.) is allowed and they each mean "this or something less powerful", so that `func` only accepts funcs, `proc` accepts procs and funcs, `template` accepts templates, procs and funcs, and so on.
Regarding associated types and constants (as they exist in Rust, for example), I'd love to see them, but I'm not sure how that's supposed to work, since items aren't bound to types in Nim (refs #380). I tried to simulate them using a proc that takes a `typedesc[Self]` parameter, but that doesn't seem to work currently:
type
  Associated = concept
    proc T(self: typedesc[Self]): typedesc
    proc A(self: typedesc[Self]): static[int]

proc T(self: typedesc[int]): typedesc = bool
proc A(self: typedesc[int]): static[int] = 42

static: assert int is Associated # Error
Finally, I think `when` should be supported in a concept body, like it is possible in other type definitions (`object`s, `enum`s, ...).
@dom96 on 10 Sep 2019:
For the `either orelse` case I can see this being possible and quite intuitive too:

type
  Appendable = concept
    proc add(s: var Appendable, elem: Appendable)
  ToStringable = concept
    proc `$`(x: ToStringable): string
  Quotable = Appendable or ToStringable
I like this option, where the syntax is kept as unexceptional as possible, a lot. On refining concepts, would a similar approach do the trick?
(modifying the example given by @andreaferretti)
type
  AdditiveMonoid* = concept
    proc `+`(x, y: Self): Self
    proc zero(T: typedesc[Self]): Self
  AdditiveGroup* = AdditiveMonoid and concept
    proc `-`(x: Self): Self
    proc `-`(x, y: Self): Self
I like this option, where the syntax is kept as unexceptional as possible, a lot. On refining concepts, would a similar approach do the trick?

type
  AdditiveMonoid* = concept
    proc `+`(x, y: Self): Self
    proc zero(T: typedesc[Self]): Self
  AdditiveGroup* = AdditiveMonoid and concept
    proc `-`(x: Self): Self
    proc `-`(x, y: Self): Self
What's wrong with just using `of`? It is already used for inheritance (which is similar), so it would be even less exceptional:
type
  AdditiveMonoid* = concept
    proc `+`(x, y: Self): Self
    proc zero(T: typedesc[Self]): Self
  AdditiveGroup* = concept of AdditiveMonoid
    proc `-`(x: Self): Self
    proc `-`(x, y: Self): Self
What's wrong with just using `of`? It is already used for inheritance (which is similar), so it would be even less exceptional:
I find nothing wrong with `of` at all; as you say it does echo inheritance for objects, which is nice and consistent. It just seemed like not everyone above liked the idea of adding an inheritance-like feature for concepts "just" to save some repeated declarations (which Nim's templates would be happy to help with).

My thinking with `and` was that, like `or`, I'd expect it to be useful in its own right. Also, as soon as you have the ability to check that an instantiated type matches both concepts, we might want to do something like:
type
  AdditiveMonoid* = concept
    proc `+`(x, y: Self): Self
    proc zero(T: typedesc[Self]): Self
  AdditiveGroupMonoidDiff = concept
    proc `-`(x: Self): Self
    proc `-`(x, y: Self): Self
  AdditiveGroup* = AdditiveMonoid and AdditiveGroupMonoidDiff
...but then why not skip separately declaring the awkwardly-named `...Diff` concept, and just write it directly in the `and` expression? It feels very natural (to me!) to be able to just substitute the name for the definition like that.
It's perhaps also worth mentioning refining more than one concept at once. This is natural (and inevitable) with `and`. In fact, using again part of @andreaferretti's example above (this time quoting directly), we have:
Field* = Ring and MultiplicativeGroup
Here, `Field` refines `MultiplicativeMonoid` twice: once via `Ring` and once via `MultiplicativeGroup`. Of course, the concepts are just checking that the suggested instantiated type satisfies certain conditions - we don't even have default implementations of the given procs - so if it satisfies `MultiplicativeGroup` once it'll satisfy it again! Perhaps the ease with which we can refine multiple concepts is enough of a reason to choose a different syntax than `of`... though I don't expect that will always be fully apparent immediately.
Concept redesign
Goals:
- Adding an `echo` or `log` statement does not require a far reaching type constraint addition.
- Avoid basing `concept`'s implementation on `system.compiles`, which is under-specified in Nim, very tied to the current implementation and, finally, slow.

Non goals:
- Generic code that leaves `T` underspecified; such under-constrained generics continue to exist.
- Requirements like ("this proc must use the `cdecl` calling convention!"); we can later add `enableif` for this without bloating the concept's design.

Atoms and containers
Concepts come in two forms: atoms and containers. A container is a generic concept like `Iterable[T]`; an atom always lacks any kind of generic parameter (as in `Comparable`).

Syntactically a concept consists of a list of `proc` and `iterator` declarations. There are 3 syntactic additions:
- `Self` is a builtin type within the concept's body that stands for the current concept.
- `each` is used to introduce a generic parameter `T` within the concept's body that is not listed within the concept's generic parameter list.
- `either orelse` is used to provide basic support for optional procs within a concept.

We will see how these are used in the examples.
Atoms

`Self` stands for the currently defined concept itself. It is used to avoid recursion; `proc cmp(a, b: Comparable): int` is invalid.
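(The RFC's code block was lost in extraction; judging from the atom quoted elsewhere in this thread, it presumably looked like this:)

type
  Comparable = concept # no T, an atom
    proc cmp(a, b: Self): int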
Containers

A container has at least one generic parameter (most often called `T`). The first syntactic usage of the generic parameter specifies how to infer and bind `T`. Other usages of `T` are then checked to match what it was bound to.
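(The code blocks of this section were lost in extraction; what follows are plausible sketches in the RFC's own notation, with illustrative names. Here the return type of `[]` is the first usage of `T` and drives the inference:)

type
  Indexable[T] = concept # a container with a single generic parameter
    proc `[]`(a: Self; index: int): T             # first usage: infers and binds T
    proc `[]=`(a: var Self; index: int; value: T) # must match the bound T
    proc len(a: Self): int

Nothing interesting happens when we use multiple generic parameters:

type
  Dictionary[K, V] = concept
    proc `[]`(a: Self; key: K): V
    proc `[]=`(a: var Self; key: K; value: V)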
The usual ": Constraint" syntax can be used to add generic constraints to the involved generic parameters:
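(Lost snippet; a sketch of the same `Dictionary`, assuming a `Hashable` concept is in scope:)

type
  Dictionary[K: Hashable; V] = concept
    proc `[]`(a: Self; key: K): V
    proc `[]=`(a: var Self; key: K; value: V)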
each T

Note: `each T` is currently not implemented.

`each T` allows one to introduce generic parameters that are not part of a concept's generic parameter list. It is furthermore a special case to allow for the common "every field has to fulfill property P" scenario:
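(Lost snippet; a hedged sketch of the notation only -- `Stream` and `store` are illustrative names. `each T` stands for the type of every field of the matching object:)

type
  Serializable = concept
    proc store(s: var Stream; field: each T)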
either orelse

Note: `either orelse` is currently not implemented.

In generic code it's often desirable to specialize the code in an ad-hoc manner. `system.addQuoted` is an example of this:
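(Lost snippet; judging from the version quoted later in this discussion, it was presumably:)

proc addQuoted[T](s: var string; x: T) =
  when compiles(s.add(x)):
    s.add(x)
  else:
    s.add($x)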
If we want to describe `T` with a concept we need some way to describe optional aspects. `either orelse` can be used:
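(Lost snippet; the `Quotable` concept quoted verbatim earlier in this thread, with `self` spelled `Self` per the 2021 edit below:)

type
  Quotable = concept
    either:
      proc `$`(x: Self): string
    orelse:
      proc add(s: var string; elem: Self)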
More examples

system.find

It's straight-forward:
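(Lost snippet; a plausible sketch of a concept for `system.find`, with `Findable` as an illustrative name:)

type
  Findable[T] = concept
    iterator items(s: Self): T
    proc `==`(a, b: T): bool

proc find[T](s: Findable[T]; elem: T): int =
  var i = 0
  for x in items(s):
    if x == elem: return i
    inc i
  result = -1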
Sortable

Note that a declaration like the following is possible but unwise:
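(Lost snippet; presumably a `sort` declared directly over the general `Indexable` container from above, roughly:)

proc sort[T](s: var Indexable[T])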
The reason is that `Indexable` either contains too many procs we don't need or accessors that are slightly off, as they don't offer the right kind of mutability access. Here is the proper definition:
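(Lost snippet; a plausible sketch requiring exactly the operations sorting needs, including mutable element access:)

type
  Sortable[T] = concept
    proc `[]`(s: var Self; index: int): var T # mutable access, for swapping
    proc len(s: Self): int
    proc `<`(a, b: T): bool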
Concept matching

A type `T` matches a concept `C` if every proc and iterator header `H` of `C` matches an entity `E` in the current scope.

The matching process is forgiving:
- If `H` is a `proc`, `E` can be a proc, a func, a method, a template, a converter or a macro. `E` can have more parameters than `H` as long as these parameters have default values. The parameter names do not have to match.
- If `H` has the form `proc p(x: Self): T` then `E` can be a public object field of name `p` and of type `T`.
- If `H` is an iterator, `E` must be an iterator too, but `E`'s parameter names do not have to match and it can have additional default parameters.

Escape hatch
Generic routines that have at least one concept parameter are type-checked at declaration time. To disable type-checking in certain code sections an `untyped` block can be used:
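(Lost snippet; a sketch of the described escape hatch, where `impl` is a placeholder for the unchecked code:)

proc sort[T](s: var Sortable[T]) =
  untyped:
    # this section is not type-checked at declaration time
    impl(s)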
EDITED 2021/03/09: `self` was renamed to `Self` and is what the experimental implementation uses.