Closed haxscramper closed 1 year ago
This is not an addition to RFC - just some ideas that might potentially be useful.
It is not uncommon to see a procedure implementation pattern where result is not explicitly initialized and is immediately used to append, set a field value, etc. This is fine most of the time, but when the type definition switches to ref (e.g. it was just Type = object and now it is Type = ref object) this can lead to annoying debugging where you have to track down every place where implicit initialization happened. This is a rare use case, but it does happen. `=init` could potentially make this a non-issue and further diminish the distinction between ref and non-ref types, which is in line with the already supported (experimental) automatic dereferencing.
If a ref variable really has to be nil, it might be better to explicitly initialize it as = nil; otherwise treat it as a regular variable and use the implicit initialization hook.
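To make the object-to-ref switch concrete, here is a minimal sketch. The type and proc names (Items, ItemsRef, fill, fillRef) are made up for illustration and do not come from any real codebase. With a plain object the implicit zero-initialized result can be used right away; with a ref object the same body would dereference nil unless new(result) is added by hand.

```nim
type
  Items = object        # plain object: zero-initialized `result` is usable
    values: seq[int]
  ItemsRef = ref object # ref object: `result` starts out as nil
    values: seq[int]

proc fill(): Items =
  # No explicit initialization needed, the seq starts out empty.
  result.values.add 1

proc fillRef(): ItemsRef =
  # This line has to be added in every such proc after switching to `ref`;
  # without it the next statement dereferences nil at runtime.
  new(result)
  result.values.add 1

var unset: ItemsRef     # implicit initialization of a ref gives nil
echo fill().values, " ", fillRef().values, " ", unset.isNil
```

The proposed hook would, in effect, let the type author supply the new(result) step once instead of in every proc.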
More on 'broken type system' - objects that have a non-trivial initial state (e.g. not just zero-filled memory) are more fragile when implicit initialization is not configurable - you must take care to use dedicated constructors all the time, even in situations like var obj: Obj.
Another (mostly theoretical) idea is that it might be possible to automatically add finalizers for ref objects if they are created using `=init`
regardless of GC algorithm used. Something like
proc `=init`(v: ref var T) = new(v, final)
How would exceptions be handled?
If you mean exceptions in the `=init` hook, the answer is - I don't think any specific handling is necessary, since value initialization should happen in the same scope as object construction, immediately after the var declaration, which means we either get a correctly initialized object (if no exception is raised) or an exception is raised - no half-initialized objects, if that is what you mean.
Although I'm not sure if I understand what exact scenario you have in mind - if you could elaborate on your question I might provide better answer if possible.
Just like we require =destroy to be plain object, we can enforce =init to not throw. And in C++ AFAIK it's undefined behavior to throw in a constructor.
proc `=init`(v: ref var T) {.raises: [].}
And in C++ AFAIK it's undefined behavior to throw in a constructor.
Pretty sure it's supported and partially constructed objects are deconstructed properly. Looks super expensive to implement (like everything else in C++ I guess).
-1 from me. First of all construction is very different from destruction, constructors take parameters in most languages and the problem is worse when "size hints" optimizations enter the picture: A size hint should be attached to an object, not to an object's type.
Furthermore the mechanism will soon be misused to avoid the initT, newT idiom even though it's strictly less flexible than custom constructor procs; see the "factory" pattern and how C++ got make_shared and make_unique even though C++ does have very good support for constructors. There is a lesson to be learned here.
The route forward IMHO is to allow default values inside object declarations with the restriction that the value has to be a compile-time value. For multiple reasons:
It prevents id: int = generateUniqueId(), which is spooky action at a distance. Side effects should not be hidden.
.requiresInit remains obvious.
Main point is - with constexpr as default values there is no way to execute code when implicit initialization happens. Yes, in the overwhelming majority of use cases constexpr is more than enough, but this route completely closes the way for non-trivial logic in implicit initialization, which might be necessary in some cases.
It is possible to place additional restrictions on the =init procedure prototype, such as .raises[] and .noSideEffect., although the latter makes =init almost indistinguishable from constexpr fields.
even though it's strictly less flexible than custom constructor procs
I would argue that =init being explicitly less flexible is a good thing since it prevents misuse.
mechanism will soon be misused to avoid the initT, newT idiom
Again - since there is no support for parameters in =init I don't see how it would affect existing idioms in most cases. In addition - initT just looks better, is more logical and is used often. I don't think =init will "soon" be misused to avoid initT.
First of all construction is very different from destruction, constructors take parameters in most languages
Again - this is not about explicit constructors - we already have them (initT and newT) and they fit quite nicely into the language. This is only about being able to configure implicit object instantiation.
It prevents id: int = generateUniqueId(), which is spooky action at a distance. Side effects should not be hidden.
I'm sorry, but I don't follow how this would prevent it. If you mean let id: int = ... then no init call should be generated in this case, since explicit initialization happened. In the case of a unique id for each object instance this would only help, since
type
  Obj = object
    id: int
proc initObj(): Obj = Obj(id: generateUniqueId())
proc `=init`(obj: var Obj) = obj = initObj()
allows including Obj in different structures without worrying about correct implicit initialization of all subfields.
If this is the 'misuse' you were talking about, I think it is necessary to have some way to configure this behavior and cut the chain of "if A includes B, I must initialize A correctly using initB", which basically stretches from the initial type to infinity now. With initT, responsibility for correct initialization is pushed to all potential users of a type, again and again, potentially breaking adjacent layers of abstraction (each user of Obj must be aware that it is important to construct it using initT and is responsible for making this knowledge available to the next abstraction layer, via documentation or .requiresinit.). constexpr default values partially mitigate this issue, but by definition (compile-time evaluation) they fail to address cases like generateUniqueId().
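The chain of responsibility can be sketched with today's language. Wrapper, initWrapper and the counter-based body of generateUniqueId are assumptions made up for this illustration; only Obj and initObj come from the snippet above. Every type that embeds Obj needs its own constructor that remembers to call initObj, while plain implicit initialization silently produces a zero id.

```nim
var counter = 0

proc generateUniqueId(): int =
  # Hypothetical id source, assumed here only for the example.
  inc counter
  result = counter

type
  Obj = object
    id: int
  Wrapper = object      # hypothetical type that embeds Obj
    inner: Obj

proc initObj(): Obj = Obj(id: generateUniqueId())

# The author of Wrapper has to know that Obj needs initObj and forward it.
proc initWrapper(): Wrapper = Wrapper(inner: initObj())

var implicitW: Wrapper          # zero-filled, no id was generated
let explicitW = initWrapper()   # goes through the whole constructor chain

echo implicitW.inner.id, " ", explicitW.inner.id
```

With a configurable implicit initialization, the author of Wrapper would not need to know anything about how Obj is constructed.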
This is basically the same as .requiresinit. - yes, it is possible to use, and I would even argue it is a great tool, but you are creating a responsibility for all potential users.
This problem is quite nicely illustrated by mapIt's inability to deal with .requiresinit. types - even though it only uses it to get the expression type, it is still not possible. Using initT is not possible because it would require passing a constructor proc as a parameter everywhere necessary. Constexpr types could solve the issue, but as already mentioned this is too restrictive a solution.
Again - since there is no support for parameters in =init I don't see how it would affect existing idioms in most cases. In addition -initT just looks better, more logical and often used. I don't think =init will "soon" be misused to avoid initT.
Ok, this wasn't clear to me before, thanks!
But then your proposal is mostly a different syntax for field = defaultValue (my context here is an object declaration). Syntax aside, there is one difference: you seek to allow arbitrary expressions, I really want to restrict it to constant expressions. If we start with the restrictive version, we can always make it less strict in later versions. The same is not so easy for the reverse case: allow everything and soon enough somebody will rely on it.
This problem is quite nicely illustrated by mapIt's inability to deal with .requiresinit. types - even though it uses it to get expression type it is still not possible.
I think that's a problem that can be solved by special casing typeof even further.
Yes, exactly. I think that arbitrary expressions might be necessary in some cases, but I agree that it is not possible to make things more strict later, so starting with constexpr and potentially expanding into =init is a good solution.
mapIt is relatively easy to work around - you can just use a (not really pretty) hack - typeof((var tmp: ref InType; var it {.inject.} = tmp[]; op)) which is not valid at runtime but works fine in most cases.
So can we agree on supporting it in this way:
type
  StartWith1 = object
    x: int = 1
?
Yes. It covers main concerns about type guarantees invalidation (which is really important) and other complex cases of default initialization would be nice to support, but not right now at least.
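For reference, a minimal sketch of how the agreed-upon form can be exercised. It assumes a compiler that implements constant default field values (the Nim 2.0 manual documents them for object construction expressions and for default); on older compilers the type section below is rejected.

```nim
type
  StartWith1 = object
    x: int = 1   # default must be a compile-time constant

# Assumption: a Nim version with default field values (2.0 or later).
let viaConstructor = StartWith1()       # object construction expression
let viaDefault = default(StartWith1)    # explicit call to default
let overridden = StartWith1(x: 5)       # an explicit value still wins

echo viaConstructor.x, " ", viaDefault.x, " ", overridden.x
```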
@araq it's not entirely clear what proposal led to "Accepted RFC", is it the following:
let a3 = 3
type
  Foo = object
    x1: int = 1 # ok
    x2 = 2 # ok, type inference allowed in initializer
    x3: int = a3 # CT error, field initializer must be const
? if so, then +1
it would currently prevent initializers that are ref/ptr/pointer:
type Bar = ref object
  b0: int
type Foo = object
  b: Bar(b0: 1) # error: initializer is a ref and can't be const
EDIT: this restriction could be lifted by allowing const ref objects, by accepting https://github.com/nim-lang/Nim/pull/15528 (see also https://github.com/nim-lang/RFCs/issues/126#issuecomment-616135556)
this caveat applies:
type Foo = object
  x1: cstring = "abc"
var a = Foo()
a.x1[0] = 'A' # SIGBUS
see https://github.com/nim-lang/RFCs/issues/126#issuecomment-616135556 for a more detailed proposal that also covers:
`var a: T` # always equivalent to `var a = default(T)`
# `default(T)` is defined recursively in the obvious way, taking into account default initializers for object types, e.g. see the example provided there
it's not entirely clear what proposal led to "Accepted RFC", is it the following: ...
Yes.
this caveat applies:
well var a = Foo(x1: "abc") has the same problem, nothing changed.
The problem I see with @Araq syntax
type
  StartWith1 = object
    x: int = 1
is that it works only for object initialization. You can't use it with other types like
type
  Ranged = range[10 .. 20] # I would like to have 10 as default
  BoolTrueDefault = bool # This type of bool should default to 'true'
  Constraint[T] = object
    c: T # When implementation is delegated to another client module,
         # default initialization should be too.
The first two types here are not distinct, so they should follow regular initialization logic (e.g. false for bool and 0 for range[10 .. 20]). The range example does show, though, how easy it is to break all guarantees with zeroed default values ((var r: range[10 .. 20]; echo r) gives 0).
Default initialization of distinct types is also an important case to consider, but I just can't see how this can be added to the type definition syntax. In objects, a value for fld: type = val was just explicitly prohibited, so it is easy to relax the syntax checking, but for distinct types there is just no place for it. So for cases like type Hp = distinct range[0 .. 100] you need to have `=init`(hp: var Hp) = hp = Hp(100) or something similar.
Constraint[T] is just a chain of responsibility, "who needs to initialize what", and I don't think it can be solved without =init or constructor procs.
The problem I see with @Araq syntax Default initialization of distinct types is also an important case to consider, but I just can't see how this can be added in type definition syntax.
that's a separate problem, see my proposal for this here: https://github.com/nim-lang/RFCs/issues/290
can we close this in favor of https://github.com/nim-lang/RFCs/issues/126, given that the title, and description of https://github.com/nim-lang/RFCs/issues/126 is much closer the actual accepted RFC https://github.com/nim-lang/RFCs/issues/252#issuecomment-705113779 (https://github.com/nim-lang/RFCs/issues/126 just needs to be modified to mention that initializers must be evaluatable at CT instead of RT) and then mark the other one as accepted;
alternatively https://github.com/nim-lang/RFCs/issues/126 should be closed to avoid duplicate
construction is very different from destruction
In my opinion it makes sense if the goal is to prevent invalid states. Destruction turns a value from a valid state to an invalid state, initialization turns it from an invalid state to a valid state. Optimizations like noinit would prevent a call to =init. Maybe there's a way to turn "zero values" into a compile time construct, while turning "runtime memory zeroing" into a type-bound operation. JS codegen already special cases "runtime memory zeroing" for each mappable type. Not sure how these constructs would interact.
Would it also be possible for this RFC to support tuples? I didn't see an example using them yet:
type
  StartWith1 = tuple
    x: int = 1
  StartWith2 = tuple[y: string = "2"]
Would it also be possible for this RFC to support tuples?
Tuples are different from objects in that every tuple with the same fields is the same type. So defining a custom initialization hook for a tuple type would affect all tuples of that type, no matter where they're defined. Your example seems more related to https://github.com/nim-lang/RFCs/issues/126, as it doesn't involve defining a =init hook.
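The structural typing of tuples can be checked directly. In this small sketch the names PointA and PointB are made up for illustration; two separately declared tuple types with identical fields turn out to be the same type, which is why a hook attached to one of them would apply to the other as well.

```nim
type
  PointA = tuple[x: int, y: string]
  PointB = tuple[x: int, y: string]  # declared separately, same fields

let a: PointA = (x: 1, y: "one")
let b: PointB = a      # no conversion needed: the types are equivalent

# Both aliases and the anonymous tuple type are interchangeable.
echo PointA is PointB, " ", b is tuple[x: int, y: string]
```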
User-defined implicit initialization
This RFC mostly reiterates ideas from #48, #126, #233
Add support for user-defined implicit initialization hook with following prototype:
Is this needed?
Existing proposals
There have been several RFCs related to default initialization/implicit construction for user-defined types.
Existing compiler warnings
The Nim compiler already provides two warnings directly related to default initialization and three more related to initialization in general, for a total of five initialization-related diagnostics, which suggests there is at least some interest in correct initialization behavior:
UnsafeSetLen - "setLen can potentially expand the sequence, but the element type '$1' doesn't have a valid default value"
UnsafeDefault - "The '$1' type doesn't have a valid default value"
ProveInit - "Cannot prove that '$1' is initialized. This will become a compile time error in the future."
ProveField - "cannot prove that field '$1' is accessible"
ProveIndex - "cannot prove index '$1' is valid"
{.requiresinit.}
A separate pragma, {.requiresinit.}, completely prevents implicit default initialization. It is used really infrequently (only 126 times in 1340 packages - approximately 90% of the packages I checked haven't used it even once).
It is not possible to contain the effects of requiresinit - once added, it affects all code that uses a type with annotated fields. It also affects templates that rely on type Res = typeof((var it {.inject.}; op)) to determine the type of an expression (right now almost none of the *It templates can deal with these types).
Why this is needed?
Broken type system
As mentioned in these comments by @timotheecour, a large portion of type safety guarantees is invalidated - enums with an offset and ranges can't really guarantee anything unless explicitly created with initT. Any kind of value that has a non-zero default requires special attention - it is now your responsibility to make sure this -1-as-default-value is actually used. {.requiresinit.} is a solution, but as already mentioned it propagates through the whole codebase, requiring far-reaching modifications.
NOTE: I personally think that {.requiresinit.} is a great way to explicitly declare requirements and enforce them via compiler diagnostics. The only drawback is that it is really viral and has to be worked around in some cases (the typeof pattern can just be written as var tmp: ref InType; var it {.inject.} = tmp[]; op).
`=destroy` confusion
It is possible to have a specific destruction hook bound to a particular type, and you can write an initT proc as a user-defined constructor, but when it comes to default initialization everything is just filled with zeros and that's it. It is also possible to completely forbid implicit initialization, but not to configure it. I find this rather confusing and counter-intuitive.
A large number of popular imperative/OOP programming languages provide a way to customize default values. Out of all the languages mentioned in nim for X programmers on the wiki, only C lacks this feature.
constructor keyword
Default::default()
Other concerns
RFC #126 (Support default values for object properties) suggests implementing default value initialization in form of
Which can be implemented using a macro (see forum thread), so it is not necessary to add this to the language core. If one wishes, they can use a macro to automatically declare the `=init` hook. This is already possible for explicit initialization with initT procs, but default initialization is not currently configurable.
Possible implementation behavior
Similar to how `=destroy` is handled
If a type does not have a user-defined `=init` then no injection shall happen. If any of the fields have initialization declared then default initialization in form of
is implicitly declared recursively. If a field has type range or enum for which low(Enum).int != 0 or low(range[..]) != 0 then `=init` is implicitly declared too.
Object construction syntax. If a field is not initialized by the user explicitly and the field type has `=init` declared, the field should be implicitly initialized. If forced explicit initialization is necessary then {.requiresinit.} can be used on the object field.
NOTE: {.requiresinit.} already uses similar logic - if a type's field cannot be default-initialized then none of the objects containing a field of this type can be default-initialized either.