Type checking for nil dereference: nilable and not-nilable types
Abstract
Check nilability of ref-like types and make dereferencing safer based on flow typing and not nil annottions.
Note
Note: the original version of this RFC was similar but it made not nil a default: a version of it can be found in the file of the PR https://github.com/nim-lang/RFCs/pull/169 and please also take a look at history of the RFC section. All of the github comments and reactions there are for the first RFC version.
Motivation
Null errors are ones of the most common ones.
Modern languages strive to either use Options/similar types or to at least make handling null values safer.
Compile time checking is useful to prevent null dereference before running the program.
In Nim we can try to do that in many situations for ref-like types. However we need to preserve compatibility
for some time, so we can start with having warnings which can be turned into errors by the user.
Description
Nim already has some not nil support, but the plain is to use here a different version of it.
Ref types are nil by default: this means they can include nil by default.
They can be annotated to forbid including nil with the not nil annotation.
We use flow-sensitive typing to check nilability.
type
NilableObject = ref object
a: int
Object = NilableObject not nil
Proc = (proc (x, y: int))
proc p(x: Object) =
echo x.a # ensured to dereference without an error
# compiler catches this:
p(nil)
# and also this:
var x: NilableObject
p(x)
If a type can include nil as a valid value, dereferencing values of the type
is checked by the compiler: if a value which might be nil is derefenced, this produces a warning by default, you can turn this into an error using the compiler options. TODO
You can still turn off nil checking on function/module level by using a {.nilCheck: off}. pragma.
If a type is nilable, you should dereference its values only after a isNil or equivalent check, e.g.:
proc p(x: NilableObject) =
if not x.isNil:
echo x.a
# equivalent
if x != nil:
echo x.a
p(x)
Safe dereferencing can be done only on certain locations:
var local variables
let variables
arguments
Dereferencing operations: look at [Reference and pointer types], for procedures: calling is such an operation.
It's enough to ensure that a value is not nil in a certain branch, to dereference it safely there: the language recognizes such checks
in if, while, case, and, or
e.g.
not nilable.isNil and nilable.a > 0
is fine.
case can be used as well
case a.isNil:
of true:
echo a.a # error
of false:
echo 0
However, certain constructs invalidate the value not-nil-ness.
calls to functions where the location we check is passed by var
reassignments of the checked location
if not nilable.isNil:
nilable.a = 5 # OK
var other = 7 # OK
echo nilable.a # OK
call() # maybe sets nilable to `nil`?
echo nilable.a # warning/error: `nilable` might be nil
If we do a check in a e.g. if, the other branches (e.g. else) assume the opposite fact about the nilability of a value.
if a.isNil:
echo 0
else: # a is not nilable
echo a.a
Additional check is that the return value is also not nil, if that's expected by the return type
proc p(a: Nilable): Nilable not nil =
if not a.isNil:
result = a # OK
result = a # warning/error
Early return after nil check is ok: the behavior is the same as if the remaining code was in else
if a.isNil:
return
a[] # ok
When two branches "join", a location is still safe to dererence, if it was not-nilable in the end of both branches, e.g.
if a.isNil:
a = Object()
else:
echo a.a
# here a is safe to dereference
Dereferencing compound expressions
We should try to support more expressions: maybe most common are dot expressions , a.b.c type of expressions as well.
if not a.b.c.isNil:
echo a.b.c.field # OK
We need to ensure the expression hasn't been changed meanwhile: so we should detect invalidation of
the non-nilability on
reassignment to any of the subexpressions
calls (except func calls which don't affect it? can we ensure no aliasing?) TODO
Supporting bracket expressions a[b] would be useful as well with similar limitations.
Initialization of non nilable pointers
The compiler ensures that every code path initializes variables which contain
non nilable pointers.
if a value of a type can't be implicitly initialized, it should be constructed directly with explicitly filling the required not nil fields
no implicit initialization for object types is lifted from their fields
the compiler proves that each path in a proc sets result if there is not nil return type
It seems this is already done in the compiler using tfRequiresInit ?
Not nil ref default values and usage in collections
Default values of ref T not nil is nil. This can lead to holes in the type safety: that's why we still have to handle some cases.
seq[T] where T is ref and not nil are an interesing edge case: they are supported with some limitations.
They can be created with only some overloads of newSeq:
newSeq(length): default for ref T not nil returns nil, so the programmer is responsible to fill correctly the sequence. However this should produce a warning (which can be turned off).
However this should be used only in edge cases.
There is special treatment of setLen related functions as well: one can use shrink in all cases.
However one can use grow similarly to newSeq :
grow(length): calls default: expects that the programmer fills the new elements with non nil values manually.
Similar logic applies to arrays, tables and other collections that can use default values.
newSeq , setLen, grow should produce a warning for ret T not nil which can be turned off
default for ref T not nil should also produce a warning, and maybe some way to infer that for the instantiations with good error stack would be best
Implementation
We plan on using the AST for now. We should keep track of nilability(possibly with additional state) and "dependencies" of compound expressions.
Backward incompatibility
We don't show errors by default to be backward compatible.
However this might be a good idea in Nim 2.0 .
We should use warnings which can be disabled. Old programs with those warnings should keep compiling by default.
Future development
Eventually we can extend this mechanism to types represented by finite sets : e.g. enums and base variant support on the enum check. I might write a RFC for that, but it's possible that such a feature would use the z3 engine support or at least the requires pragmas.
History of the RFC
A version of the first variant of the RFC issue included this text:
WIP for nilability spec for Nim : typechecking ref-s: please read the spec in the nilcheck.rst file of the pull request(it's rendered better at https://github.com/alehander42/RFCs/blob/master/RFCs/nilcheck.rst ), and comment somewhere here on the proposed syntax/semantics
ideas that probably won't make it through: https://github.com/alehander42/Nim/blob/nilcheck-spec/doc/nilcheck_maybe.rst
based on irc conversations with Araq (and a previous AST-based implementation PR inspired by conversations with zah and Herb Sutter's C++ related Lifetime paper, this is much more limited tho: just focused on nilability and ref types)
Based on irc conversations with Araq, conversations with zah and Herb Sutter's C++ related Lifetime paper(this is much more limited tho: just focused on nilability and ref types), and experience with this feature in other languages (e.g. Typescript, C#, Zig and others).
Type checking for nil dereference: nilable and not-nilable types
Abstract
Check nilability of ref-like types and make dereferencing safer based on flow typing and
not nil
annottions.Note
Note: the original version of this RFC was similar but it made
not nil
a default: a version of it can be found in the file of the PR https://github.com/nim-lang/RFCs/pull/169 and please also take a look athistory of the RFC
section. All of the github comments and reactions there are for the first RFC version.Motivation
Null errors are ones of the most common ones. Modern languages strive to either use Options/similar types or to at least make handling null values safer. Compile time checking is useful to prevent null dereference before running the program.
In Nim we can try to do that in many situations for ref-like types. However we need to preserve compatibility for some time, so we can start with having warnings which can be turned into errors by the user.
Description
Nim already has some
not nil
support, but the plain is to use here a different version of it.Ref types are
nil
by default: this means they can includenil
by default. They can be annotated to forbid includingnil
with thenot nil
annotation.We use flow-sensitive typing to check nilability.
If a type can include
nil
as a valid value, dereferencing values of the type is checked by the compiler: if a value which might be nil is derefenced, this produces a warning by default, you can turn this into an error using the compiler options. TODOYou can still turn off nil checking on function/module level by using a
{.nilCheck: off}.
pragma.If a type is nilable, you should dereference its values only after a
isNil
or equivalent check, e.g.:Safe dereferencing can be done only on certain locations:
var
local variableslet
variablesDereferencing operations: look at [Reference and pointer types], for procedures: calling is such an operation.
It's enough to ensure that a value is not nil in a certain branch, to dereference it safely there: the language recognizes such checks in
if
,while
,case
,and
,or
e.g.
is fine.
case
can be used as wellHowever, certain constructs invalidate the value
not-nil
-ness.If we do a check in a e.g.
if
, the other branches (e.g.else
) assume the opposite fact about the nilability of a value.Additional check is that the return value is also
not nil
, if that's expected by the return typeEarly return after nil check is ok: the behavior is the same as if the remaining code was in else
When two branches "join", a location is still safe to dererence, if it was not-nilable in the end of both branches, e.g.
Dereferencing compound expressions
We should try to support more expressions: maybe most common are dot expressions ,
a.b.c
type of expressions as well.We need to ensure the expression hasn't been changed meanwhile: so we should detect invalidation of the non-nilability on
Supporting bracket expressions
a[b]
would be useful as well with similar limitations.Initialization of non nilable pointers
The compiler ensures that every code path initializes variables which contain non nilable pointers.
not nil
fieldsno implicit initialization
for object types is lifted from their fieldsnot nil
return typeIt seems this is already done in the compiler using
tfRequiresInit
?Not nil ref default values and usage in collections
Default values of
ref T not nil
isnil
. This can lead to holes in the type safety: that's why we still have to handle some cases.seq[T]
whereT
isref
andnot nil
are an interesing edge case: they are supported with some limitations.They can be created with only some overloads of
newSeq
:newSeq(length)
:default
forref T not nil
returnsnil
, so the programmer is responsible to fill correctly the sequence. However this should produce a warning (which can be turned off).However this should be used only in edge cases.
There is special treatment of
setLen
related functions as well: one can useshrink
in all cases. However one can usegrow
similarly tonewSeq
:grow(length)
: callsdefault
: expects that the programmer fills the new elements with non nil values manually.Similar logic applies to arrays, tables and other collections that can use default values.
newSeq
,setLen
,grow
should produce a warning forret T not nil
which can be turned offdefault
forref T not nil
should also produce a warning, and maybe some way to infer that for the instantiations with good error stack would be bestImplementation
We plan on using the AST for now. We should keep track of nilability(possibly with additional state) and "dependencies" of compound expressions.
Backward incompatibility
We don't show errors by default to be backward compatible. However this might be a good idea in Nim 2.0 .
We should use warnings which can be disabled. Old programs with those warnings should keep compiling by default.
Future development
Eventually we can extend this mechanism to types represented by finite sets : e.g. enums and base variant support on the enum check. I might write a RFC for that, but it's possible that such a feature would use the z3 engine support or at least the
requires
pragmas.History of the RFC
A version of the first variant of the RFC issue included this text:
and the text of the RFC as a file in the PR https://github.com/nim-lang/RFCs/pull/169
Credits
Based on irc conversations with Araq, conversations with zah and Herb Sutter's C++ related Lifetime paper(this is much more limited tho: just focused on nilability and ref types), and experience with this feature in other languages (e.g. Typescript, C#, Zig and others).