Closed Araq closed 3 years ago
`Isolated` itself is a good idea, but restricting channels to `isolated` is a bad idea. We should be able to use a channel not only to move data between threads, but also to share data among threads. If a channel only accepts `isolated`, we cannot send immutable, read-only data that can be read by many threads. "Sendable" should also include `immutable` besides `isolated`.
But you can do that via atomic refcounting.
> If T does not contain a ref or closure type, it is isolated.

The implementation of an atomic pointer uses `ptr`, not `ref`, so it does count as `Isolated`.
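A minimal sketch of that point, assuming a Nim version that ships `std/isolation`. The `Counter` and `SharedPtr` types are hypothetical stand-ins for a hand-written atomic pointer; they are not part of any library. Because `SharedPtr` holds a `ptr` rather than a `ref`, the first rule of the proposal applies and `isolate` accepts it without further analysis:

```nim
import std/isolation

type
  Counter = object        # hypothetical manually managed payload
    refs: int
    value: int
  SharedPtr = object      # hypothetical handle: holds a `ptr`, not a `ref`
    p: ptr Counter

var cell = Counter(refs: 1, value: 42)
let handle = SharedPtr(p: addr cell)

# Accepted: SharedPtr contains no `ref` or closure type.
var iso = isolate(handle)
let back = extract(iso)
doAssert back.p.value == 42
```

The compile-time check does not look behind the `ptr`, so keeping the pointee safe to share remains the job of the pointer implementation.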
Just bikeshedding: The name `recover` seems confusing at first glance. What is it recovering, exactly? `isolate` would play nicely with the type name.

> Just bikeshedding: The name recover seems confusing on first glance, what is it recovering exactly? isolate would play nice with the type name.

Well, `recover` has been borrowed from Pony. The function name is disturbing, though: it suggests that something needs to be repaired. Indeed, `isolate` explains it better. BTW, I propose to shorten `Isolated` a bit, simply to `iso`. (Another borrowing from Pony; I frequently used it in the Nim forum, and others did the same.)

In the implementation I used the name `isolate`.
> BTW, I propose to shorten Isolated a bit, simply to iso. (Another borrowing from pony, I frequently used it in the Nim forum, others did the same.... )

I would like to push back against this. Let's make our code readable instead of easy to write.

> > BTW, I propose to shorten Isolated a bit, simply to iso. (Another borrowing from pony, I frequently used it in the Nim forum, others did the same.... )
>
> I would like to push back against this. Let's make our code readable instead of easy to write.

Plus, we have type aliases. If you have channel-heavy code, you can just do `type Iso = Isolated`.
"Iso" by itself merely means "equal", which is rather confusing to newcomers.
> Plus, we have type aliases. If you have channel-heavy code, you can just do type Iso = Isolated. "Iso" by itself merely means "equal", which is rather confusing to newcomers.
Well, I followed the Pony approach. Honestly, the `iso` keyword that comes with the language was the thing that confused me least about Pony. I can offer `uno` instead: `unique node pointer`, a pointer that may not be shared (where "node" is any memory structure on the heap that can hold refs of its own).
There is another point. `iso` can be seen as a primitive boxing of `ref`: boxing without runtime costs (it's not Java auto-boxing, beware). It indicates, more or less, the intention of the programmer. The compiler can then infer whether the `iso` property is statically maintained or not, allowing for very basic operations at the moment, e.g. updates of values like float64. If the compiler can't prove it, it will reset the `iso` to `ref`. It is very similar to the `was moved` property. If a proc then demands `iso`, the compiler will coerce automatically, inserting a runtime check with `isolate`, or the programmer can do it explicitly. The one-dollar question is: what is the appropriate behavior if `isolate` fails? In specific cases, an implicit copy could be an option.

"Isolated" is an encapsulating type. Nothing wrong with it, but there will be more than one `Isolated`, because there are many concepts around. If there were already a stable concept of "Isolated", we would find it in almost all established languages. So fine-graining of "Isolated" should be possible; it should be kept open for individual demands and solutions. No perfect world.

For the moment, `iso` is a simple extension of `ref`, only a single bit away from it. The runtime coercion will do the trick.
I expect the name `Isolated[T]` not to be used often enough to justify a shorter name. `template send*[T](c: var Channel[T]; msg: sink T)` already hides it successfully. It's time to play with the implementation.
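To illustrate how such a template can hide the wrapper, here is a small single-threaded sketch. `ToyChannel` is a hypothetical stand-in backed by a `seq`, not Nim's real channel type, and the sketch assumes ARC/ORC move semantics so that the `sink` parameter is moved rather than copied:

```nim
import std/isolation

type
  ToyChannel[T] = object   # hypothetical stand-in, not a real channel
    queue: seq[T]

proc send[T](c: var ToyChannel[T]; msg: sink Isolated[T]) =
  # Only isolated values are accepted; unwrap and store the payload.
  var m = msg
  c.queue.add extract(m)

template send[T](c: var ToyChannel[T]; msg: T) =
  # Sugar: wrap the argument with `isolate`, which performs the compile-time check.
  send(c, isolate(msg))

var chan: ToyChannel[string]
chan.send("hello")            # goes through the template
chan.send(isolate("world"))   # explicit form, goes straight to the proc
doAssert chan.queue == @["hello", "world"]
```

Callers that pass an expression the checker accepts never have to write `Isolated[T]` or `isolate` themselves.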
Spec Defect: f must be .noSideEffect, not .gcsafe! See bug #18326
Here is another issue that I think I encountered. Given a COW (copy-on-write) string implementation, passing a string to another thread via `isolate` does not enforce uniqueness:

```nim
import std/isolation
# `String` is the COW string type from cowstrings.nim (linked below).

var a: Isolated[String]
var b: String
b.add 'w'
b.add 'o'
a = isolate b # wrong! needs to perform a deepcopy
#b.add 'r'
let c = extract a
echo cast[ByteAddress](c.p)
echo cast[ByteAddress](b.p) # the addresses are the same
```

Implementation: https://github.com/planetis-m/dumpster/blob/master/cowstrings.nim

Maybe an `isolate` constructor that does `=deepcopy` is needed?
This is an alternative that works:
```nim
func isolate(value: String): Isolated[String] =
  var a: String
  # do a deepCopy here
  result = unsafeIsolate a
```
Or maybe override `isolate` and `unsafeIsolate` for `String`?

> Or maybe override isolate and unsafeIsolate for String?

Yes, that's what I did, except for `unsafeIsolate`, since I have no other way of constructing an `Isolated[String]`.
> `f`'s return type cannot alias `x`'s type. This is checked via a form of alias analysis as explained in the next paragraph.

What if `f`'s return type can't alias `x`'s type, but can alias a subgraph of `x`? Don't you have to check that none of the deeply nested `ref`s that could be held within `x` can make their way into the return type?

> Don't you have to check that none of the deeply nested `ref`s that could be held within `x` can make their way into the return type?

Yes, indeed and I hope the implementation does that. :-)
I don't think it does:
```nim
import std/isolation

type Z = ref object
  i: int

type A = object
  z: Z

type B = object
  z: Z

func a_to_b(a: A): B =
  result = B(z: a.z)

let a = A(z: Z(i: 3))
let b = isolate(a_to_b(a))
echo repr(b) # [value = [z = ref 0x7f8650b6b050 --> [i = 3]]]
inc a.z.i
echo repr(b) # [value = [z = ref 0x7f8650b6b050 --> [i = 4]]]
```
Even beyond overlapping subgraphs, it doesn't look like returning an entire ref subgraph is detected?
```nim
import std/isolation

type Z = ref object
  i: int

type A = object
  z: Z

func a_to_z(a: A): Z =
  result = a.z

let a = A(z: Z(i: 3))
let z = isolate(a_to_z(a))
echo repr(z) # [value = ref 0x7fca2c6d9050 --> [i = 3]]
inc a.z.i
echo repr(z) # [value = ref 0x7fca2c6d9050 --> [i = 4]]
```
Only the reverse case seems to work:
```nim
import std/isolation

type Z = ref object
  i: int

type A = object
  z: Z

func z_to_a(z: Z): A =
  result = A(z: z)

let z = Z(i: 3)
let a = isolate(z_to_a(z)) # Error: expression cannot be isolated: z_to_a(z)
```
Please report this as a bug on Nim's issue tracker.
Done ↑
Separately, I think it would be nice if there was a version of this function that created fresh copies of all `ref`s with refcount > 1. (In fact, maybe that should be the default behavior?)

> Separately, I think it would be nice if there was a version of this function that created fresh copies of all refs with refcount > 1. (In fact maybe that should be the default behavior?)

Variations of this idea have been considered, but it changes the behavior from O(1) to O(n), and implicit copies of `ref`s are too confusing in practice. That is why we moved away from deepCopying things.
Can't we have an additional function that does this copy optionally? Presumably a macro could statically write the sequence of refcount checks that must be performed and if they are self-contained no copying is required?
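As a rough illustration of what an explicit, opt-in copy could look like today, the sketch below writes the copy by hand for the `A`/`Z` types used earlier in this thread and wraps the result with `unsafeIsolate`. The helper `freshCopy` is hypothetical; it always copies (O(n)) and does not perform the refcount checks suggested above:

```nim
import std/isolation

type
  Z = ref object
    i: int
  A = object
    z: Z

func freshCopy(a: A): A =
  # Allocate a new Z so the result shares nothing with `a`.
  A(z: Z(i: a.z.i))

let a = A(z: Z(i: 3))
# `unsafeIsolate` skips the compile-time check; isolation holds here only
# because freshCopy allocates every ref it returns.
var b = unsafeIsolate(freshCopy(a))
inc a.z.i
let fresh = extract(b)
doAssert fresh.z.i == 3   # the copy is unaffected by the later mutation
doAssert a.z.i == 4
```

The cost is explicit at the call site, which sidesteps the concern about implicit copies, but correctness rests entirely on the hand-written copy being deep enough.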
Proposal to add the concept of 'isolated' data to Nim

This is the evolution of `owned ref`. A new name, `Isolated`, was chosen in order to avoid confusion and also because it can be done almost entirely as a library without bloating Nim's core type system further.

Motivation

We want to be able to pass subgraphs to threads, safely and easily. All data races should be prevented at compile-time. We are not willing to pay the price of atomic reference counting in order to do so. We also seek to avoid Rust's and C++'s many different pointer types. Nim uses `ref` everywhere and `ref` should remain the default pointer type to use. It is safe, efficient and Nim's optimizers understand it well.

Yet, `ref` has no concept of unique ownership which is required for effective message passing without copies. Hence we wrap it inside an `Isolated[T]`:

`Isolated[T]` is what a channel should use, comparable to Rust's "sendable" trait:

How to construct isolated subgraphs

Construction must ensure that the invariant holds, namely that the wrapped `T` is free of external aliases into it. To ensure it, we propose a Pony-inspired `recover` construct, but named `isolate` for clarity:

As you can see, this is a new builtin because the check it performs on `x` is non-trivial:

If `T` does not contain a `ref` or `closure` type, it is isolated. Else the syntactic structure of `x` is analyzed:

- Literals such as `nil`, `4`, `"abc"` are isolated.
- `[x...]` is isolated if every element `x` is isolated.
- `Obj(x...)` is isolated if every element `x` is isolated.
- An `if` or `case` expression is isolated if all possible values the expression may return are isolated.
- `C(x)` is isolated if `x` is isolated. Analogous for `cast` expressions.
- `x.field` is isolated if `x` is isolated.
- `x[i]` is not isolated even if `x` is isolated, as otherwise code like `a.send x[i]; b.send x[i]` (send to two different channels) might compile.
- `f(x...)` is isolated if `f` is `.noSideEffect` and for every argument `x`:
  - `x` is isolated or
  - `f`'s return type cannot alias `x`'s type. This is checked via a form of alias analysis as explained in the next paragraph.

Note: Previously the spec said that `f` must be `.gcsafe`. This is not sufficient, as we cannot guarantee isolation for a `.threadvar` location.

Alias analysis

We start with an important, simple case that must be valid: Sending the result of `parseJson` to a channel. Since the signature is `func parseJson(input: string): JsonNode` it is easy to see that JsonNode can never simply be a view into `input` which is a `string`.

A different case is the identity function `id`: `send id(myJsonGraph)` must be invalid because we do not know how many aliases into `myJsonGraph` exist elsewhere.

In general type `A` can alias type `T` if:

- `A` and `T` are the same types.
- `A` is a distinct type derived from `T`.
- `A` is a field inside `T` if `T` is a final object type.
- `T` is an inheritable object type. (An inherited type could always contain a `field: A`).
- `T` is a closure type. Reason: `T`'s environment can contain a field of type `A`.
- `A` is the element type of `T` if `T` is an array, sequence or pointer type.

These rules ensure the freedom of potential data races but they can be quite limiting. It remains to be seen if they suffice in practice. Pony's `recover` is actually not a function, but a block of code. It is unclear at this point if that really adds expressivity or not over this proposed builtin function.

Sugar

We expect the pattern `send(isolate(f()))` to be very common so we add a template overload to the channel:

Example: send JsonNodes to a worker thread

Next steps

There is now an implementation available in `std/isolation`. The channel type should use `Isolated[T]` for ARC/ORC and we need to see how it works.

Future directions

Isolated graphs can also be checked for isolation at runtime via something like `func assertIsolated[T](graph: sink T): Isolated[T]`. This would use ORC's mechanism for graph traversal and the involved reference counts to determine that no external pointers into `graph` exist. Since this is a runtime check and not a compile-time guarantee, we will first do without such a mechanism.