Closed krux02 closed 2 years ago
This does seem like a real issue. Araq has said previously stated that the spec states that assigning to let from a global should copy (or at least have copy semantics): https://github.com/nim-lang/Nim/issues/13140#issuecomment-574284278
--gc:arc
fixes that.let
was introduced to mimic parameter passing semantics that also don't copy.Are you saying that this bug doesn't need to be fixed, because there exists a compiler flag (with btw many other problems) where this bug does not appear?
1. `--gc:arc` fixes that. 2. The spec changed, originally `let` was introduced to mimic parameter passing semantics that also don't copy.
1.) Are we not supporting the other gcs now?
2.) What did the spec change into? Because this behavior is unintuitive and not obvious.
I'm saying the "bug" has a history, is well known and now with move semantics we can finally fix it, even though the solution is still incomplete without @timotheecour 's .byaddr
.
What did the spec change into? Because this behavior is unintuitive and not obvious.
It needs to cause a copy but doesn't and my very own code relies on the current performance implications.
my very own code relies on the current performance implications.
Then fix the code.
how about following:
--legacy:letDoesNotCopy
let
statement that should copy but doesn't under current semantics up to today:
letDoesNotCopy
is provided, issue a warning warnLetDoesNotCopy
(and maybe suggest to simply add {.byaddr.}
)--gc:arc
semantics, ie, make a copy--legacy:letDoesNotCopy
localizable just like {.experimental:forLoopMacros.}
to allow gradual transition instead of forcing users to fix all their code at once--warning:warnLetDoesNotCopy:off
{.bycopy.}
to make gradual transition easier, eg: let a {.bycopy.} = expr
; {.bycopy.}
would force copy semantics. The pragma is ignored unless we're inside a {.legacy:letDoesNotCopy.}
section. I expect this can be used in a few key performance places. Note: I'm going to use the same thing for https://github.com/nim-lang/Nim/pull/13792; {.legacy:implicitCstringConv.}
Then fix the code.
So predictable. What about code I don't have control over? It's a breaking change, that's all I'm saying. And we collect the new, mildly breaking stuff under --gc:arc
already.
--gc:arc fixes that
in top example but not in following example (adapted from https://github.com/nim-lang/Nim/issues/13140):
doAssert fails with or without --gc:arc
here:
when true:
type Bar = ref object
foo: string
proc main(a: Bar)=
let b = a.foo
doAssert cast[int](a.foo[0].unsafeAddr) != cast[int](b[0].unsafeAddr)
let a = Bar(foo: "abc")
main(a)
@timotheecour I tried your example. After all It uses unsafeAddr
and I don't agree that this is a bug yet. I tried to tried to create an unexpected side effect, and I could not get one.
type Bar = ref object
foo: string
proc main(a: Bar) =
let b = a.foo
echo a.foo[0].unsafeAddr == b[0].unsafeAddr
a.foo[0] = '3'
echo b
echo a.foo
echo a.foo[0].unsafeAddr == b[0].unsafeAddr
let a = Bar(foo: "abc")
main(a)
output:
true
abc
3bc
false
This looks to me like copy on write
@Araq Thanks for providing more context. After your response, I favor a deprecation path approach. Classic case of Hyrum's Law
I'm saying the "bug" has a history, is well known and now with move semantics we can finally fix it, even though the solution is still incomplete without @timotheecour 's
.byaddr
.What did the spec change into? Because this behavior is unintuitive and not obvious.
It needs to cause a copy but doesn't and my very own code relies on the current performance implications.
What about shallowCopy
? Or simply taking a pointer? Both of these mechanisms already exist for just these kinds of optimizations.
Then fix the code.
So predictable. What about code I don't have control over? It's a breaking change, that's all I'm saying. And we collect the new, mildly breaking stuff under
--gc:arc
already.
If this is the case, then I don't see how byaddr
(or any other additions that require explicit code changes) will be the solution then. If our restriction here is "we need something that doesn't require code changes" , then the best solution would be to add to/improve the compiler mechanisms that do memory copy optimization (sink, etc).
Aside from this, I don't consider performance regression a breaking change, especially when it's caused by fixing behavior that's dangerous/unspecified.
Nim already gives you plenty of tools to do dangerous things when you need to. Up until now I thought that the philosophy was "provide safe/sane behavior by default, and then give the users explicit tools to tinker with the inner-workings when they need to". Has this changed?
What about shallowCopy
doesn't help with most types, see example1 and example2 below.
Or simply taking a pointer?
that's more cumbersome to use (you have do deref at each location) and inherently less safe (your pointer can escape, whereas the byaddr
is deref-on-use).
import std/unittest
import std/pragmas
type Foo = object
bar: array[10, int]
age: int
var count = 0
proc getAgeSideEffect(a: var Foo): var int =
count.inc
a.age
proc fun[T](a: var T) =
block:
# example 1:
var b {.byaddr.} = a.bar
doAssert b[0].unsafeAddr == a.bar[0].unsafeAddr
# example 2:
var b2 {.byaddr.} = a.getAgeSideEffect
b2 += 10
doAssert a.age == 10
b2 -= 10
doAssert a.age == 0
doAssert count == 1
block:
var b: type(a.bar)
b.shallowCopy a.bar
check b[0].unsafeAddr == a.bar[0].unsafeAddr # fails
var b2: int
b2.shallowCopy a.getAgeSideEffect
b2+=13
check a.age == 13 # fails
var a: Foo
a.bar[0] = 3
fun(a)
/Users/timothee/git_clone/nim/timn/tests/nim/all/t10471.nim(38, 26): Check failed: b[0].unsafeAddr == a.bar[0].unsafeAddr
/Users/timothee/git_clone/nim/timn/tests/nim/all/t10471.nim(43, 16): Check failed: a.age == 13
a.age was 0
Aside from this, I don't consider performance regression a breaking change, especially when it's caused by fixing behavior that's dangerous/unspecified.
performance regressions are a breaking change (eg: your server used to handle the load and now it doesn't). Furthermore it's not just about performance, it's about ref semantics: whether you are copying or taking a reference affects code semantics.
That's why I'm considering this, at least for transition period:
let a1 {.byaddr.} = expr # reference semantics
let a2 {.bycopy.} = expr # copy semantics
let a3 = expr
# always copy for pointer types (ref/ptr/pointer/cstring), ordinal, float
# warning (or even error) for other types (seq, string, object, set), since behavior differs bw --gc:arc vs other
Up until now I thought that the philosophy was "provide safe/sane behavior by default, and then give the users explicit tools to tinker with the inner-workings when they need to". Has this changed?
No, on the contrary, we are about to create a copy by default because that's safer to do. We don't do it as a "bugfix" though because O(1) vs O(n) behaviour is not a "bugfix".
@Varriount
Aside from this, I don't consider performance regression a breaking change, especially when it's caused by fixing behavior that's dangerous/unspecified.
I have to agree with Timothee here. Performance can be a bug. It is important. What is special about this case is that it is performance vs. correctness. It's not just a performance regression. It's a semantic change. On this part, I think you and I agree.
In general, I agree, correctness is more important. But, in the real world, we can't just suddenly change the behavior. You must provide a deprecation path.
As a person who works with corporate level software that depends on performance characteristics for their day job, I can tell you, people get really mad when you suddenly change the performance without warning (even in the name of correctness.) And everyone gets mad when semantics they rely on suddenly change (even if those semantics were incorrect. See my above comment about Hyrum's law.)
Nim already gives you plenty of tools to do dangerous things when you need to. Up until now I thought that the philosophy was "provide safe/sane behavior by default, and then give the users explicit tools to tinker with the inner-workings when they need to".
I'm a bit naive when it comes to the {.byaddr.}
and {.bycopy.}
pragmas... I know Krux doesn't like it. Part of this is because adding more features is a maintenance burden in general. I understand that.
But as far as the feature itself, I don't know enough to have an opinion on the semantic pros and cons of those features. I have to trust araq and smarter people (until I have time to research it).
Either way, this feature seems exactly aligned with the idea of "provide safe/sane behavior by default, and then give the users explicit tools to tinker with the inner-workings when they need to".
You don't have to use {.byaddr.}
. It's an advanced feature that you use if you need it. That is exactly in line with Nim's model.
If our restriction here is "we need something that doesn't require code changes" , then the best solution would be to add to/improve the compiler mechanisms that do memory copy optimization (sink, etc).
I don't think this feature blocks the compiler from making internal optimizations in any way for most normal code... It simply provides explicit control for those who want it. (again, I may be wrong here. I haven't fully understood the feature yet.)
IIUC, for the immediate future, araq is fixing the default, and timothee's flag idea is a verbose but reasonable solution for deprecation.
this issue is mis-diagnosed; it has nothing to do with global var since it can be reproduced with assigning a local var to a local let:
when true:
template fn =
var a = @[1,2,3] # with array instead, there'd be no difference bw default gc and gc:arc
let b = a
a[0] += 10
echo (b: b, isCopy: b[0].unsafeAddr != a[0].unsafeAddr)
fn()
proc main = fn()
main()
default gc: (b: @[1, 2, 3], isCopy: true) (b: @[11, 2, 3], isCopy: false)
gc:arc (b: @[1, 2, 3], isCopy: true) (b: @[1, 2, 3], isCopy: true)
instead this looks like a duplicate of https://github.com/nim-lang/Nim/issues/2314 so we could IMO close this as duplicate but it has useful discussions above which should be referenced in https://github.com/nim-lang/Nim/issues/2314
Duplicate of https://github.com/nim-lang/Nim/issues/2314
Function
echo
outputs the wrong string.Example
Current Output
Expected Output
Additional Information
Variants that work expectedly:
use an array istead of a seq
use a var instead of a let:
wrap the seq in a tuple