nim-lang / RFCs

A repository for your Nim proposals.
135 stars 26 forks source link

Allow unchecked assign to discriminant fields using `{.cast(uncheckedAssign).}` #407

Closed haxscramper closed 2 years ago

haxscramper commented 2 years ago

Introduce {.cast(uncheckedAssign).} that would allow unchecked assignment to the discriminant fields, and mutable access. This is required for saner serialization of the case objects and assignment to {.requiresinit.} fields.

{.cast(uncheckedAssign).}:
  # in this section discriminant fields behave like regular ones
  # you can assign them, pass as `var T` and so on

Serialization

Loading case object from buffer - it is not possible to just result = default(T) and then change fields, I need to somehow initialize the object in the correct state from the start, and then assign fields to it.

import std/strutils

type
  VariantKind = enum vkFirst, vkSecond
  Variant = object
    case kind: VariantKind
      of vkFirst: 
        first: int

      of vkSecond:
        second: string

proc load(buffer: string, target: var VariantKind) = 
  target = parseEnum[VariantKind](buffer)

proc load(buffer: string, target: var Variant) = 
  # required type for target: var VariantKind
  # but expression 'target.kind' is immutable, not 'var'
  load(buffer, target.kind)

Instead of simple load I now have to generate (switching from simple proc to macro codegen just to overcome this safety feature). If assignment to discriminant fields was unchecked, serialization would be as simple as

for name, field in fieldPairs(target):
    write(buffer, field)

for name, field in fieldPairs(target):
    load(buffer, pos, field)

Right now, if I want to load kind object from the stream I need to do this

when name == "kind":
  block:
    var kind: VariantKind
    load(buffer, pos, kind)
    setKind(target, kind, kind)

Where setKind is a horrible, C-backend specific (VM? Js? of course not) hack

template setKind*[T, K](t: var T, kind: untyped, target: K) =
  cast[ptr K](cast[int](addr t) + offsetOf(typeof(t), kind))[] = target

It does allow reading/writing case objects easily from stream, but it is not portable in any way.

```nim import std/[strutils, macros] type VariantKind = enum vkFirst, vkSecond VariantKind2 = enum vk2First, vk2Second Variant = object case kind: VariantKind of vkFirst: first: int of vkSecond: second: string case kind2: VariantKind2 of vk2First: first2: int of vk2Second: second2: string template setKind*[T, K](t: var T, kind: untyped, target: K) = cast[ptr K](cast[int](addr t) + offsetOf(typeof(t), kind))[] = target proc load(buffer: seq[string], pos: var int, target: var int) = target = parseInt(buffer[pos]) inc pos proc load(buffer: seq[string], pos: var int, target: var string) = target = buffer[pos] inc pos proc load[E: enum](buffer: seq[string], pos: var int, target: var E) = target = parseEnum[E](buffer[pos]) inc pos proc load(buffer: seq[string], pos: var int, target: var Variant) = expandMacros: for name, field in fieldPairs(target): # This part requires codegen base on `typed` macros when name == "kind": block: var kind: VariantKind load(buffer, pos, kind) # C-specific hack, would not work on any other backend setKind(target, kind, kind) elif name == "kind2": block: var kind: VariantKind2 load(buffer, pos, kind) setKind(target, kind2, kind) else: load(buffer, pos, field) proc write[T](buffer: var seq[string], target: T) = buffer.add $target proc write(buffer: var seq[string], target: Variant) = for name, field in fieldPairs(target): buffer.write(field) var buffer: seq[string] let objFrom = Variant(kind: vkSecond, second: "string") buffer.write(objFrom) var target: Variant var pos: int = 0 buffer.load(pos, target) echo objFrom echo target echo buffer ``` ``` (kind: vkSecond, second: "string", kind2: vk2First, first2: 0) (kind: vkSecond, second: "string", kind2: vk2First, first2: 0) @["vkSecond", "string", "vk2First", "0"] ``` `fieldPairs` in `load` does the right thing here *by default*, because it effectively expands to field iteration in the same order as `write` proc. ```nim block: var kind: VariantKind load(buffer, pos, kind) setKind(target, kind, kind) case target.kind of vkFirst: load(buffer, pos, target.first) of vkSecond: load(buffer, pos, target.second) block: var kind: VariantKind2 load(buffer, pos, kind) setKind(target, kind2, kind) case target.kind2 of vk2First: load(buffer, pos, target.first2) of vk2Second: load(buffer, pos, target.second2) ```

Initialization with {.requiresinit.}

I will start this part with a concrete example -

type
  Requires {.requiresinit.} = object

  Obj = object
    case kind: bool
      of true: 
        req: Requires

      of false: discard

func build(inKind: bool): Obj = 
  # I need to init `Obj` with a runtime value for a discriminant, but it is not possible
  # because

  result = Obj(kind: inKind) # Error: The Obj type requires the following fields to be initialized: req.

  result = Obj(kind: inKind, req: default(Requires)) # Error: cannot prove that it's safe to initialize 'req' with the runtime value for the discriminator 'kind'

  # I'm cornered by 'requires init' and 'cannot change discriminant', so I need hack around this
  # limitation using  ... good old `setKind`, which does not 
  setKind(result, kind, inKind)
  if result.kind == true:
    result.req = default(Requires)

  # Which is possible on a C backend, not on other ones.

Constructing objects several times in different branches is possible in theory, but this would require another boilerplate patch, and I would prefer to avoid this if possible. More specifically - I would have to explicitly initialize each possible value of discriminant and do it separately, so this would lead to code that looks like this (real-world example):

  case failKind:
    of tfkStructDiff:
      result = TestReport(
        kind: trkCheckFail, 
        structDiff: default(PprintTree), # Requiresinit field that depends on `failKind`
        failKind: tfkStructDiff)

    of tfkStructEq:
      result = TestReport(
        kind: trkCheckFail, 
        structDiff: default(PPrintTree),
        failKind: tfkStructEq)

   # 12 more repetitions for values that don't require `structDiff` 
   # (the 'must init' field). I can't just put `failKind: failKind`
   # because prover won't accept this. -
   # `type requires the following fields to be initialized: structDiff.`

While cast(...) for this particular problem might seem like a hack, the other alternative would be to introduce typestate analysis that would figure out the set of all possible initialization values, and act based on that. Which is not impossible, but is clearly harder to implement. This would be a nice addition to not nil etc., this is out of the scope of this RFC.

```nim type NodeKind = enum foo bar bif baz bad Node = object case kind: NodeKind of foo, bar, baz: children: seq[Node] of bad, bif: dad: Node case kind: of foo: additionalFieldForFoo: int of bar: onlyValidForBar: float else: discard case kind: of foo, bar: validForFooAndBar: string else: discard proc node(kind: NodeKind) = case kind: of foo: # safe to init `children`, `additionalFieldForFoo`, `validForFooAndBar` # Assigning to any other field would be a compile-time error # with 'cannot init'. With `{.cast(uncheckedAssign).}` it would turn into # a warning? hint? nothing? of bar: # safe to init `children`, `onlyValidForBar`, `validForFooAndBar` of bif: # safe to init `dad` of baz: # safe to init `children` of bad: # safe to init `dad` # When trying to init `additionalFieldForFoo` should error out (as it does now), # with something like 'cannot prove init safety, with unchecked runtime value # for discriminant - possible values are ```

Related

Compared to the alternative proposal, this RFC leaves the safety considerations to the programmer and simply provides a tool to just do it, without additional implicit behaviors, like wiping branch fields.

Araq commented 2 years ago

1.6 will ship with this feature.