ponylang / ponyc

Pony is an open-source, actor-model, capabilities-secure, high performance programming language
http://www.ponylang.io
BSD 2-Clause "Simplified" License

Segfault when accessing a union in a tuple via `Any` #4507

Open nisanharamati opened 6 months ago

nisanharamati commented 6 months ago

Repro

Minimal version

actor Main
  let env: Env
  new create(env': Env) =>
    env = env'
    let x = ("hello", USize(123))
    match_x(x)            // segfault

  fun match_x(x: Any val) =>
    env.out.print("match_x(x: Any val): ")
    // segfault
    match x
    | let s: Stringable => env.out.print("\t" + s.string())
    | (let a: Stringable, let b: (USize | U64)) =>
      env.out.print("\t(" + a.string() + ", " + b.string() +")")
    else
      env.out.print("\tno-match")
    end

Some more cases

actor Main
  let env: Env
  new create(env': Env) =>
    env = env'
    let str = "string"

    env.out.print("param is String: " + str)
    // works fine
    env.out.print("inline match")
    match str
    | let s: Stringable => env.out.print("\t" + s.string())
    end

    match_usize(str)        // fine
    match_number(str)       // fine
    match_stringable(str)   // fine

    let x = ("hello", USize(123))
    env.out.print("\n\nparam is (String, USize): (" + x._1 + ", " + x._2.string()
                  + ")")
    // works fine
    env.out.print("inline match")
    match x
    | (let a: Stringable, let b: Stringable) =>
      env.out.print("\t(" + a.string() + ", " + b.string() +")")
    end

    match_usize(x)        // fine
    match_number(x)       // segfault
    match_stringable(x)   // segfault

  fun match_stringable(p: Any val) =>
    env.out.print("match_stringable(p: Any): ")
    match p
    | let s: Stringable => env.out.print("\t" + s.string())
    | (let a: Stringable, let b: Stringable) =>
      env.out.print("\t(" + a.string() + ", " + b.string() + ")")
    else
      env.out.print("\tno-match")
    end

  fun match_number(p: Any val) =>
    env.out.print("match_number(p: Any): ")
    match p
    | let s: Stringable => env.out.print("\t" + s.string())
    | (let a: Stringable, let b: Number) =>
      env.out.print("\t(" + a.string() + ", " + b.string() + ")")
    else
      env.out.print("\tno-match")
    end

  fun match_usize(p: Any val) =>
    env.out.print("match_usize(p: Any): ")
    match p
    | let s: Stringable => env.out.print("\t" + s.string())
    | (let a: Stringable, let b: USize) =>
      env.out.print("\t(" + a.string() + ", " + b.string() + ")")
    else
      env.out.print("\tno-match")
    end

OS: macOS, M3

Pony version

% ponyc -v
0.58.3-cb2f814b [debug]
Compiled with: LLVM 15.0.7 -- AppleClang-15.0.0.15000309-arm64
Defaults: pic=true

Backtrace (of the minimal version)

% lldb -- ./12_segfault_tuple
(lldb) target create "./12_segfault_tuple"
Current executable set to 'scratch/pony/12_segfault_tuple/12_segfault_tuple' (arm64).
(lldb) r
Process 21124 launched: 'scratch/pony/12_segfault_tuple/12_segfault_tuple' (arm64)
match_x(x: Any val):
Process 21124 stopped
* thread #4, stop reason = EXC_BAD_ACCESS (code=1, address=0x7b)
    frame #0: 0x000000010000537c 12_segfault_tuple`___lldb_unnamed_symbol3694 + 372
12_segfault_tuple`___lldb_unnamed_symbol3694:
->  0x10000537c <+372>: ldr    x8, [x0]
    0x100005380 <+376>: b      0x100005418               ; <+528>
    0x100005384 <+380>: ldr    x0, [x8, x10]
    0x100005388 <+384>: ldr    x8, [x0]
Target 0: (12_segfault_tuple) stopped.
(lldb) bt
* thread #4, stop reason = EXC_BAD_ACCESS (code=1, address=0x7b)
  * frame #0: 0x000000010000537c 12_segfault_tuple`___lldb_unnamed_symbol3694 + 372
    frame #1: 0x0000000100009518 12_segfault_tuple`handle_message(ctx=0x00000001084e9e08, actor=0x00000001084e8c00, msg=0x00000001084e9300) at actor.c:507:7
    frame #2: 0x0000000100008b20 12_segfault_tuple`ponyint_actor_run(ctx=0x00000001084e9e08, actor=0x00000001084e8c00, polling=false) at actor.c:592:20
    frame #3: 0x00000001000230c8 12_segfault_tuple`run(sched=0x00000001084e9dc0) at scheduler.c:1075:23
    frame #4: 0x0000000100022500 12_segfault_tuple`run_thread(arg=0x00000001084e9dc0) at scheduler.c:1127:3
    frame #5: 0x00000001804e6f94 libsystem_pthread.dylib`_pthread_start + 136

SeanTAllen commented 6 months ago

In the original example, the match itself isn't the issue. That part is fine.

This works:

actor Main
  let env: Env
  new create(env': Env) =>
    env = env'
    let x = ("hello", USize(123))
    match_x(x)            // works

  fun match_x(x: Any val) =>
    env.out.print("match_x(x: Any val): ")
    // no segfault
    match x
    | let s: Stringable => env.out.print("\t" + s.string())
    | (let a: Stringable, let b: USize) =>
      env.out.print("\t(" + a.string() + ", " + b.string() +")")
    else
      env.out.print("\tno-match")
    end

So does this:

actor Main
  let env: Env
  new create(env': Env) =>
    env = env'
    let x = ("hello", USize(123))
    match_x(x)            // works

  fun match_x(x: Any val) =>
    env.out.print("match_x(x: Any val): ")
    // no segfault
    match x
    | let s: Stringable => env.out.print("\t" + s.string())
    | (let a: Stringable, let b: (USize | U64)) =>
      env.out.print("hello")
    else
      env.out.print("\tno-match")
    end

The issue in the first match is the combination of matching through the union AND the message send. What exactly goes wrong is still an open question.

The second example is the same issue. It matches on Number, which is itself a union: type Number is (Int | Float).

So there's something about the tuple being matched against a union and then PROBABLY the use of the matched value in a message send. More investigation is needed to know whether the message send is in fact required. Simple testing says yes.

Here's a more minimal example:

actor Main
  let _env: Env

  new create(env: Env) =>
    _env = env
    match_x(("hello", USize(123)))

  fun match_x(x: Any val) =>
    match x
    | (let a: Stringable, let b: (USize | U64)) =>
      _env.out.print("\t(" + b.string() +")")
    end

Note that the following, where the match is against a known tuple type, is fine:

actor Main
  let _env: Env

  new create(env: Env) =>
    _env = env
    match_x(("hello", USize(123)))

  fun match_x(x: (Stringable, (USize | U64))) =>
    match x
    | (let a: Stringable, let b: (USize | U64)) =>
      _env.out.print("\t(" + b.string() +")")
    end

SeanTAllen commented 6 months ago

Here's an even more minimal example:

actor Main
  new create(env: Env) =>
    let x: Any val = ("hello", (USize(123)))
    match x
    | (let a: Stringable, let b: (USize | U64)) =>
      env.out.print(b.string())
    end

SeanTAllen commented 6 months ago

The message send isn't required. Here's a still more minimal example:

actor Main
  new create(e: Env) =>
    let x: Any val = ("a", (U8(1)))
    match x
    | (let a: String, let b: (U8 | U16)) =>
      b.string()
    end

At this point, what seems to matter is the Any val being matched on, the union type that we match for b, and then calling string() on the matched value.

Whoever picks this up can work more from there.

jemc commented 6 months ago

Discussed in the sync call.

Listen to the sync call recording for a more detailed explanation, but what's going on here is:

To fix this, the compiled code inside the match would need to recognize the subtle distinction between the two different tuple shapes and generate different code for each. The one case we need here is the one where we must generate a boxed pointer for b before we can try to treat it as a pointer.

This is non-trivial, but should be possible.
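
Until a fix lands, the observations earlier in this thread (matching the second element as a concrete USize was reported as fine, while matching it as a union crashed) suggest a possible workaround: give each member of the union its own match arm. The sketch below adapts the smallest repro that way. It assumes that concrete-type arms behave the same for U8 and U16 as they did for USize above, and it has not been verified against ponyc 0.58.3.

```pony
actor Main
  new create(env: Env) =>
    let x: Any val = ("a", U8(1))
    match x
    // One arm per concrete type instead of a single (U8 | U16) arm.
    // Assumption: b is bound with a known machine-word type in each arm,
    // as in the match_usize case reported as working above.
    | (let a: String, let b: U8) => env.out.print(b.string())
    | (let a: String, let b: U16) => env.out.print(b.string())
    end
```

This duplicates the arm body for every union member, so it only scales to small unions, and it does not address the underlying code generation problem.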