nim-lang / Nim

Nim is a statically typed compiled systems programming language. It combines successful concepts from mature languages like Python, Ada and Modula. Its design focuses on efficiency, expressiveness, and elegance (in that order of priority).
https://nim-lang.org
Other
16.22k stars 1.46k forks source link

Some C struct can have nested union. This is not supported for now #23709

Open Alogani opened 2 weeks ago

Alogani commented 2 weeks ago

Summary

Hello,

This is a followup of #23700. Today, union defined inside a struct cannot be expressed correctly in nim.

Description

Sometimes, C struct definition can have a nested union, like that:

struct sigaction {
    union {
        void    (*__sa_handler)(int);
        void    (*__sa_sigaction)(int, siginfo_t *, void *);
    } __sigaction_u;
    sigset_t sa_mask;          /* signal mask to apply */
    int  sa_flags;         /* see signal options below */
};

The actual solutions

This struct cannot be expressed in nim without messing with the struct layout. For example those examples won't work :

Working, but might mess up something ??

type Sigaction* {.importc: "struct sigaction", header: "<signal.h>", final, pure.} = object
    sa_handler*: proc (x: cint) {.noconv.}
    sa_sigaction*: proc (x: cint, y: ptr SigInfo, z: pointer) {.noconv.}
    sa_mask*: Sigset
    sa_flags*: cint

{.emit: """
void modify(struct sigaction* action) {
   action->sa_flags = 42;
}
""".}

proc modify(action: ptr Sigaction) {.importc.}

var a = Sigaction()
modify(addr a)
echo a.sa_flags

Not working

type Sigaction_internal {.union.} = object
    sa_handler*: proc (x: cint) {.noconv.}
    sa_sigaction*: proc (x: cint, y: ptr SigInfo, z: pointer) {.noconv.}

type Sigaction* {.importc: "struct sigaction", header: "<signal.h>", final, pure.} = object
    internal: Sigaction_internal
    sa_mask*: Sigset
    sa_flags*: cint

Propositions

Either by using a pragma

This is IMHO the best solution

type Sigaction* {.importc: "struct sigaction", header: "<signal.h>", final, pure.} = object
    {.union.}:
        sa_handler*: proc (x: cint) {.noconv.}
        sa_sigaction*: proc (x: cint, y: ptr SigInfo, z: pointer) {.noconv.}
    sa_mask*: Sigset
    sa_flags*: cint

Or by using a deported object

type Sigaction_internal {.union.} = object
    sa_handler*: proc (x: cint) {.noconv.}
    sa_sigaction*: proc (x: cint, y: ptr SigInfo, z: pointer) {.noconv.}

type Sigaction* {.importc: "struct sigaction", header: "<signal.h>", final, pure.} = object
    Sigaction_internal 
    sa_mask*: Sigset
    sa_flags*: cint

(Pleast note that concerning Sigaction, as pointed out by litlighilit, a workaround using template allows this to work in actual implementation)

litlighilit commented 2 weeks ago

Why

type Sigaction_internal {.union.} = object
    sa_handler*: proc (x: cint) {.noconv.}
    sa_sigaction*: proc (x: cint, y: ptr SigInfo, z: pointer) {.noconv.}

type Sigaction* {.importc: "struct sigaction", header: "<signal.h>", final, pure.} = object
    internal: Sigaction_internal
    sa_mask*: Sigset
    sa_flags*: cint

not working?

Alogani commented 2 weeks ago

How could it ?

Example 1:

include mycustomSigaction

var a = Sigaction()
a.sa_handler = proc(x: cint) {.noconv.} = echo "catch", x

Output:

Error: undeclared field: 'sa_handler=' for type in.Sigaction [type declared in /usercode/in.nim(9, 6)]

Example 2:

include mycustomSigaction

var a = Sigaction()
a.internal.sa_handler = proc(x: cint) {.noconv.} = echo "catch", x

Output:

In file included from /usercode/nimcache/@min.nim.c:8:
/usercode/nimcache/@min.nim.c:43:40: error: expected ':', ',', ';', '}' or '__attribute__' before '.' token
   43 |         tyProc__KdSUXb6rBxR37qf0lYK7pQ sa_handler;
      |                                        ^~~~~~~~~~
/usercode/nimcache/@min.nim.c: In function 'NimMainModule':
/usercode/nimcache/@min.nim.c:172:50: error: 'struct sigaction' has no member named 'internal'
  172 |         nimlf_(15, "/usercode/in.nim"); a__in_u17.internal.sa_handler = colonanonymous___in_u18;
      |      
litlighilit commented 2 weeks ago

I didn't realize it until now that

Nim's offsetof is also mapped to C's operator.

The field access is mapped to C's too, instead of accessing via calculated offset, which is what I used to think.

Then I just cannot give a real-world example where the fields' order of a imported struct or whether they contains union matters.

e.g. Even though in my system, Uid is of wrong size, (shall be 4, but 8 in Nim), nowadays, the code used to cause bug now works fine... (may because Nim used to calculate size itself, but now it uses C's sizeof)

Alogani commented 2 weeks ago

@litlighilit don't know about future development on Nim's NIR or nlvm project. But if Nim's import are close to C definitions, this will make life easier on future. But I don't know more than you do

litlighilit commented 2 weeks ago

For this case, there is a solution:

type Sigaction_internal {.union.} = object
    sa_handler*: proc (x: cint) {.noconv.}
    sa_sigaction*: proc (x: cint, y: ptr SigInfo, z: pointer) {.noconv.}

type Sigaction* {.importc: "struct sigaction", header: "<signal.h>", final, pure.} = object
    internal{.importc: "__sigaction_u".}: Sigaction_internal
    sa_mask*: Sigset
    sa_flags*: cint

But when it comes to anonymous union, there may be really no support yet...

So the {.union.}: ... pragma statement can be introduced to solve this, but it shall be for anonymous union only.

Edit: not working for this case, see below for details.

Alogani commented 2 weeks ago

Have you tested it ? ^^

type Sigset* {.importc: "sigset_t", header: "<signal.h>", final, pure.} = object
    abi: array[1024 div (8 * sizeof(culong)), culong]

type Sigaction_internal {.union.} = object
    sa_handler*: proc (x: cint) {.noconv.}
    sa_sigaction*: proc (x: cint, y: pointer, z: pointer) {.noconv.}

type Sigaction* {.importc: "struct sigaction", header: "<signal.h>", final, pure.} = object
    internal{.importc: "__sigaction_u".}: Sigaction_internal
    sa_mask*: Sigset
    sa_flags*: cint

var a = Sigaction()
a.internal.sa_handler = proc(x: cint) {.noconv.} = echo "catch", x

Output

Hint: used config file '/playground/nim/config/nim.cfg' [Conf]
Hint: used config file '/playground/nim/config/config.nims' [Conf]
......................................................................
Hint: gcc -c  -w -fmax-errors=3 -pthread   -I/playground/nim/lib -I/usercode -o /usercode/nimcache/@m..@splayground@snim@slib@ssystem@sexceptions.nim.c.o /usercode/nimcache/@m..@splayground@snim@slib@ssystem@sexceptions.nim.c [Exec]
Hint: gcc -c  -w -fmax-errors=3 -pthread   -I/playground/nim/lib -I/usercode -o /usercode/nimcache/@m..@splayground@snim@slib@sstd@sprivate@sdigitsutils.nim.c.o /usercode/nimcache/@m..@splayground@snim@slib@sstd@sprivate@sdigitsutils.nim.c [Exec]
Hint: gcc -c  -w -fmax-errors=3 -pthread   -I/playground/nim/lib -I/usercode -o /usercode/nimcache/@m..@splayground@snim@slib@ssystem@sdollars.nim.c.o /usercode/nimcache/@m..@splayground@snim@slib@ssystem@sdollars.nim.c [Exec]
Hint: gcc -c  -w -fmax-errors=3 -pthread   -I/playground/nim/lib -I/usercode -o /usercode/nimcache/@m..@splayground@snim@slib@ssystem.nim.c.o /usercode/nimcache/@m..@splayground@snim@slib@ssystem.nim.c [Exec]
Hint: gcc -c  -w -fmax-errors=3 -pthread   -I/playground/nim/lib -I/usercode -o /usercode/nimcache/@min.nim.c.o /usercode/nimcache/@min.nim.c [Exec]
In file included from /usercode/nimcache/@min.nim.c:8:
/usercode/nimcache/@min.nim.c:42:40: error: expected ':', ',', ';', '}' or '__attribute__' before '.' token
   42 |         tyProc__KdSUXb6rBxR37qf0lYK7pQ sa_handler;
      |                                        ^~~~~~~~~~
/usercode/nimcache/@min.nim.c: In function 'NimMainModule':
/usercode/nimcache/@min.nim.c:171:50: error: 'struct sigaction' has no member named '__sigaction_u'
  171 |         nimlf_(14, "/usercode/in.nim"); a__in_u16.__sigaction_u.sa_handler = colonanonymous___in_u17;
      |             

And even if it worked, having to do a.internal.sa_handler whereas in c, it is a.sa_handler is a problematic workaround IMHO

litlighilit commented 2 weeks ago

Have you tested it ? ^^

I really didn't...

as in my current linux machine, the union is defined as an anonymous union...

It's really another big problem that different posix has different name for definition.


Okey, in C sa_hander is defined as a macro[^1] to something like __sigaction_handler.sa_handler to support writing C code like s.sa_hanler,

therefore the sa_handler symbol is just polluted, leading code above no working.

For this case (and in fact for many other), such a workaround is invalid.

[^1]: in fact in Nim we are using the similiar mechanism (via template/proc as getter/setter), but thanks to it that Nim's template can be typed and more flexible, such a problem won't be caused in Nim.