nim-lang / Nim

Nim is a statically typed compiled systems programming language. It combines successful concepts from mature languages like Python, Ada and Modula. Its design focuses on efficiency, expressiveness, and elegance (in that order of priority).
https://nim-lang.org
Other
16.22k stars 1.46k forks source link

SigAction isn't missing a field on amd64? #23700

Open Alogani opened 3 weeks ago

Alogani commented 3 weeks ago

Description

Hello,

I see the SigAction struct of posix miss the field sa_sigaction*: proc (x: cint, y: ptr SigInfo, z: pointer) {.noconv.} for amd64

https://github.com/nim-lang/Nim/blob/767a901267c782765e010d106fae277b27e04981/lib/posix/posix_linux_amd64.nim#L304-L313

However, I have tested in c, it seems we can define use this field succesfully on my amd64 machine

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <string.h>

#define SEGV_STACK_SIZE (MINSIGSTKSZ + 4096)

void sigsegv_handler(int signum, siginfo_t *info, void *data) {
    void *addr = info->si_addr;
    printf("Segmentation fault from reading the address %p\n", addr);
    exit(1);
}

int main() {
    stack_t segv_stack; // #Stack
    segv_stack.ss_sp = posix_memalign(SEGV_STACK_SIZE); //#-> to import, included already in stdlib.h
    segv_stack.ss_flags = 0;
    segv_stack.ss_size = SEGV_STACK_SIZE;
    sigaltstack(&segv_stack, NULL); // #sigaltstack

    struct sigaction action; // #Sigaction
    action.sa_flags = SA_SIGINFO | SA_ONSTACK;
    action.sa_sigaction = &sigsegv_handler;
    sigaction(SIGSEGV, &action, NULL); // #sigaction

    printf("Installed signal handler for SIGSEGV.\n");

     int* ptr = NULL;
    *ptr = 42;

    return 0;
}

Nim Version

v2.0.4 Fedora, on amd64

Current Output

No response

Expected Output

No response

Possible Solution

No response

Additional Information

No response

juancarlospaco commented 3 weeks ago

Make a pull request? 🤔

Alogani commented 2 weeks ago

Yeah sure ! I was hesitant out of fear that something might explode or I don't know what, I have never contributed before 🙂

litlighilit commented 2 weeks ago

For struct sigaction, POSIX manual[^pubs] has such declaration:

On some architectures a union[^eg] is involved: do not assign to both sa_handler and sa_sigaction.

We can tell two points from it:

  1. sa_sigaction field is a must for struct sigaction
  2. but it may be overlapped, as it may be in a union. In Nim, it may not mean "missing a field", but using one field to present one union.

Edit:

I later found in the code below the definition, the getter and setter is defined as templates.

[^eg]: For example, SigAction definition in posix_macos_amd64.nim uses one field of function pointer, that's, on MacOS, according to Apple doc, sa_sigaction and sa_handler are defined as one union (by defining only one field, its layout is right, as union of two pointer is still one-pointer length), though its fields' order is different from linux's.

[^pubs]: formal specification here

litlighilit commented 2 weeks ago

@Alogani

Could you please refer to the <signal.h> header file under your system (Fedora, as you mentioned) about struct sigaction to check if there are unions in it?

I've had a look at Debain's, and there's union in struct sigaction, and the fields' order is the same as nim's definition, so the layout in posix_linux_amd64.nim is correct for such linux.

Different linux distributions may have different definitions.... which requires further inspect and discussion...

litlighilit commented 2 weeks ago

Further steps:

  1. Leave a comment about such a field represent a unoin in some linux like Debain.
  2. Once in one linux distribution or other UNIX like FreeBSD, we find a different layout, add a new posix_*.nim and define a different SigAction object.
litlighilit commented 2 weeks ago

FreeBSD's [^bsd] is defined as follows:

struct sigaction {
    void    (*sa_handler)(int);
    void    (*sa_sigaction)(int, siginfo_t *, void *);
    int     sa_flags;            /* see signal options below */
    sigset_t sa_mask;            /* signal mask to apply */
};

in which case a field is indeed missing.

So may we add a new posix_freebsd*.nim


What's more, OpenBSD's is the same as what's defined in posix_linux_amd64.nim, but in Nim its fields' order is wrong?!

[^bsd]: note FreeBSD is of POSIX, not Linux

Alogani commented 2 weeks ago

Wow, that a very detailed research you did. Awesome. I had put more info in the PR request, but the PR request is wrong indeed as you noticed well. I will rewrite the information i found there below

Here it is for my fedora :

/* Structure describing the action to be taken when a signal arrives.  */
struct sigaction
  {
    /* Signal handler.  */
#if defined __USE_POSIX199309 || defined __USE_XOPEN_EXTENDED
    union
      {
    /* Used if SA_SIGINFO is not set.  */
    __sighandler_t sa_handler;
    /* Used if SA_SIGINFO is set.  */
    void (*sa_sigaction) (int, siginfo_t *, void *);
      }
    __sigaction_handler;
# define sa_handler __sigaction_handler.sa_handler
# define sa_sigaction   __sigaction_handler.sa_sigaction
#else
    __sighandler_t sa_handler;
#endif

    /* Additional set of signals to be blocked.  */
    __sigset_t sa_mask;

    /* Special flags.  */
    int sa_flags;

    /* Restore handler.  */
    void (*sa_restorer) (void);
  };

If I'm not wrong, this is the standard: https://github.com/lattera/glibc/blob/895ef79e04a953cac1493863bcae29ad85657ee1/bits/sigaction.h#L32-L55

But there is something strange, on my machine, there is sa_restorer field, but not on glibc source. In nim, sa_restorer has been defined in multiple different architecture.

carrexxii commented 2 weeks ago

@Alogani I have the exact same definition as you do on Arch. If you read the comment above the definition, it says that there should be a system-dependent version for the struct:

/* These definitions match those used by the 4.4 BSD kernel.
   If the operating system has a `sigaction' system call that correctly
   implements the POSIX.1 behavior, there should be a system-dependent
   version of this file that defines `struct sigaction' and the `SA_*'
   constants appropriately.  */

And I think rather check the main repos and not an old mirror to be safe, but it doesn't look like it has changed at all.

litlighilit commented 2 weeks ago

But there is something strange, on my machine, there is sa_restorer field, but not on glibc source. In nim, sa_restorer has been defined in multiple different architecture.

glibc's definition really seems to contain no sa_restorer field....

Maybe many linux distributions do not use glibc's definition directly.

Alogani commented 2 weeks ago

So instead of using basing on linux specific architecture, we should put all defs in posix common architecture and use compile switch for each distro -> https://nim-lang.org/docs/distros.html

There is also the problem of how to represent it (dicussion on the PR): https://github.com/nim-lang/Nim/pull/23705#issuecomment-2161542831

Because some systems uses a union inside a an object, naive implementation will have a wrong sizeof.

litlighilit commented 2 weeks ago

__sighandler_t sa_handler is a function pointer, so is sa_sigaction.

Isn't each function pointer in C the same size?[^nim]

(as function pointer is still pointer).

So as far as I'm concerned, despite araq's worry, a union of two pointer is still of one pointer length.

Therefore the sizeof will give the right result.

[^nim]: It has to be mentioned that Nim's proc is of nimcall convention by default, so it's two pointer length. Thus mark a proc as another convetion like noconv is required.

Alogani commented 2 weeks ago

@litlighilit

I wasn't sure of what you said, so I verified, and you are indeed right:

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main() {
    struct sigaction action; // #Sigaction
    printf("%ld", sizeof(action));

    return 0;
}

Output = 152

type  
   Sigaction2* {.importc: "struct sigaction", 
               header: "<signal.h>", final, pure.} = object ## struct sigaction 
     sa_sigaction*: proc (x: cint, y: pointer, z: pointer) {.noconv.}
     sa_handler*: proc (x: cint) {.noconv.}
                                           ## SIG_IGN or SIG_DFL. 
     sa_mask*: Sigset ## Set of signals to be blocked during execution of 
                     ## the signal handling function. 
     sa_flags*: cint   ## Special flags. 
     sa_restorer: proc() {.noconv.}   ## not intended for application use. 

var a = Sigaction2()
echo sizeof(a)

Output = 152

Because there can be multiple different variations by distribution and only one topic is a bit messy to check that, I propose you make an issue for each different distribution here: https://github.com/Alogani/Nim/issues

Then, we could either:

litlighilit commented 2 weeks ago

Ops,

I meant either using a union inside object or defining only one function pointer field keeps the struct layout valid.

But defining two field together like above breaks the validaity of such a struct, as in your system, the two function pointers is only one pointer length.


Also, I later realized that sizeof in Nim is in fact mapped to C's sizeof so it reflects nothing about the size of a object in Nim side.

Alogani commented 2 weeks ago

Oups, true. I don't see how to do it, except by using just a pointer, but we would lose some information.

litlighilit commented 2 weeks ago

I see the SigAction struct of posix miss the field sa_sigaction*: proc (x: cint, y: ptr SigInfo, z: pointer) {.noconv.} for amd64

omg, have a look at code here

It's not missing!

litlighilit commented 2 weeks ago

So, it turns out the origin problem is not a problem.

But there are some other raised when inspecting it:

The 1st

What's more, OpenBSD's is the same as what's defined in posix_linux_amd64.nim, but in Nim its fields' order is wrong?!

I've found the relavant pr: #12137

The 2nd

As we've found glibc didn't define the sa_restorer field, then what such a thing will affect? in which Linux distribution?

Alogani commented 2 weeks ago

@litlighilit this is very nice work. The workaround using a template is a nice one!

So the only problem is to make openbsd field in good order and that's all ?

litlighilit commented 2 weeks ago

Openbsd.... I've found the relavant pr: #12137

In that pr, @euantorano mentioned:

... as I can't get detect.nim to run on OpenBSD

So it means the definitions of these structs may not be generated from header files.

Therefore, it's more likely the sa_sigacion field is placed in the wrong order (just be appended in the end).

Alogani commented 2 weeks ago

I think it would quite easy to make std/distro work at compile time. But certainly not very efficient...

The template might still be the most polyvalent solution

litlighilit commented 2 weeks ago

So the only problem is to make openbsd field in good order and that's all ?

Use the same definition as posix_linux_amd64.nim did, where we can still define templates/procs as getter and setter,

and replace the last field definition with sa_restorer

litlighilit commented 2 weeks ago

I'll make it done.

litlighilit commented 2 weeks ago

Also, I'd like to add some comments that such a field repersents a union.

litlighilit commented 2 weeks ago

Ops, I know, the wrong definition must be from posix_macos_amd64.nim #11666 ...

But refer to Apple doc, it's not right for macos either.

litlighilit commented 2 weeks ago

Here is about the 2nd problem:

The 2nd As we've found glibc didn't define the sa_restorer field, then what such a thing will affect? in which Linux distribution?

But there is something strange, on my machine, there is sa_restorer field, but not on glibc source. In nim, sa_restorer has been defined in multiple different architecture.

according to Linux's man.7:

The sa_restorer field is not intended for application use. (POSIX does not specify a sa_restorer field.)

Also, POSIX manual indices this.

a POSIX may doesn't define a sa_restorer,

... I also found there is not sa_restorer in MacOSX's manual

euantorano commented 2 weeks ago

Openbsd.... I've found the relavant pr: #12137

In that pr, @euantorano mentioned:

... as I can't get detect.nim to run on OpenBSD

So it means the definitions of these structs may not be generated from header files.

Therefore, it's more likely the sa_sigacion field is placed in the wrong order (just be appended in the end).

It may be worth trying to get the detect.nim script to run on OpenBSD again. It’s been a long time since I tried it as I’ve been out of the loop on Nim for a while.

litlighilit commented 2 weeks ago

Thanks for your reply!

I currently rely on online doc https://man.openbsd.org/sigaction.2 to amend the sigaction struct's definition.

For current issue, It'll be helpful if you can have a look at the header file about sigaction struct definition and check if it conforms what the online doc describles.

euantorano commented 2 weeks ago

Thanks for your reply!

I currently rely on online doc https://man.openbsd.org/sigaction.2 to amend the sigaction struct's definition.

For current issue, It'll be helpful if you can have a look at the header file about sigaction struct definition and check if it conforms what the online doc describles.

Sure, I’ll check the header file on my machine tonight when I’m back at home.

Alogani commented 2 weeks ago

Nice work @litlighilit 👍