ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

`std.posix` defines non-posix symbols #20563

Open arminfriedl opened 2 weeks ago

arminfriedl commented 2 weeks ago

I think there are symbols defined in std.posix that are not actually defined (as such) by POSIX. I used the standards documents available at [1] for this.

The definition that started this was

https://github.com/ziglang/zig/blob/854e86c5676de82bc46b5c13a0c9c807596e438d/lib/std/posix.zig#L134

Should this be renamed to std.posix.in_port_t? I couldn't find any mention of port_t in the POSIX standard. The closest is in_port_t which should be defined in the arpa/inet.h and netinet/in.h headers. port_t itself seems Solaris specific, but not POSIX.

In a similar vein there are https://github.com/ziglang/zig/blob/854e86c5676de82bc46b5c13a0c9c807596e438d/lib/std/posix.zig#L135-L136

Both of these again seem to exist in Solaris, but they do not appear to be defined in POSIX.

There might be others too.

I dunno if this was done on purpose, but I guess these (and other similar ones) should be renamed or removed from std.posix? The straightforward solution is a breaking change though, obviously.

[1] https://pubs.opengroup.org/onlinepubs/9699919799/download/index.html

arminfriedl commented 2 weeks ago

Oh, and since the contributor guidelines mention this as relevant: I came across this while working on a Zig project, https://github.com/arminfriedl/unclog. When using the port_t type, the build fails on Linux with glibc dynamically linked [1]. That could of course be because glibc isn't POSIX-compliant here, but in this case I believe it is not glibc's fault.

I'm also happy to try to contribute with code. I'd just need some help/pointers if and how to proceed.

[1] I figure the same happens with musl and everything other than Solaris too though

andrewrk commented 1 week ago

would you feel better if it was named std.unix instead?

ifreund commented 1 week ago

I personally would much prefer std.unix, this has come up many times on the IRC channel since the move from std.os to std.posix.

Writing std.posix.epoll for example just feels fundamentally wrong to me :D

arminfriedl commented 1 week ago

> would you feel better if it was named std.unix instead?

Not really. (On a side note, I don't think this is just about feelings 😉.)

For the port_* parts specifically, I think it would be best to keep them in solaris.zig only. They seem to be part of the Event Completion Framework. That is neither POSIX nor Unix imho, and given the limited adoption of Solaris I figure it can't be considered a Unix standard "in practice" either.

Added a PR with a suggestion. This only addresses the port/Event Completion Framework part of Solaris. I guess there are other similar issues (e.g. the same arguments and approach probably apply to kqueue).

andrewrk commented 4 days ago

port_t is used by solaris, illumos, and macos. So it's exposed in the unixy/posixy std lib API layer, where it can be shared by those three operating systems.

I don't really understand what this issue is trying to accomplish. What is the problem statement?

Edit: I think I see a path to resolution. Please see the pattern inside std.c that is used for extern functions. If these APIs are shared with other operating systems, they can use the pattern of switching on native_os. Otherwise they can use the pattern near the bottom of the file to import directly from solaris.zig, and indeed have the definitions live inside that file to indicate where they are from.
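
A minimal sketch of the "switch on native_os" pattern described above, written as an illustration and not copied from std.c. The `illumos_defs` namespace and the `c_int` definition are hypothetical stand-ins for whatever the OS-specific file would actually declare, and only `.illumos` is matched here to keep the example short:

```zig
const std = @import("std");
const builtin = @import("builtin");
const native_os = builtin.os.tag;

// Hypothetical stand-in for an OS-specific file such as solaris.zig.
// The c_int width is an assumption made for this sketch.
const illumos_defs = struct {
    pub const port_t = c_int;
};

// Resolves to the OS-specific type where it exists and to `void`
// everywhere else, so the origin of the symbol stays explicit.
pub const port_t = switch (native_os) {
    .illumos => illumos_defs.port_t,
    else => void,
};

pub fn main() void {
    if (port_t == void) {
        std.debug.print("port_t is not available on {s}\n", .{@tagName(native_os)});
    } else {
        std.debug.print("port_t is {d} bytes on {s}\n", .{ @sizeOf(port_t), @tagName(native_os) });
    }
}
```

With this shape, code that uses `port_t` on an unsupported target ends up with `void`, which tends to surface as a compile error at the use site instead of a confusing failure at link time.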

arminfriedl commented 4 days ago

> I don't really understand what this issue is trying to accomplish. What is the problem statement?

Originally, it was just a symbol in posix.zig which didn't seem to be part of POSIX. That seemed odd/wrong to me, all the more so because port_t is there but in_port_t (which is actually POSIX) isn't. So I figured something was wrong, asked on Discord, people there seemed to agree, hence the issue :)

Then it came to the unix layer discussion and maybe I'm just misunderstanding the purpose of these layers. So my understanding of these common layers (posix-y, unix-y,..) was that by using any declarations/definitions there I would be able to compile against any reasonably posix-y/unix-y OS without changes or special-casing in my code. I.e. as long as I stick to only these layers I don't need to bother with OS-specifics too much - same code compiles and runs everywhere basically the same.

So the problem then is: for port_t this does not seem to be the case. It seems very specific to Solaris (and Solaris derivatives, you're right):

Maybe I just saw these layers as an intersection of what is common across all the posix-y/unix-y OSes, whereas they are meant more as a union of what exists in any of the posix-y/unix-y OSes, shared (semantically or lexically) to varying degrees.

> I think I see a path to resolution

Yes, if I understand this correctly, that seems like a good way to me, making things more explicit :+1:

andrewrk commented 3 days ago

It sounds like you understand what the std lib is going for now. Admittedly the naming of std.posix is not great. I think it was also not great as std.os. You can see why we are considering std.unix now.

Naming aside, I do think the abstraction layer makes sense conceptually. It's well defined what belongs in there and what does not.