ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License
34.84k stars 2.55k forks source link

`E.PERM` should not be an unexpected error for `std.posix.socket` #20524

Open kj4tmp opened 4 months ago

kj4tmp commented 4 months ago

Zig Version

0.14.0-dev.66+1fdf13a14

Steps to Reproduce and Observed Output

this test outputs the following when not run with sudo:

error: 'main.test.socket permissions error' failed: unexpected errno: 1

errno 1 corresponds to E.PERM (permission denied when creating socket)

This errno is expected because creating a raw socket generally requires elevated privileges.

test "socket permissions error" {
    const ETH_P_ETHERCAT: u16 = 0x88a4;
    const socket_result = std.posix.socket(
        std.posix.AF.PACKET,
        std.posix.SOCK.RAW,
        std.mem.nativeToBig(u32, ETH_P_ETHERCAT),
    );
    try std.testing.expect(socket_result == std.posix.SocketError.PermissionDenied);
}

this is because std.posix.socket does not expect .PERM: https://github.com/ziglang/zig/blob/master/lib/std/posix.zig#L3519

    switch (errno(rc)) {
        .SUCCESS => {
            const fd: fd_t = @intCast(rc);
            errdefer close(fd);
            if (!have_sock_flags) {
                try setSockFlags(fd, socket_type);
            }
            return fd;
        },
            // <------------------ .PERM is missing here!
        .ACCES => return error.PermissionDenied,
        .AFNOSUPPORT => return error.AddressFamilyNotSupported,
        .INVAL => return error.ProtocolFamilyNotAvailable,
        .MFILE => return error.ProcessFdQuotaExceeded,
        .NFILE => return error.SystemFdQuotaExceeded,
        .NOBUFS => return error.SystemResources,
        .NOMEM => return error.SystemResources,
        .PROTONOSUPPORT => return error.ProtocolNotSupported,
        .PROTOTYPE => return error.SocketTypeNotSupported,
        else => |err| return unexpectedErrno(err),
    }
}

when .PERM is is actually expected.

So the switch should probably have the following in it:

.PERM => return error.PermissionDenied,

Expected Output

we expect std.posix.socket to return error.PermissionDenied when the syscall returns E.PERM.

arminfriedl commented 4 months ago

I'd rather vote for

.PERM => return error.OperationNotPermitted

From POSIX errno.h [1]

[EACCES]
    Permission denied.
[EPERM]
    Operation not permitted.

GNU glibc [2]


Macro: int EPERM

    “Operation not permitted.” Only the owner of the file (or other resource) or processes with special privileges can perform the operation. 

Macro: int EACCES

    “Permission denied.” The file permissions do not allow the attempted operation. 

Similar but distinct imho.

[1] https://pubs.opengroup.org/onlinepubs/9699919799/download/index.html [2] https://sourceware.org/glibc/manual/latest/html_node/Error-Codes.html

arminfriedl commented 4 months ago

On second look EPERM might actually be unexpected strictly speaking. POSIX socket() is only defined to return

ERRORS

    The socket() function shall fail if:

    [EAFNOSUPPORT]
        The implementation does not support the specified address family.
    [EMFILE]
        All file descriptors available to the process are currently open.
    [ENFILE]
        No more file descriptors are available for the system.
    [EPROTONOSUPPORT]
        The protocol is not supported by the address family, or the protocol is not supported by the implementation.
    [EPROTOTYPE]
        The socket type is not supported by the protocol.

    The socket() function may fail if:

    [EACCES]
        The process does not have appropriate privileges.
    [ENOBUFS]
        Insufficient resources were available in the system to perform the operation.
    [ENOMEM]
        Insufficient memory was available to fulfill the request.

https://pubs.opengroup.org/onlinepubs/9699919799/functions/socket.html

leecannon commented 4 months ago

2.3 Error Numbers:

Implementations may support additional errors not included in this list, may generate errors included in this list under circumstances other than those described here, or may contain extensions or limitations that prevent some errors from occurring.

By the spec no error can be considered unexpected, as an implementation can return any error it wants and still be POSIX compliant.

arminfriedl commented 4 months ago

One question this raises for me at least then is if that is good at all

        else => |err| return unexpectedErrno(err)

Or, alternatively, if unexpectedErrno in that context simply means conventionally unexpected, but doesn't imply non-compliance per-se, where to draw the line then. I.e. is it good to add .PERM but not e.g. .NOTSUP (or some other) then.

I'm only speaking as someone using Zig btw. I have zero authority on Zig or the standard lib. I just happen to be also using std.posix and dealing with sockets in some personal projects.

lsculv commented 1 month ago

On second look EPERM might actually be unexpected strictly speaking

The relevent line in socket(2) on Linux is:

Other errors may be generated by the underlying protocol modules.

The posix spec cannot be relied upon if you're trying to determine what errors a call returns in practice; only the OS documentation can be relied on and even then its sometimes murky. On Linux the problem is the aforementioned line which makes it hard to easily know what set of errors any socket related function can return, MacOS socket(2) has a similar line:

If a new protocol family is defined, the socreate process is free to return any desired error code. The socket() system call will pass this error code along (even if it is undefined).

Manpages like bind(2) which don't make clear that some protocols extend the potential set of returned errors are actually wrong on Linux. bind can return errors that are not listed in the manpage but listed based on the protocol; I demonstrate this below with ENODEV. Relying on any of the manpages here is a little bit of a minefield, not only because the Linux manpages are effectively wrong, but because the std.posix socket-related functions only distinguish between windows and non-windows when making the underlying system call. Pretty much all the unix-likes differ on what they say they return from socket: the MacOS manpage is different from the Linux manpage is different from the FreeBSD manpage. It would make sense, then, to have a more fine grained return switch over the os that allows for distinguishing these cases.

Anyway, more about this specific error: In this case you are using SOCK_RAW which specifies these error codes in raw(7):

ERRORS
       EACCES User tried to send to a broadcast address without having the broadcast flag set on the socket.

       EFAULT An invalid memory address was supplied.

       EINVAL Invalid argument.

       EMSGSIZE
              Packet too big.  Either Path MTU Discovery is enabled (the IP_MTU_DISCOVER socket flag) or the packet size exceeds the  maximum  allowed  IPv4
              packet size of 64 kB.

       EOPNOTSUPP
              Invalid flag has been passed to a socket call (like MSG_OOB).

       EPERM  The user doesn't have permission to open raw sockets.  Only processes with an effective user ID of 0 or the CAP_NET_RAW attribute may do that.

       EPROTO An ICMP error has arrived reporting a parameter problem.

So, the additional errors we get are EFAULT, EOPNOTSUPP, EPROTO and the relevant EPERM.

This means you would need to look at the relevant AF_* protocol family manpage error section to know all of the errors that a given protocol could return. A small problem is that the error sections do not distinguish between errors that can happen when you call socket() vs when you call bind(), or send() for example. This would have to be fixed for each socket-related posix function for each protocol. I am not aware of any exhaustive list of the errors that each protocol can return for each function, so this process might be tedious. Just as a quick example I also found that the AF_PACKET protocol family can return ENODEV when calling bind, example (you need elevated privileges for creating the packet socket):

#include <errno.h>
#include <linux/if_packet.h>
#include <net/ethernet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>

int main(void) {
    int sockfd = socket(AF_PACKET, SOCK_DGRAM, htons(ETH_P_ALL));
    if (sockfd == -1) {
        int err = errno;
        fprintf(stderr, "Could not create socket: %s (OS error %d)\n", strerror(err), err);
        exit(1);
    }

    struct sockaddr_ll addr;
    memset(&addr, 0, sizeof(addr));

    addr.sll_family = AF_PACKET;
    addr.sll_protocol = htons(ETH_P_ALL);
    addr.sll_ifindex = 0xFFFF; /* An invalid interface index */

    int err = bind(sockfd, (struct sockaddr*)&addr, sizeof(addr));
    if (err) {
        int err = errno;
        fprintf(stderr, "Could not bind: %s (OS error %d)\n", strerror(err), err);
        exit(1);
    }

    return 0;
}

(This is on Linux)

When you call Zig's std.posix.bind, ENODEV is not an expected errno: https://github.com/ziglang/zig/blob/d83a3f1746c81026d1cf0244156513c9b5a2a9f6/lib/std/posix.zig#L3698C1-L3715C55 In general it should be expected there are more cases like this.

So, I see basically four solutions:

  1. Keep it the way it is where a function's expected errno's are only those explicity listed in the manpage.
    • This might require changing the returns from only checking for posix vs windows as different unix-like operating systems have different manpages that list different errors. For example the FreeBSD manpages do list EPERM as a potential return value of socket(). Linux manpages cannot be assumed to be "what all unix-likes do" nor can the posix spec nor any other unix-like.
  2. Do away with the concept of an "unexpected errno" entirely.
    • @leecannon cites the relevant posix spec that would support this option.
    • It is not possible to easily know what docs are reliable vs. unreliable, the Linux bind(2) docs are wrong, they do not say that ENODEV can be returned, nor do they say protocols can extend the returns, yet ENODEV still gets returned. There may be other "gotchas" like this for other posix things, so this would be a potentially safer option.
  3. Make the expected errnos the union of the base function call and all the errors listed the protocol manpage.
    • This means another switch on the family parameter of socket() for example.
    • A strange side-effect of this would be that some nonsense errnos that could likely never actually be returned become "expected". For example, socket() could now return EADDRINUSE because that is an error listed in the ip(7) manpage, but that error obviously makes no sense.
  4. Go through and verify (or find some yet-unknown-to-me resource that verifies) each call for each protocol for each OS and add the needed error values to the Zig implementation of the function.
    • This is probably the "nicest" option, but is also potentially a significant and tedious amount of work.
lsculv commented 1 month ago

Here's the bind on AF_PACKET leading to ENODEV example above in a Zig test:

const std = @import("std");
const posix = std.posix;
const mem = std.mem;
const testing = std.testing;

test "ENODEV on bind" {
    const ETH_P_ALL: u16 = 3;
    const sockfd = try posix.socket(posix.AF.PACKET, posix.SOCK.DGRAM, mem.nativeToBig(u32, ETH_P_ALL));

    var addr = mem.zeroes(posix.sockaddr.ll);

    addr.family = posix.AF.PACKET;
    addr.protocol = mem.nativeToBig(u16, ETH_P_ALL);
    addr.ifindex = 0xFFFF; // An invalid interface index

    // This should give ENODEV
    const err = posix.bind(sockfd, @ptrCast(&addr), @sizeOf(@TypeOf(addr)));
    // But this case is not handled in `std.posix.bind`
    try testing.expectError(error.NoDevice, err);
}

Running this with sudo zig test <FILENAME> you should be able to see the backtrace from the unexpectedErrno function being called, which will show unexpected errno: 19, corresponding to ENODEV.