ClusterLabs / libqb

libqb is a library providing high performance logging, tracing, ipc, and poll.
http://clusterlabs.github.io/libqb/
GNU Lesser General Public License v2.1
166 stars 98 forks source link

Alignment on own struct types violated by ringbuffer #483

Open ThomasLamprecht opened 1 year ago

ThomasLamprecht commented 1 year ago

Some libqb structs enforce specific alignment like 8 bytes for struct qb_ipc_request_header, but it seems that not all the code path handling such a struct enforces that, at least when using the shared memory variant, and the library can end up passing unaligned pointers around, e.g., to the msg_process callback, resulting in undefined behavior when accessing its struct members.

As example use the following truncated msg_process callback:

static int32_t s1_msg_process_fn(
    qb_ipcs_connection_t *c,
    void *data,
    size_t size
) {
    struct qb_ipc_request_header *req_pt =  (struct qb_ipc_request_header *) data;
    int32_t request_id = req_pt->id;
    int32_t request_size = req_pt->size;
   // ...
    return 0;
}

Which is exercised on the client side by doing something along the lines of

int iov_len = 2;
struct iovec iov[iov_len];

struct qb_ipc_request_header req_header = {
    .id = msgid,
    .size = sizeof(struct qb_ipc_request_header) + len,
};

iov[0].iov_base = (char *)&req_header;
iov[0].iov_len = sizeof(req_header);
iov[1].iov_base = dataptr;
iov[1].iov_len = len;

int32_t timeout = -1;
int res = qb_ipcc_sendv_recv(conn, iov, iov_len, reply_buffer, sizeof(reply_buffer), timeout);

Just for the record: The cast struct enforces 8byte alignment on members and itself:

struct qb_ipc_request_header {
        int32_t id __attribute__ ((aligned(8)));
        int32_t size __attribute__ ((aligned(8)));
} __attribute__ ((aligned(8)));

Compiling above with ASan & (for this issue the actual relevant) UBSan, e.g. for a common Makefile it'd look something like:

CFLAGS += -fsanitize=address,undefined
LDFLAGS += -fsanitize=address,undefined -static-libasan -static-libubsan

one then sometimes (not always!) gets an member access within misaligned address error reported:

server.c:184:10: runtime error: member access within misaligned address 0x7f4f9d0a6094 for type 'struct qb_ipc_request_header', which requires 8 byte alignment
0x7f4f9d0a6094: note: pointer points here
  a1 a1 a1 a1 06 00 00 00  00 00 00 00 21 00 00 00  00 00 00 00 68 61 2f 72  65 73 6f 75 72 63 65 73
              ^ 
server.c:185:10: runtime error: member access within misaligned address 0x7f4f9d0a6094 for type 'struct qb_ipc_request_header', which requires 8 byte alignment
0x7f4f9d0a6094: note: pointer points here
  a1 a1 a1 a1 06 00 00 00  00 00 00 00 21 00 00 00  00 00 00 00 68 61 2f 72  65 73 6f 75 72 63 65 73
              ^ 

After looking through the whole call chain and checking how our side and libqb sides handles pointers, it seems that misalignment comes from the libqb ring buffer implementation, namely qb_rb_chunk_step which steps the write pointer always by sizeof(uint32_t), i.e., four bytes, breaking the alignment rules of the qb_ipc_request_header struct.

If the base alignment of most (all?) public types really should be 8 bytes then we could simply increment the write_puffer always by 8 bytes.

But IMO, the 8 byte alignment is rather odd in the first place and might waste quite a bit of (CPU cache) space, albeit I did not really analyze that part closely. So, is there an actual reason for keeping those? Besides ABI breakage causing churn, naturally.

chrissie-c commented 1 year ago

I've had a close look at this. It seems to be that the messages in the ringbuffer are held as an array of uint32 and aligned to 1 of those (slightly pointlessly imho). To align them to 2 of those would break the IPC 'on-wire' so a server linked with a 1-aligned libqb can't talk to a new 2-aligned client, and vice-versa.

So, given the level of breakage it would cause for a problem that's not really much of a practical issue (unless you know differently) I'm inclined to leave this until (if) we do a new major version of libqb.

ThomasLamprecht commented 1 year ago

But how about replacing the alignment attributes with explicit padding struct members as stop gap, i.e.:

struct qb_ipc_request_header {
        int32_t id;
        int32_t padding0;
        int32_t size;
        int32_t padding1;
};

That way it would stay compatible, the misalignment access would be gone from POV of UBSan.

So, given the level of breakage it would cause for a problem that's not really much of a practical issue (unless you know differently)

From top of my head the practical impact probably isn't really big, but it might affect:

So, no hard feelings from my side, and keeping this as is and open for now would be OK. But, if there's no issue with above suggestion with the padding fields it could be an acceptable workaround for now.

chrissie-c commented 1 year ago

I'm certainly not against doing that if it will help you.

I don't think it fixes the fundamental problem, which is the alignment of data within the ringbuffer itself. But if your aim is simply to get rid of that message in your testing, then I'm happy to take a PR for it.