ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License
33.74k stars 2.48k forks source link

translate-c misses an alignment cast in translated C code #21022

Open AlexMax opened 1 month ago

AlexMax commented 1 month ago

Zig Version

0.14.0-dev.909+f9f894200

Steps to Reproduce and Observed Behavior

I ran this C code (from SMhasher) through zig translate-c:

#include <stdint.h>

uint32_t
FNV1A_Totenschiff(const char *key, int len, uint32_t seed)
{
#define _PADr_KAZE(x, n) (((x) << (n)) >> (n))

  const char *p = (char *)key;
  const uint32_t PRIME = 591798841;
  uint32_t hash32;
  uint64_t hash64 = (uint64_t)seed ^ UINT64_C(14695981039346656037);
  uint64_t PADDEDby8;

  for (; len > 8; len -= 8, p += 8)
  {
    PADDEDby8 = *(uint64_t *)(p + 0);
    hash64 = (hash64 ^ PADDEDby8) * PRIME;
  }

  // Here len is 1..8. when (8-8) the QWORD remains intact
  PADDEDby8 = _PADr_KAZE(*(uint64_t *)(p + 0), (8 - len) << 3);
  hash64 = (hash64 ^ PADDEDby8) * PRIME;

  hash32 = (uint32_t)(hash64 ^ (hash64 >> 32));
  return hash32 ^ (hash32 >> 16);
#undef _PADr_KAZE
}

This results in the following Zig function - trimming the bits outside the function:

pub export fn FNV1A_Totenschiff(arg_key: [*c]const u8, arg_len: c_int, arg_seed: u32) u32 {
    var key = arg_key;
    _ = &key;
    var len = arg_len;
    _ = &len;
    var seed = arg_seed;
    _ = &seed;
    var p: [*c]const u8 = @as([*c]u8, @ptrCast(@volatileCast(@constCast(key))));
    _ = &p;
    const PRIME: u32 = @as(u32, @bitCast(@as(c_int, 591798841)));
    _ = &PRIME;
    var hash32: u32 = undefined;
    _ = &hash32;
    var hash64: u64 = @as(u64, @bitCast(@as(c_ulonglong, seed))) ^ @as(c_ulonglong, 14695981039346656037);
    _ = &hash64;
    var PADDEDby8: u64 = undefined;
    _ = &PADDEDby8;
    while (len > @as(c_int, 8)) : (_ = blk: {
        len -= @as(c_int, 8);
        break :blk blk_1: {
            const ref = &p;
            ref.* += @as(usize, @bitCast(@as(isize, @intCast(@as(c_int, 8)))));
            break :blk_1 ref.*;
        };
    }) {
        PADDEDby8 = @as([*c]u64, @ptrCast(@volatileCast(@constCast(p + @as(usize, @bitCast(@as(isize, @intCast(@as(c_int, 0))))))))).*;
        hash64 = (hash64 ^ PADDEDby8) *% @as(u64, @bitCast(@as(c_ulonglong, PRIME)));
    }
    PADDEDby8 = (@as([*c]u64, @ptrCast(@volatileCast(@constCast(p + @as(usize, @bitCast(@as(isize, @intCast(@as(c_int, 0))))))))).* << @intCast((@as(c_int, 8) - len) << @intCast(3))) >> @intCast((@as(c_int, 8) - len) << @intCast(3));
    hash64 = (hash64 ^ PADDEDby8) *% @as(u64, @bitCast(@as(c_ulonglong, PRIME)));
    hash32 = @as(u32, @bitCast(@as(c_uint, @truncate(hash64 ^ (hash64 >> @intCast(32))))));
    return hash32 ^ (hash32 >> @intCast(16));
}

Trying to compile this function as zig code gives me an error:

fnv.zig:26:34: error: @ptrCast increases pointer alignment
        PADDEDby8 = @as([*c]u64, @ptrCast(@volatileCast(@constCast(p + @as(usize, @bitCast(@as(isize, @intCast(@as(c_int, 0))))))))).*;
                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fnv.zig:26:70: note: '[*c]const u8' has alignment '1'
        PADDEDby8 = @as([*c]u64, @ptrCast(@volatileCast(@constCast(p + @as(usize, @bitCast(@as(isize, @intCast(@as(c_int, 0))))))))).*;
                                                                   ~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fnv.zig:26:34: note: '[*c]u64' has alignment '8'
fnv.zig:26:34: note: use @alignCast to assert pointer alignment
referenced by:
    FNV1A_Totenschiff: fnv.zig:1:5

This seems like an accurate report - however since the original code seemingly doesn't care about alignment concerns, I would think that translate-c would add an alignment cast to the translated code.

Expected Behavior

translate-c should add an alignment cast in the spot pointed out by the error, and the function should compile successfully.

truemedian commented 1 month ago

An alignCast sounds wrong here, because p absolutely has no guarantee to be 8 byte aligned. The more correct solution would be cast to a *align(1) u64, but im not sure translate-c has the information required to know that.

mlugg commented 1 month ago

Using @alignCast would be technically correct, since standard C simply makes creating underaligned pointers UB.

ISO/IEC 9899:1999 § 6.3.2.3(7):

A pointer to an object or an incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned for pointed-to type, the behavior is undefined.

It's probably more useful to add an align annotation. I'm not certain about how possible this is, and whether we'd be able to omit it in common cases (where the alignment requirement of the new pointee is trivially no greater than that of the old pointee).