xnd-project / numba-xnd

Integrating xnd into numba
https://xnd.io/

Stop assuming ndt/xnd struct memory layout #2

Closed saulshanabrook closed 6 years ago

saulshanabrook commented 6 years ago

Currently, I am compiling the existing xnd/ndtypes code with clang and looking at the emitted LLVM IR to see what the structs look like. However, as I was looking through the ndtypes translation, I realized that LLVM has no native support for unions. A union is lowered as its largest member, and getelementptr plus a bitcast is used to access the other members. See https://stackoverflow.com/a/19550613/907060

I think that means we shouldn't depend on how xnd_t or ndt_t are laid out in memory. For example, if we try to grab the xnd_t struct off of the python object, we shouldn't depend on it looking like this:

from llvmlite import ir

# helper aliases used below
ptr = ir.PointerType
struct = ir.LiteralStructType

i8, i16, i32, i64 = map(ir.IntType, [8, 16, 32, 64])
index = lambda i: ir.Constant(i32, i)

context = ir.global_context

ndt_slice_t = context.get_identified_type("ndt_slice_t")
ndt_slice_t.set_body(i64, i64, i64)

ndt_t = context.get_identified_type("_ndt")
ndt_t.set_body(
    i32, i32, i32, i32, i64, i16,
    struct([struct([i64, i64, i64, ptr(ptr(ndt_t))])]),
    struct([struct([struct([i32, i64, i32, ptr(i32), i32, ptr(ndt_slice_t)])])]),
    ir.ArrayType(i8, 16)
)

xnd_bitmap_t = context.get_identified_type("xnd_bitmap")
xnd_bitmap_t.set_body(
    ptr(i8),
    i64,
    ptr(xnd_bitmap_t)
)

xnd_t = context.get_identified_type("xnd")
xnd_t.set_body(
    xnd_bitmap_t,
    i64,
    ptr(ndt_t),
    ptr(i8)
)

So... where does that leave us? Who does know the layout of those structures? The compiled libxnd/libndtypes shared libraries on the user's system, since those libraries created them.

So we should just call the C functions defined in those libraries to access parts of the structs.

pearu commented 6 years ago

I had the same experience with C unions via LLVM. Consider:

#include "xnd.h"
int main() {
  ndt_t t;
  xnd_t x;
}

Using clang -c -S main.c -emit-llvm -I<conda-prefix>/envs/numba-xnd/include/ leads to

%struct._ndt = type { i32, i32, i32, i32, i64, i16, %union.anon, %struct.anon.18, [0 x i8] }
%union.anon = type { %struct.anon.0 }
%struct.anon.0 = type { i64, i64, i64, %struct._ndt** }
%struct.anon.18 = type { %union.anon.19 }
%union.anon.19 = type { %struct.anon.21 }
%struct.anon.21 = type { i32, i64, i32, i32*, i32, %struct.ndt_slice_t* }
%struct.ndt_slice_t = type { i64, i64, i64 }
%struct.xnd = type { %struct.xnd_bitmap, i64, %struct._ndt*, i8* }
%struct.xnd_bitmap = type { i8*, i64, %struct.xnd_bitmap* }
; Function Attrs: nounwind uwtable
define i32 @main() #0 {
  %t = alloca %struct._ndt, align 16
  %x = alloca %struct.xnd, align 8
  ret i32 0
}
pearu commented 6 years ago

Note that

sizeof(ndt_t)=112
sizeof(xnd_t)=48

Not sure if it is realizable, but I am thinking of representing ndt_t and xnd_t as sequences of 112 and 48 bytes, respectively. Within LLVM, these could be of vector type or of wide integer type. Otherwise, a struct { i32, i32, ... } could also be an option.

saulshanabrook commented 6 years ago

You can also have opaque structure types in LLVM. It just means you can't allocate them or access their elements, which would actually be fine as long as we have C functions that do those things.

saulshanabrook commented 6 years ago

I am not sure about something: is the memory layout of a C structure known for certain? For example, this struct:

typedef struct {
    int ndim;
    int64_t itemsize;
    int64_t shape[NDT_MAX_DIM];
    int64_t strides[NDT_MAX_DIM];
    int64_t steps[NDT_MAX_DIM];
} ndt_ndarray_t;

Translates to this LLVM:

%struct.ndt_ndarray_t = type { i32, i64, [128 x i64], [128 x i64], [128 x i64] }

Which makes sense. But can we assume that will always be true? Or could the same C code be compiled in a way that lays it out differently in memory, for example with different padding? Or does it always result in the same LLVM IR?

saulshanabrook commented 6 years ago

Like the first int field in the struct is translated as an i32, but that isn't guaranteed by C, right? It could be an i64. So we really can't assume this LLVM struct for ndt_ndarray_t then. Would the portable way to implement this be to treat it as an opaque pointer and do all the struct access and creation in C?

But this is what confuses me: when you compile a C program that depends on ndtypes, you just include the header file. So at compile time you don't know how ndt_ndarray_t will be laid out in memory, because you don't know how ndtypes was compiled. How can you write a function that accesses its elements, and how can the compiler know how big it will be?

pearu commented 6 years ago

IIRC, LLVM-generated object code can be linked together with compiler (gcc) generated object code, and the combined result can be optimized using LLVM. While compilers have their peculiarities, if we get it working with gcc, that would be a good start. Also, we need unit tests that check this assumption.

saulshanabrook commented 6 years ago

Not sure if realizable, I am thinking of representing ndt_t and xnd_t as a sequence of 112 and 48 bytes, respectively. Within LLVM, these could be of vector type or of wide integer type.

I think this is the right approach. I created a little script that just prints the sizes of the structs we need. We can hardcode those in Python and use them to allocate structs of the right size. I will push that soon to #3 when I have it working.

saulshanabrook commented 6 years ago

Closing this since we are using @pearu's xndtools to keep memory layout assumptions out of numba.