robertwb / issues-import-test

0 stars 0 forks source link

Buffers: Clean up internal representation of acquired buffers #453

Open robertwb opened 15 years ago

robertwb commented 15 years ago

For speed purposes, some information (shape, and strides and suboffsets when applicable) is unpacked into local variables when acquiring a buffer.

The rest of the Cython codebase is (naturally) centered around "one variable Cython side is one variable C side". For instance temporary expressions must store their result in a single temporary variable C-side (which is acquired through code.funcstate.allocate_temp, Code.py).

Using structs instead of loose variables for unpacking buffers would remedy this problem, while keeping the necesarry unpacked variables on the stack (so it should compile to as optimal code).

E.g. for strided 1D mode, one would rather than

PyObject* __pyx_v_myvar
Py_buffer __pyx_buf_myvar
Py_ssize_t __pyx_bufshape0_myvar
Py_ssize_t __pyx_bufstride0_myvar

use a single struct like this:

typedef struct {
  Py_buffer bufinfo;
  Py_ssize_t shape0, stride0;
} __Pyx_StridedBuf_1D;

typedef struct {
  __Pyx_StridedBuf_1D buffer;
  PyObject* object;
} __Pyx_StridedBuf_1D_Obj;

// in function...:

__Pyx_StridedBuf_1D_Obj __pyx_v_myvar;

The reason to split into a struct without and one with object, is to make it easier to implement non-object-linked buffers later (e.g. int[:]).

In addition to simply changing the variable allocation scheme (in Buffer.py), one would then need to output __pyx_v_myvar.object rather than __pyx_v_myvar everywhere the Python object is needed; this can rather easily be done in NameNode.

Migrated from http://trac.cython.org/ticket/299

robertwb commented 15 years ago

@dagss changed In addition to {{NameNode, various cleanup code (exception handling etc.) would need to be taught using__pyx_v_myvar.objectinstead for theirPy_DECREF```.

robertwb commented 15 years ago

@dagss changed Note that the object field present in Py_buffer is filled in by the getbuffer implementation in the object in question, and so we should not rely on it for this purpose, but must store our own object reference.

robertwb commented 15 years ago

@dagss changed description from

For speed purposes, some information (shape, and strides and suboffsets when applicable) is unpacked into local variables when acquiring a buffer.

The rest of the Cython codebase is (naturally) centered around "one variable Cython side is one variable C side". For instance temporary expressions must store their result in a single temporary variable C-side (which is acquired through code.funcstate.allocate_temp, Code.py).

Using structs instead of loose variables for unpacking buffers would remedy this problem, while keeping the necesarry unpacked variables on the stack (so it should compile to as optimal code).

E.g. for strided 1D mode, one would rather than

PyObject* __pyx_v_myvar
Py_buffer __pyx_buf_myvar
Py_ssize_t __pyx_bufshape0_myvar
Py_ssize_t __pyx_bufstride0_myvar

use a single struct like this:

typedef struct {
  Py_buffer bufinfo;
  Py_ssize_t shape0, stride0;
} __Pyx_StridedBuf_1D;

typedef struct {
  __Pyx_StridedBuf_1D buffer;
  PyObject* object;
} __Pyx_StridedBuf_1D_Obj;

The reason to split into a struct without and one with object, is to make it easier to implement non-object-linked buffers later (e.g. int[addition to simply changing the variable allocation scheme (in Buffer.py), one would then need to output __pyx_v_myvar.object rather than __pyx_v_myvar everywhere the Python object is needed; this can rather easily be done in NameNode. to

For speed purposes, some information (shape, and strides and suboffsets when applicable) is unpacked into local variables when acquiring a buffer.

The rest of the Cython codebase is (naturally) centered around "one variable Cython side is one variable C side". For instance temporary expressions must store their result in a single temporary variable C-side (which is acquired through code.funcstate.allocate_temp, Code.py).

Using structs instead of loose variables for unpacking buffers would remedy this problem, while keeping the necesarry unpacked variables on the stack (so it should compile to as optimal code).

E.g. for strided 1D mode, one would rather than

PyObject* __pyx_v_myvar
Py_buffer __pyx_buf_myvar
Py_ssize_t __pyx_bufshape0_myvar
Py_ssize_t __pyx_bufstride0_myvar

use a single struct like this:

typedef struct {
  Py_buffer bufinfo;
  Py_ssize_t shape0, stride0;
} __Pyx_StridedBuf_1D;

typedef struct {
  __Pyx_StridedBuf_1D buffer;
  PyObject* object;
} __Pyx_StridedBuf_1D_Obj;

The reason to split into a struct without and one with object, is to make it easier to implement non-object-linked buffers later (e.g. int:.

In)).

In addition to simply changing the variable allocation scheme (in Buffer.py), one would then need to output __pyx_v_myvar.object rather than __pyx_v_myvar everywhere the Python object is needed; this can rather easily be done in NameNode.

robertwb commented 15 years ago

@dagss changed description from

For speed purposes, some information (shape, and strides and suboffsets when applicable) is unpacked into local variables when acquiring a buffer.

The rest of the Cython codebase is (naturally) centered around "one variable Cython side is one variable C side". For instance temporary expressions must store their result in a single temporary variable C-side (which is acquired through code.funcstate.allocate_temp, Code.py).

Using structs instead of loose variables for unpacking buffers would remedy this problem, while keeping the necesarry unpacked variables on the stack (so it should compile to as optimal code).

E.g. for strided 1D mode, one would rather than

PyObject* __pyx_v_myvar
Py_buffer __pyx_buf_myvar
Py_ssize_t __pyx_bufshape0_myvar
Py_ssize_t __pyx_bufstride0_myvar

use a single struct like this:

typedef struct {
  Py_buffer bufinfo;
  Py_ssize_t shape0, stride0;
} __Pyx_StridedBuf_1D;

typedef struct {
  __Pyx_StridedBuf_1D buffer;
  PyObject* object;
} __Pyx_StridedBuf_1D_Obj;

The reason to split into a struct without and one with object, is to make it easier to implement non-object-linked buffers later (e.g. int[addition to simply changing the variable allocation scheme (in Buffer.py), one would then need to output __pyx_v_myvar.object rather than __pyx_v_myvar everywhere the Python object is needed; this can rather easily be done in NameNode. to

For speed purposes, some information (shape, and strides and suboffsets when applicable) is unpacked into local variables when acquiring a buffer.

The rest of the Cython codebase is (naturally) centered around "one variable Cython side is one variable C side". For instance temporary expressions must store their result in a single temporary variable C-side (which is acquired through code.funcstate.allocate_temp, Code.py).

Using structs instead of loose variables for unpacking buffers would remedy this problem, while keeping the necesarry unpacked variables on the stack (so it should compile to as optimal code).

E.g. for strided 1D mode, one would rather than

PyObject* __pyx_v_myvar
Py_buffer __pyx_buf_myvar
Py_ssize_t __pyx_bufshape0_myvar
Py_ssize_t __pyx_bufstride0_myvar

use a single struct like this:

typedef struct {
  Py_buffer bufinfo;
  Py_ssize_t shape0, stride0;
} __Pyx_StridedBuf_1D;

typedef struct {
  __Pyx_StridedBuf_1D buffer;
  PyObject* object;
} __Pyx_StridedBuf_1D_Obj;

// in function...:

__Pyx_StridedBuf_1D_Obj __pyx_v_myvar;

The reason to split into a struct without and one with object, is to make it easier to implement non-object-linked buffers later (e.g. int:.

In)).

In addition to simply changing the variable allocation scheme (in Buffer.py), one would then need to output __pyx_v_myvar.object rather than __pyx_v_myvar everywhere the Python object is needed; this can rather easily be done in NameNode.

robertwb commented 15 years ago

@dagss changed '''Note 1'''

Note that in combination with http://cython.trac.org/ticket/311, this looks like

typedef struct {
  __Pyx_buffer* bufinfo;
  Py_ssize_t shape0, stride0;
} __Pyx_StridedBuf_1D;

One would copy it by value, and in doing so modify the reference count inside bufinfo. When the bufinfo refcount hits 0, one would release the buffer.

'''Note 2''' Furthermore, doing efficient slicing (http://cython.trac.org/ticket/178) requires moving the base pointer. E.g. with arr[10:] the strides are not affected, only the base pointer. This might be an argument in favor of storing the buffer pointer itself by value, making it

typedef struct {
  __Pyx_buffer* bufinfo;
  char* data;
  Py_ssize_t shape0, stride0;
} __Pyx_StridedBuf_1D;

'''Note 3''' Finally, I'm not sure how much of an issue it is to have some unecessarry fields here and there. Perhaps the full version with shape, strides and suboffsets will be all that is needed (shrug).

robertwb commented 15 years ago

@dagss changed Perhaps some variation on

typedef struct {
  Py_ssize_t shape, stride, suboffset;
} __Pyx_Buf_DimInfo;

typedef struct {
  __Pyx_buffer* bufinfo;
  char* data;
  __Pyx_Buf_DimInfo diminfo[2];
} __Pyx_Buffer_2D;

Then one could have a generic version without a size in diminfo? Not sure how C works here...

robertwb commented 15 years ago

@dagss changed Apparently most C89 compilers will allow

typedef struct {
   ...
   __Pyx_Buf_DimInfo diminfo[__Pyx_Buffer_2D;

and then malloc-ing too much memory for it. Using diminfo[](1] }) is a C99 feature which we thus can't use.

Not really useful as we must allocate on the stack.

robertwb commented 15 years ago

@dagss changed http://codereview.appspot.com/63153/show