Open stuarteberg opened 8 years ago
Thanks for reporting. This code
output_zyxc = np.zeros( (100,100,100,10), dtype=np.float32 )
output_zyxc = vigra.taggedView(output_zyxc, 'zyxc')
vigra.filters.hessianOfGaussianEigenvalues(input_zyx, 1.0, out=output_zyxc[..., 0:3])
should throw an exception. I meant to check for this case in the converters, but apparently forgot a condition.
The reason it doesn't work is this: when one maps a Python array with explicit channel axis onto a vigra array with value type TinyVector<float, 3>
, the x-stride must be divided by 3. This is only legal if the original stride was a multiple of 3, which is not true here because the original length of the channel axis (and therefore the x-stride) is 10.
BTW, the first two lines of your code could be written compactly as
output_zyxc = vigra.Volume((100,100,100,10), order='C')
I uploaded PR #380 which applies two changes:
vigranumpy/__init__.py
that transparantly uses a temporary array if out
has incompatible strides.This bug raises an interesting general question: vigra measures strides in units of sizeof(T)
. Alternatively, one can measure strides in bytes (which numpy does). In this case, your attempted slicing would have worked, but there are some downsides as well:
What do you think?
Just another data point: Qt (QImage) also measures strides in bytes.
And what is your own opinion, @hmeine?
I don't have a strong opinion either way (which is also the TL;DR summary), but I experienced difficulties (probably in the context of qimage2ndarray) when working with different systems at the same time that require conversion between byte and pixel strides. In particular, it may lead to runtime errors when trying to convert byte (Qt, numpy) to pixel strides (vigra) if the former are not compatible (divisible). That's probably an unlikely case, but requires extra code for checking and handling this condition anyhow. (Very much like what's discussed here.) I don't know if there's any performance difference. (Probably not anymore nowadays, and/or not relevant.) In the absence of evidence, I would vote against premature optimization and for byte strides. On the other hand, we don't implement vigra from scratch, so we're talking about a behavior change. That's probably the strongest point, which suggests not to change it.
For my needs, the error-check you've already proposed is good enough. (Thanks, BTW.)
I don't quite follow the explanation of why switching to byte-based strides is more powerful than sizeof(T)
-based strides. But in any case, it sounds like such a change is likely to introduce subtle bugs. The vigra test suite is pretty good, but it's hard to cover all the potential problems such a change could introduce (as this issue demonstrates).
I don't quite follow the explanation of why switching to byte-based strides is more powerful than
sizeof(T)
-based strides.
Using, sizeof(T)
-based strides the best I can do is issue an error message and/or transparently use an auxiliary array with matching strides. Using byte-based strides, your use case would Just Work (the x-axis stride would be 40, regardless of the channels of your view output[...,:3]
to be interpreted as a three floats or a TinyVector<float, 3>
).
The question if the strides should change refers mainly to vigra 2, should this ever materialize.
It seems to me that sizeof
strides are safer on the C++ side from issues of undefined behaviour (e.g., unaligned memory access). On the other hand, it sounds like bytes-based strides might make interoperability with NumPy easier. Maybe some type of checked bytes strides could be a good compromise?
Is it permitted to call vector-valued convolutions (e.g.
hessianOfGaussianEigenvalues
) with a non-contiguousout
parameter? For example:That doesn't throw an exception, but it doesn't produce the correct results. Instead, you get an image that has obvious artifacts from incorrect pointer math:
Furthermore, the other slices of the output array (which were supposed to remain untouched) get corrupted:
Here's a complete test case. This example uses
hessianOfGaussianEigenvalues()
, but I can also reproduce the problem withstructureTensorEigenValues()
. The problem is NOT reproducible with single-channel outputs, such aslaplacianOfGaussian()
.