davydog187 closed this issue 1 week ago
Hi, this is a known limitation in OpenCV, which was designed specifically for 2D images. The solution is to use Evision.Mat.from_nx_2d/1 to treat the last dimension as channels, and it will work.
```elixir
v = Nx.broadcast(Nx.tensor(1.0, type: :f64), {1200, 1920, 3})
Evision.gaussianBlur(Evision.Mat.from_nx_2d(v), {31, 31}, 0)
%Evision.Mat{
  channels: 3,
  dims: 2,
  type: {:f, 64},
  raw_type: 6,
  shape: {1200, 1920},
  ref: #Reference<0.555854501.2917531664.244421>
}
```
And here is some more context for this: https://github.com/cocoa-xu/evision/wiki/Integration-with-Nx
```elixir
iex> %Evision.Mat{} = mat = Evision.imread("/path/to/image.png")
iex> t = Evision.Mat.to_nx(mat)
# convert a tensor to a mat
iex> mat_from_tensor = Evision.Mat.from_nx(t)
%Evision.Mat{
  channels: 1,
  dims: 3,
  type: {:u, 8},
  raw_type: 0,
  shape: {512, 512, 3},
  ref: #Reference<0.1086574232.1510342676.18186>
}
# Note that `Evision.Mat.from_nx` gives an n-dimensional mat;
# however, some OpenCV functions expect the mat
# to be a "valid 2D image".
# Therefore, in such cases `Evision.Mat.from_nx_2d`
# should be used instead.
#
# Notice the changes in `channels`, `dims` and `raw_type`:
iex> mat_from_tensor = Evision.Mat.from_nx_2d(t)
%Evision.Mat{
  channels: 3,
  dims: 2,
  type: {:u, 8},
  raw_type: 16,
  shape: {512, 512, 3},
  ref: #Reference<0.1086574232.1510342676.18187>
}
# and it works for tensors of any shape
iex> t = Nx.iota({2, 3, 2, 3, 2, 3}, type: :s32)
iex> mat = Evision.Mat.from_nx(t)
%Evision.Mat{
  channels: 1,
  dims: 6,
  type: {:s, 32},
  raw_type: 4,
  shape: {2, 3, 2, 3, 2, 3},
  ref: #Reference<0.1086574232.1510342676.18188>
}
```
Thanks for the extra context, @cocoa-xu
After reading through your example and the linked wiki, I'm still unsure why Evision can't automatically cast the tensor into the right shape in this situation. NumPy and OpenCV handle this seamlessly, which yields intuitive code:
```python
>>> m = np.zeros((1200, 1920, 3), dtype="uint8")
>>> r = cv.GaussianBlur(m, (3, 3), 0)
>>> r.shape
(1200, 1920, 3)
```
It's unclear to me why we can't achieve the same thing with Evision without knowing to call Evision.Mat.from_nx_2d
Because in Python there's no representation of `cv::Mat` -- they're basically a wrapped class (`cv::UMat`) around a numpy object instead.
We surely can call these functions implicitly when the input args are `Nx.t()` and when the outputs are `Evision.Mat`. Yet the catch is, in opencv-python the memory is shared between the wrapped class and numpy, while in Erlang we cannot give Erlang an `ErlNifBinary` with mutable data in it, which would fundamentally break the immutability of Erlang (actually, it's more likely to crash the Erlang process).
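To make the implicit-conversion idea concrete, here is a hedged sketch of what a user-level wrapper could do today (the module and function names are hypothetical, not part of Evision's API; each conversion copies the data, which is the cost discussed above):

```elixir
defmodule MyApp.CV do
  # Hypothetical convenience wrapper: accept an Nx tensor,
  # convert it to a 2D mat (last dimension treated as channels),
  # run the OpenCV function, and hand back an Nx tensor.
  def gaussian_blur(%Nx.Tensor{} = t, ksize, sigma) do
    t
    |> Evision.Mat.from_nx_2d()
    |> Evision.gaussianBlur(ksize, sigma)
    |> Evision.Mat.to_nx()
  end
end
```

Generating such wrappers automatically is essentially what "calling these functions implicitly" would mean, minus the zero-copy sharing that opencv-python gets for free.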
Therefore, when converting to an `Nx.t()` (with the native backend), we have to make a copy of the data, and the same goes for the reverse direction. If we want to achieve the same thing as opencv-python and numpy, we would have to make Evision a backend of Nx, which is indeed one thing on the roadmap. (You can use `Evision.Backend` as the backend of an `Nx` tensor.)
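For reference, a minimal sketch of using `Evision.Backend` (assuming evision is installed; note the explicit `:f32` type, since OpenCV cannot represent Nx's default `:s64`):

```elixir
# Allocate the tensor's data in a cv::Mat via Evision.Backend.
t = Nx.tensor([[1.0, 2.0], [3.0, 4.0]], type: :f32, backend: Evision.Backend)

# Nx operations are then dispatched to the backend, subject to
# which callbacks Evision.Backend currently implements.
Nx.add(t, 1.0)
```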
However, unlike PyTorch or XLA, `cv::Mat` in OpenCV does not support some required Nx callbacks out of the box, so we have to write them from scratch. Sadly, I don't have enough time to test and optimise them (maybe we can copy some from PyTorch or XLA).
Besides that, even if we can use `Evision.Backend` for `Nx`, OpenCV's `cv::Mat` doesn't support the `:s64`, `:u32` and `:u64` types, yet they're very commonly seen in today's ML/Deep Learning/AI workflows; moreover, `Nx` by default chooses `:s64` when initialising a tensor. A workaround is described in the Wiki page, but it's far from perfect.
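As a rough illustration of the type issue (a sketch, not necessarily the Wiki's exact workaround): cast a default `:s64` tensor to a type `cv::Mat` supports before conversion, accepting the possible loss of range:

```elixir
t = Nx.iota({4, 4})     # Nx defaults integer tensors to :s64
# cv::Mat has no :s64, so cast to :s32 before handing it to Evision.
mat = t |> Nx.as_type(:s32) |> Evision.Mat.from_nx()
```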
Furthermore, due to the limitations in OpenCV, it's hard to patch the source code to support these types in this project (see https://github.com/cocoa-xu/evision/issues/48#issuecomment-1266282345). It would basically require us to rewrite the `cv::Mat` class.
Thanks for the detailed response @cocoa-xu!
> We surely can call these functions implicitly when the input args are Nx.t() and when the outputs are Evision.Mat
Right, this is the behavior that would be most intuitive.
> Yet the catch is, in opencv-python, the memory is shared between the wrapped class and numpy while in Erlang we cannot give Erlang an ErlNifBinary with mutable data in it, which would fundamentally break the immutability of Erlang (actually it's more likely to crash the Erlang process) ...
After re-reading this a few times, I'm unclear on what conclusion this leads us to. If we call the functions implicitly to mimic the behavior of `Evision.Mat.from_nx_2d`, are you suggesting that we would be breaking the semantics of Erlang? Couldn't Evision literally do that in its generated glue code, or does it lower into the NIF in some fundamental way that I'm not understanding?
Closing this as it should be addressed in #251 ;)
I tried to reproduce the issue below; we are able to run the same code via python/numpy and it works as expected. After chatting with @polvalente, he seems to think it's a bug in Evision.
Gaussian Blur
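For completeness, a reconstruction of the reproduction based on the snippets earlier in this thread (hedged; the original snippet from the issue body is not fully preserved here). Converting with `Evision.Mat.from_nx/1` yields a 3-dimensional, single-channel mat, which `Evision.gaussianBlur/3` rejects as not a valid 2D image:

```elixir
v = Nx.broadcast(Nx.tensor(1.0, type: :f64), {1200, 1920, 3})
# This call fails, because from_nx/1 produces dims: 3, channels: 1;
# from_nx_2d/1 is required to get dims: 2, channels: 3.
Evision.gaussianBlur(Evision.Mat.from_nx(v), {31, 31}, 0)
```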