Open renanmcosta opened 1 year ago
For now I've managed to fetch with the temporary fix below. I don't think it's very robust, but I'm copying it here in case it's informative.
def read_cell_array(self):
    """deserialize MATLAB cell array"""
    n_dims = self.read_value()
    shape = self.read_value(count=n_dims)
    n_elem = int(np.prod(shape))
    result = [self.read_blob(n_bytes=self.read_value()) for _ in range(n_elem)]
    # if not all elements are scalars. shouldn't work for ragged arrays
    if n_elem != len(np.ravel(result, order="F")):
        shape = (-1,) + tuple(shape[1:n_dims])
    return (
        self.squeeze(
            np.array(result).reshape(shape, order="F"), convert_to_scalar=False
        )
    ).view(MatCell)
Greetings,
I have just encountered the same problem, and the temporary fix seems to work (thanks a lot, @renanmcosta).
The temporary fix returns an array, but with shape = (537000, 2).
In MATLAB it's a 1×2 cell array: {10×5370×10 single} {10×5370×10 single}.
type(temp_fixed) --> datajoint.blob.MatCell
Am I able to retrieve the original dimensions, or is this a robustness problem of the temporary fix?
Thanks in advance
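The reported shape can be reproduced with NumPy alone. The sketch below is an illustration, not DataJoint code: it assumes the two cell entries have equal shapes and mirrors only the reshape branch of the temporary fix. The variable names (`cell_shape`, `entry_shape`, `stacked`, `collapsed`, `recovered`) are made up for this example. It also suggests that, when the entry shapes are known, the same column-major reshape can be inverted to get the stacked entries back.

```python
import numpy as np

# Stand-in for the 1x2 cell array {10x5370x10 single} {10x5370x10 single}.
cell_shape = (1, 2)
entry_shape = (10, 5370, 10)
data = np.arange(2 * 10 * 5370 * 10, dtype=np.float32).reshape((2,) + entry_shape)
result = [data[0], data[1]]  # what reading the two cell entries would yield

# Equal-shaped entries get stacked into one numeric array.
stacked = np.array(result)
print(stacked.shape)  # (2, 10, 5370, 10)

# Same branch as the temporary fix: the element count (2) does not match
# the number of stacked values, so the leading dimension becomes -1.
n_elem = int(np.prod(cell_shape))
shape = cell_shape
if n_elem != stacked.size:
    shape = (-1,) + tuple(cell_shape[1:])
collapsed = stacked.reshape(shape, order="F")
print(collapsed.shape)  # (537000, 2)

# A column-major reshape back to the stacked shape undoes the collapse,
# provided the original entry shape is known.
recovered = collapsed.reshape((2,) + entry_shape, order="F")
print(np.array_equal(recovered[0], result[0]))  # True
```

So the values are not lost, but the (537000, 2) layout mixes both entries, and recovering them depends on knowing the entry shape in advance.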
Hi @Paschas, could you update us on this? We are looking to resolve this.
The temp fix is responsible for the shape differences there. Lately, I have been using a simpler fix, which shouldn't collapse any dimensions. This one should always work, though it can sometimes lead to awkward array nesting.
import datajoint as dj
import numpy as np


def fix_cell_array_fetch():
    """Fixes bug that prevents cell arrays from being fetched in python in certain
    cases. Replaces cell array unpacking method in the datajoint module with working
    version.
    """

    class Blob(dj.blob.Blob):
        def read_cell_array(self):
            """deserialize MATLAB cell array"""
            n_dims = self.read_value()
            shape = self.read_value(count=n_dims)
            n_elem = int(np.prod(shape))
            result = [self.read_blob(n_bytes=self.read_value()) for _ in range(n_elem)]
            return (
                self.squeeze(np.array(result, dtype="object"), convert_to_scalar=False)
            ).view(dj.blob.MatCell)

    # Replace the module-level class so later fetches use the patched method.
    dj.blob.Blob = Blob
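The "awkward array nesting" mentioned above can be seen with plain NumPy: when all entries share a shape, `np.array(..., dtype="object")` builds one multi-dimensional object array rather than one slot per cell. The sketch below shows that behavior next to a possible alternative that fills a preallocated object array and restores the cell shape. `to_object_cells` is a hypothetical helper written for this illustration; it is not part of DataJoint and has not been tested against real blobs.

```python
import numpy as np


def to_object_cells(items, shape):
    """Hypothetical helper: store each item in its own slot of an object array."""
    cells = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        cells[i] = item  # the whole array is stored as a single element
    # MATLAB stores cell elements in column-major order.
    return cells.reshape(shape, order="F")


# Two equal-shaped entries, standing in for a 1x2 cell array.
items = [np.zeros((3, 4), dtype=np.float32), np.ones((3, 4), dtype=np.float32)]

# Equal shapes are merged into one 3-D object array.
nested = np.array(items, dtype="object")
print(nested.shape)  # (2, 3, 4)

# One slot per cell, with the cell array's own shape.
cells = to_object_cells(items, (1, 2))
print(cells.shape)  # (1, 2)
print(cells[0, 1].shape)  # (3, 4)
```

Filling the slots one by one keeps the result independent of whether the entry shapes happen to match, at the cost of an explicit loop.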
Let's see if we can incorporate this in this coming release.
Bug Report
Description
Fetching fails in Python when each entry for a given attribute (defined in MATLAB) is a cell array, and each element of the cell array is an array of doubles. Fetching in MATLAB works as expected.
Reproducibility
Windows, Python 3.9.13, DataJoint 0.13.8
Steps:
epoch_pos_range=null : blob # list of y position ranges corresponding to n epochs in epoch_list, (e.g., {[y_on y_off],[y_on y_off]} for epoch_list {'epoch1','epoch2'})
Error stack: