Open Alvalunasan opened 1 year ago
Thank you for submitting this @Alvalunasan. I think I understand the problem.
Hi @dimitri-yatsenko, in brodylab this error has resurfaced with the new DataJoint integration. Is it possible to get this merged (or the bug fixed some other way)?
Thank you very much for your help
Hi @dimitri-yatsenko, a note that this is not restricted to old MATLAB versions: it was happening with a recent MATLAB, on data from August 2023. Maybe on newer data too; I haven't checked that yet.
Would it be appropriate to merge Alvaro's patch?
@Alvalunasan updates his patch and suggests replacing lines 495-499 with
    sizes_array = [x.size for x in result]
    sum_sizes = sum(sizes_array)
    if n_elem == 0:
        return np.array(np.empty(0)).view(MatCell)
    elif sum_sizes == 0:
        return (self.squeeze(np.array(np.empty(shape, dtype=type(result[0]))), convert_to_scalar=False)).view(MatCell)
    else:
        return (self.squeeze(np.array(result).reshape(shape, order="F"), convert_to_scalar=False)).view(MatCell)
ok, will incorporate asap. We are starting to work on a new release. Thanks.
See next comment
@dimitri-yatsenko , @carlosbrody
New corner case for the function (Cell Matrix reading, instead of only nx1 vectors):
    def read_cell_array(self):
        """deserialize MATLAB cell array"""
        load_as_object = False
        n_dims = self.read_value()
        shape = self.read_value(count=n_dims)
        n_elem = int(np.prod(shape))
        result = [self.read_blob(n_bytes=self.read_value()) for _ in range(n_elem)]
        # If it is a matrix (and not an nx1 vector), load as object
        if np.sum(shape > 1) > 1:
            load_as_object = True
        # Check size for each element (the vector could contain empty elements)
        if n_elem > 0:
            # If the elements of result are arrays, tuples or lists, load as object (unless all are empty)
            if isinstance(result[0], np.ndarray):
                sizes_array = [x.size for x in result]
                sum_sizes = sum(sizes_array)
                load_as_object = True
            elif isinstance(result[0], tuple) or isinstance(result[0], list):
                sizes_array = [len(x) for x in result]
                sum_sizes = sum(sizes_array)
                load_as_object = True
            else:
                sum_sizes = n_elem
        # If no trials in array
        if n_elem == 0:
            return np.array(np.empty(0)).view(MatCell)
        # If all trials contain "empty" data
        elif sum_sizes == 0:
            return (self.squeeze(np.array(np.empty(shape, dtype=type(result[0]))), convert_to_scalar=False)).view(MatCell)
        # If some trials contain data and others contain "empty" data
        elif sum_sizes != n_elem or load_as_object:
            return (self.squeeze(np.array(result, dtype='object').reshape(shape, order="F"), convert_to_scalar=False)).view(MatCell)
        # Regular case, all trials contain data
        else:
            return (self.squeeze(np.array(result).reshape(shape, order="F"), convert_to_scalar=False)).view(MatCell)
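For anyone who wants to experiment with the packing step outside of blob.py, here is a minimal standalone sketch. It is not the patch above and not DataJoint code: `pack_cells` is a hypothetical helper, it skips `self.squeeze` and `MatCell`, and it swaps in a different technique (allocate an object array first, then fill it slot by slot) so that NumPy never tries to merge the elements into one multi-dimensional array. Unlike the patch, it always returns an object array.

```python
import numpy as np


def pack_cells(elements, shape):
    """Hypothetical helper: pack a flat list of cell elements into an object array.

    `elements` is assumed to be in MATLAB (column-major) order, so the
    final reshape uses order="F", as in the patch.
    """
    n_elem = int(np.prod(shape))
    if n_elem == 0:
        # No cells at all
        return np.empty(0, dtype=object)
    # Allocate first, then assign one slot at a time; each slot stores
    # the element itself, whether it is empty, an array, or a scalar.
    flat = np.empty(n_elem, dtype=object)
    for i, item in enumerate(elements):
        flat[i] = item
    return flat.reshape(shape, order="F")


# 3x1 cell of empty items (the case from the bug report)
all_empty = pack_cells([np.empty((0, 0)) for _ in range(3)], (3, 1))

# 3x1 cell mixing data and empty items
mixed = pack_cells([np.array([1, 2]), np.empty((0, 0)), np.array([3, 4])], (3, 1))

# 2x2 cell matrix of scalars, filled column by column
matrix = pack_cells([1, 2, 3, 4], (2, 2))
```

The three calls at the end cover the all-empty, mixed, and cell-matrix cases discussed in this thread.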
Excellent. I am traveling until next week and will work on this when I return. Thank you so much for this solution.
I understand this was resolved and traced to a 32-bit compilation of mym in MATLAB. Can we close this, @Alvalunasan?
Hi @Alvalunasan @carlosbrody
I just reviewed the blob code and found that we have a setting to switch to 32-bit length encoding. You can turn it on with
dj.blob.use_32bit_dims = True
This should fix your issue. Please try and let me know. I will close this issue then.
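As a rough illustration of why the integer width matters, the sketch below uses plain NumPy only. It is not DataJoint's actual wire format; the byte layout is an assumption made up for the example. It writes the dimensions of a 3x1 cell as 32-bit integers and then reads the same bytes back with both widths.

```python
import numpy as np

# Assumed layout for illustration only: dims (3, 1) written as
# little-endian 32-bit unsigned integers by the producer.
dims_bytes = np.array([3, 1], dtype="<u4").tobytes()

# Reader that expects 32-bit dims recovers the shape.
as_32 = np.frombuffer(dims_bytes, dtype="<u4")

# Reader that expects 64-bit dims fuses both values into one huge number.
as_64 = np.frombuffer(dims_bytes, dtype="<u8")

print(as_32.tolist())  # [3, 1]
print(as_64.tolist())  # [4294967299], i.e. 3 + 1 * 2**32
```

A reader and writer that disagree on the width will therefore disagree on shapes and element counts, which is consistent with the problem being traced to a 32-bit build.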
Bug Report
Description
An empty cell array inserted with mym (MATLAB) fails to be read in DataJoint Python.
Reproducibility
I have a corner case when reading some special blobs in DataJoint Python that were stored with mym in MATLAB. Here is the type of blob stored in the DB, as read in MATLAB:
As you can see, what is stored in a part of the blob is a 3x1 cell array composed of empty items:
When trying to read this data in Python, I got this error:
I have “patched” the read_cell_array function in blob.py with:
This just adds the case where the size of the array is zero (a NumPy array's size is 0 if it is filled with empty arrays). Probably not the cleanest way to do it.
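The size-zero behaviour can be reproduced with NumPy alone. This is only a sketch of the mechanism, not the blob.py code path, and it assumes each empty MATLAB item is decoded as a 0x0 array:

```python
import numpy as np

# Three empty items, as in a 3x1 cell array of empties
# (assumption: each one decodes to a 0x0 array).
empties = [np.empty((0, 0)) for _ in range(3)]

# NumPy stacks equally shaped arrays into one array of shape (3, 0, 0),
# which holds zero elements in total.
stacked = np.array(empties)
print(stacked.shape, stacked.size)  # (3, 0, 0) 0

# An array with 0 elements cannot be reshaped into 3x1 (3 elements).
try:
    stacked.reshape((3, 1), order="F")
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)  # True
```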
Expected Behavior
To get something similar to this when reading this kind of blob: