Saksham20 closed this issue 3 years ago
Yes, DatasetView(d1).lazy_transpose([2,0,1])[:].shape would read the whole dataset into memory.
Can you share more about your setup? I'm getting the correct shape for:
d1 = f.create_dataset(data=np.random.rand(2,3,3), name="dataset")
print(DatasetView(d1).lazy_transpose([2,0,1]).shape)  # output is 3,2,3
If you lazy-slice the whole dataset and then call shape, it should not load the dataset. This is the behavior of h5py.Dataset too. Except for the attributes that read the data, mainly [] without lazy_slice, and dsetread(), DatasetView should generally access attributes without loading the data, similar to h5py.Dataset.
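To illustrate the point about metadata-only attributes, here is a minimal sketch of a view that reports its transposed shape without reading anything. The class LazyTransposeView and the fake_loader callable are hypothetical names made up for this example; this is not the lazy_ops implementation, only one way the idea could work.

```python
class LazyTransposeView:
    """Hypothetical stand-in for a lazy dataset view (not the lazy_ops code)."""

    def __init__(self, source_shape, loader, axis_order=None):
        self._source_shape = tuple(source_shape)
        self._loader = loader  # callable that would read the data from disk
        if axis_order is None:
            axis_order = range(len(self._source_shape))
        self._axis_order = tuple(axis_order)

    @property
    def shape(self):
        # Derived from the stored shape and axis order; the loader is never called.
        return tuple(self._source_shape[axis] for axis in self._axis_order)

    def lazy_transpose(self, axis_order):
        # Compose the requested order with the current one; still no read.
        composed = tuple(self._axis_order[axis] for axis in axis_order)
        return LazyTransposeView(self._source_shape, self._loader, composed)


reads = []  # records every time the (fake) data would be loaded


def fake_loader():
    reads.append(1)
    return None


view = LazyTransposeView((2, 3, 3), fake_loader).lazy_transpose([2, 0, 1])
print(view.shape)   # (3, 2, 3)
print(len(reads))   # 0, nothing was loaded
```

In this sketch only an explicit read (the equivalent of [:] or dsetread()) would ever call the loader, which matches the behavior described above.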
I think by "slicing the whole dataset" @Saksham20 means using [:] before the .shape. Is that right, Saksham?
@d-sot is that true of the latest release, or the master branch?
It is true of the master branch. The release package is now updated.
@Saksham20 upgrading to the new release should fix this issue.
> I think by "slicing the whole dataset" @Saksham20 means using [:] before the .shape. Is that right, Saksham?
Yes
> @Saksham20 upgrading to the new release should fix this issue.
I'm actually getting an error loading numcodecs, which is a requirement of zarr, which is a requirement of lazy_ops.
I tried pip installing it separately but got the same error.
The error is too long to paste here, but pip is unable to build the package from the numcodec.tar.gz file: building the wheel fails, then it tries to build the package from the setup file and fails yet again.
The workaround is that I downloaded a .whl file from here and ran pip install on it. Having done this, it works as you say. (I have Python 3.7.)
@bendichter this is similar to the workaround for the sima installation error that I had. Maybe it's a Python version issue again.
@Saksham20 hmm, it looks like numcodecs is compatible with the latest versions of Python in their tests, but I don't know if they have released this. @d-sot, can you restructure this so that zarr is not required? Similar to the way SpikeInterface does this?
zarr dependency is not currently required. PR #20.
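For readers wondering what "not required" can look like in practice, here is a generic sketch of the optional-import pattern. It is not taken from PR #20 or from SpikeInterface; the helper names optional_import and require_zarr are made up for illustration.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# Attempt the import once; a missing package does not break the rest of the library.
zarr = optional_import("zarr")


def require_zarr():
    """Call this only inside features that actually need zarr."""
    if zarr is None:
        raise ImportError("zarr is needed for this feature but is not installed")
    return zarr
```

With this layout, users who only work with h5py datasets never hit the numcodecs build problem, and the error appears only when a zarr-specific feature is used.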
h5py dataset:
h5py_dataset.shape # output = 2,3,3
After lazy_transpose I get:
DatasetView(d1).lazy_transpose([2,0,1]).shape # output = 2,3,3 / should be 3,2,3
Doing this gives the correct dim:
DatasetView(d1).lazy_transpose([2,0,1])[:].shape # output = 3,2,3
Also, if I slice the whole dataset and then call shape, will that load the whole dataset into memory?