Closed xiaor2 closed 1 year ago
What is the chunk shape of your array? (You can find this in the .zarray metadata under chunks.) Zarr operates at the "chunk" level, so the most granular requests it can make are for individual chunks. It will load whatever chunks are necessary to complete your desired selection (and cannot load partial data from within chunks). You can get more fine-grained control over how requests are made in the client by chunking your data differently, to optimize for the type of access you intend to make.
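To make the chunk-level behaviour concrete, here is a minimal sketch (not zarr.js code) that lists which chunks a selection overlaps. The helper name chunkKeysForSelection and the row index 5000 are hypothetical; the "." separator follows the default Zarr v2 chunk key layout, and the shapes are the ones mentioned in this thread.

```javascript
// List the chunk keys that a selection overlaps. Every listed chunk has to be
// fetched in full, even if only a few of its elements are needed.
// `selection` holds one [start, stop) pair per dimension (stop is exclusive).
function chunkKeysForSelection(chunks, selection) {
  // Per-dimension list of chunk indices overlapped by [start, stop).
  const perDim = selection.map(([start, stop], dim) => {
    const first = Math.floor(start / chunks[dim]);
    const last = Math.floor((stop - 1) / chunks[dim]);
    const indices = [];
    for (let i = first; i <= last; i++) indices.push(i);
    return indices;
  });
  // Cartesian product of the per-dimension indices (last dimension varies fastest).
  let keys = [[]];
  for (const indices of perDim) {
    keys = keys.flatMap((prefix) => indices.map((i) => [...prefix, i]));
  }
  // Zarr v2 chunk keys join the indices with "." by default.
  return keys.map((k) => k.join("."));
}

const row = 5000; // hypothetical row index
const keys = chunkKeysForSelection(
  [3, 3344, 3344], // chunk shape
  [[0, 1], [row, row + 1], [0, 52411]] // one (1, 1, 52411) view
);
console.log(keys.length, keys[0], keys[keys.length - 1]);
```

With a (3, 3344, 3344) chunk shape, a single-row selection still spans every chunk along the last dimension, which is why the request is far larger than the 52411 values it returns.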
The chunk shape is (3, 3344, 3344). In my case, what chunk shape would you suggest?
This is really hard to suggest without knowing your data or doing any benchmarking. From the single query above, it seems like you want to be able to access an individual (1, 1, 52411) view of this cube. Will you make repeated accesses of this shape, or do you intend to load differently shaped views? When deciding on a chunk shape, think of the queries you are likely to make most often and chunk in a way that benefits that type of request.
For now, I will only use this shape, (1, 1, 52411). Is (1, 1, 52411) the best chunk shape?
That chunk shape would make sense, especially if the (1, 1, 52411) views are accessed somewhat uniformly at random (i.e., two nearby chunks aren't likely to be accessed together). It's probably worth reading the zarr documentation on chunk size and shape to learn more.
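As a rough way to compare candidate chunk shapes for this access pattern, the sketch below estimates what one (1, 1, 52411) read costs under each shape. It assumes 4-byte elements (the dtype is not stated in this thread) and reports uncompressed sizes, so actual transfer sizes will differ; rowReadCost is a hypothetical helper, not part of zarr.js.

```javascript
// Assumption: float32 data (4 bytes per element); the thread does not state the dtype.
const BYTES_PER_ELEMENT = 4;

// Estimate the cost of reading one full row along the last dimension,
// i.e. a (1, 1, rowLength) selection, for a given chunk shape.
function rowReadCost(chunks, rowLength) {
  // One row falls in a single chunk along the first two dimensions, so only
  // the last dimension decides how many chunks are touched.
  const chunksTouched = Math.ceil(rowLength / chunks[2]);
  const chunkElements = chunks[0] * chunks[1] * chunks[2];
  const chunkBytes = chunkElements * BYTES_PER_ELEMENT; // uncompressed
  return {
    chunksTouched,
    chunkBytes,
    bytesFetched: chunksTouched * chunkBytes,
    // How many elements are loaded per element actually requested.
    amplification: (chunksTouched * chunkElements) / rowLength,
  };
}

const candidates = [
  [3, 3344, 3344],
  [1, 100, 52411],
  [1, 1, 52411],
];
for (const chunks of candidates) {
  console.log(chunks.join(" x "), rowReadCost(chunks, 52411));
}
```

A chunk shape that matches the query exactly avoids loading unused data, while a somewhat larger chunk trades some wasted transfer for fewer, bigger objects in the store; benchmarking on the real data is the only way to settle which matters more.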
Thank you! I used (1, 100, 52411) and it works well.
But I have encountered another problem. I have five arrays with a shape of (3, 52411, 52411). They all have the same chunk size, (1, 100, 52411), and I saved them on AWS. When I use openArray to get them, four of them have the correct chunk size, but one of them has a chunk size of (3, 3344, 3344). The .zarray metadata of all five arrays is the same, so it doesn't make sense for one of them to have (3, 3344, 3344). Is there a potential mistake causing this?
Perhaps try clearing your browser cache. Sometimes there can be issues with updating data on s3, but this shouldn't be a zarr.js issue.
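One way to check whether the browser is serving a stale copy of the metadata is to request the .zarray file while bypassing the HTTP cache and look at its chunks field. This is only a sketch using the standard fetch API, not a zarr.js feature; the helper names and the store URL in the usage comment are hypothetical.

```javascript
// Build the URL of an array's .zarray metadata file inside a Zarr v2 store.
function zarrayUrl(storeUrl, arrayPath) {
  const base = storeUrl.replace(/\/+$/, ""); // drop trailing slashes
  const path = arrayPath.replace(/^\/+|\/+$/g, ""); // drop leading/trailing slashes
  return path ? `${base}/${path}/.zarray` : `${base}/.zarray`;
}

// Fetch the metadata without using the browser's HTTP cache and return the
// chunk shape recorded in it.
async function fetchChunkShape(storeUrl, arrayPath) {
  const response = await fetch(zarrayUrl(storeUrl, arrayPath), {
    cache: "no-store", // standard fetch option: skip the HTTP cache entirely
  });
  if (!response.ok) {
    throw new Error(`Request failed with HTTP ${response.status}`);
  }
  const metadata = await response.json();
  return metadata.chunks;
}

// Usage (hypothetical store URL and array name):
// fetchChunkShape("https://example-bucket.s3.amazonaws.com/data.zarr", "SOA")
//   .then((chunks) => console.log(chunks));
```

If the chunk shape returned this way differs from what the opened array reports, a cached .zarray is the likely cause.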
Yes, that solves my problem! Appreciate your suggestion!!
Hi, I tried to load an array of shape (3, 52411, 52411) from AWS S3. And I used a filter in the get function.
data = SOA.get([0, id, slice(null, 52411)]).then(async data => await data.data)
Because I used a filter, I should get a one-dimensional array of length 52411. I expected this to save time, since it loads a much smaller array. However, it takes the same time as loading the whole array. Is there any way to speed this up?