keller-mark / pizzarr

Slice into Zarr arrays in R 🍕
https://keller-mark.github.io/pizzarr/
Other
29 stars 3 forks source link

Compatibility with xarray for #18 and R CMD check cleanup #85

Closed dblodgett-usgs closed 5 months ago

dblodgett-usgs commented 5 months ago

Likely some other checks, I'll comment in #18 with what I checked so far.

dblodgett-usgs commented 5 months ago

pizzar_chain copy

Here's a social media sized logo to add to the repository in settings @keller-mark

dblodgett-usgs commented 5 months ago

Been doing a whole bunch of testing and investigation of the array and nested array internals. I uncovered a handful of bugs in the process.

  1. fixed an issue with the zarr version "2" being "[2]" which stopped zarr-python from reading pizzarr zarr files.
  2. added code to deal with step size and added testing for step size selections
  3. got direct R array values working in NestedArray$set() and tested (this code appeared to be very lightly tested if at all?)
  4. added some inline documentation for selection and indexing
  5. rewrote is_scalar() to be what I think you mean it to be?!? JKLOL -- I need to learn more about the use of jsonlite::unbox and the scalar class.
  6. got selection working and tested with ":" and integers.
  7. contributed the vignette I've been working on to organize my understanding of all this stuff.

Open questions in test-nested-array.R:

  1. Should the set method of NestedArray support integer and "..."/":" indexing?
  2. Should the set method of NestedArray update the parent array ?
  # Should this be the case?!?
  expect_error(a[1:10, 1:20]$set("...", new_vals), 
               "selection must be a list of slices")
a[1:10, 1:20]$set(list(slice(1,10), slice(1,20)), new_vals)

  # it does not update the original array -- should it?
  expect_equal(a[1:10, 1:20]$as.array(),
               vals)
dblodgett-usgs commented 5 months ago

This latest commit gets the tests passing, but the handling of is_scalar vs is_integer is maybe not 100% -- very much worth a review on those details.

dblodgett-usgs commented 5 months ago

With my latest commits, I see no errors, no warnings, and one note in check. The note is the dog.ome.zarr size issue noted in #87.

keller-mark commented 5 months ago

Should the set method of NestedArray support integer and "..."/":" indexing? Should the set method of NestedArray update the parent array ?

Good point, this could get confusing, especially when using the bracket access notation on the ZarrArray since it may be less obvious that what you get back is a NestedArray. We could check how the analogous pattern in Python behaves and then follow that. Alternatively, maybe we need a mechanism for the user to opt-in or opt-out of updating the parent (might be useful/needed in cases where the Store is read-only). Another option may be to try to get away from the NestedArray by returning plain arrays in more cases.

keller-mark commented 5 months ago

the handling of is_scalar vs is_integer is maybe not 100%

Good point, these and is_integer_vec should to be renamed to be more clear what they are doing. I will do that and add some more comments/tests. Luckily I think they only need to be used in a handful of cases.