pangeo-data / rechunker

Disk-to-disk chunk transformation for chunked arrays.
https://rechunker.readthedocs.io/
MIT License

Add options for zarr array definition #48

Closed eric-czech closed 4 years ago

eric-czech commented 4 years ago

Fixes https://github.com/pangeo-data/rechunker/issues/46

This adds options to rechunk that are passed through to the zarr create methods. The only use cases I have in mind are allowing overwrites and setting compression, but it may be useful in other ways. Alternatively, it may eventually be worth lifting those specific parameters into the rechunk signature, since allowing arbitrary options to be passed is a bit of a footgun.
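
For illustration, here is a minimal sketch of the pass-through pattern with a guard against the footgun. Everything in it is hypothetical: `create_array_stub` stands in for a zarr create method, and `build_target`, `validate_options`, `target_options`, and the reserved key list are assumed names, not the code in this PR.

```python
from typing import Any, Dict, Optional

# Assumed for illustration: keys the rechunking routine sets itself,
# so callers should not be able to override them through the options dict.
RESERVED_KEYS = frozenset({"shape", "chunks", "dtype", "store"})


def validate_options(options: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return a copy of the options, rejecting keys the caller must not set."""
    options = dict(options or {})
    clashes = sorted(RESERVED_KEYS.intersection(options))
    if clashes:
        raise ValueError(f"options may not override: {clashes}")
    return options


def create_array_stub(shape, chunks, dtype, **options) -> Dict[str, Any]:
    """Hypothetical stand-in for a zarr create method; records what it was given."""
    return {"shape": shape, "chunks": chunks, "dtype": dtype, **options}


def build_target(shape, chunks, dtype, target_options=None) -> Dict[str, Any]:
    """Forward validated options to the create call (sketch only)."""
    return create_array_stub(shape, chunks, dtype, **validate_options(target_options))


# Options such as overwrite pass straight through to the create call.
target = build_target((100,), (10,), "f8", {"overwrite": True})
```

Rejecting reserved keys keeps the flexibility of an open options dict while preventing callers from contradicting arguments that the rechunking routine computes itself.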

Note: I had to modify one of the existing test fixtures. It did almost what I wanted for a compression test, except that the chunks were too small for compression to take effect. I didn't know zarr behaved that way, but I can't find any other explanation for why specifying a compressor makes no difference when an array is too small. I don't know where the cutoff is; perhaps it is related to compression block sizes? Let me know if anybody understands this better.
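
The general effect can be reproduced outside zarr. The sketch below uses zlib from the Python standard library as a stand-in codec, which is an assumption: it does not confirm the cause or the cutoff for the compressor used in the test fixture. It only shows that a fixed per-buffer overhead can make a tiny chunk no smaller after compression, while a large chunk with the same content shrinks considerably.

```python
import zlib


def compression_ratio(raw: bytes) -> float:
    """Compressed size divided by raw size; values below 1.0 mean the data shrank."""
    return len(zlib.compress(raw)) / len(raw)


# A tiny chunk: the codec's header and checksum outweigh any savings.
small_chunk = bytes(range(8))

# The same pattern repeated many times: plenty of redundancy to exploit.
large_chunk = bytes(range(8)) * 10_000

small_ratio = compression_ratio(small_chunk)
large_ratio = compression_ratio(large_chunk)

print(f"small chunk ratio: {small_ratio:.2f}")
print(f"large chunk ratio: {large_ratio:.4f}")
```

A compression test therefore needs chunks large enough for the codec to pay back its overhead, otherwise the stored size can look the same with or without a compressor.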

codecov[bot] commented 4 years ago

Codecov Report

Merging #48 into master will increase coverage by 0.10%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #48      +/-   ##
==========================================
+ Coverage   94.89%   95.00%   +0.10%     
==========================================
  Files          10       10              
  Lines         392      400       +8     
  Branches       75       78       +3     
==========================================
+ Hits          372      380       +8     
  Misses         10       10              
  Partials       10       10              
Impacted Files           Coverage Δ
rechunker/algorithm.py   82.45% <ø> (ø)
rechunker/api.py         92.53% <100.00%> (+0.47%)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update e454e85...ac03623.

eric-czech commented 4 years ago

Thanks @tomwhite. Let me know if there's anything else I can do to get this merged @TomAugspurger / @rabernat.

TomAugspurger commented 4 years ago

Thanks.

rabernat commented 4 years ago

Thanks for all your contributions @eric-czech. It's great to have you involved in rechunker!

eric-czech commented 4 years ago

Sure thing @rabernat!