Closed peterdudfield closed 1 year ago
The suggestion from @jacobbieker is to run `satellite_xr.rechunk({'time': 12, 'variable': 11, 'x': 298, 'y': 615})` and resave it.
What if you try that on Leonardo? Or Donatello? Does it still take that much longer to load than for sum?
What's the location on Leonardo?
/mnt/storage_ssd_8tb/data/ocf/solar_pv_nowcasting/nowcasting_dataset_pipeline/satellite/EUMETSAT/SEVIRI_RSS/zarr/v3/eumetsat_seviri_uk.zarr
/mnt/storage_ssd_4tb/metnet_train/eumetsat_seviri_uk.zarr
on Donatello, /mnt/storage_ssd_8tb/data/ocf/solar_pv_nowcasting/nowcasting_dataset_pipeline/satellite/EUMETSAT/SEVIRI_RSS/zarr/v3/eumetsat_seviri_uk.zarr
on Leonardo
Tried just using `xr.open_zarr("gs://solar-pv-nowcasting-data/satellite/EUMETSAT/SEVIRI_RSS/v3/eumetsat_seviri_uk.zarr")` to isolate the problem; this gets it down to more like 1 second.
It also took 5 seconds on Leonardo for OpenSatellite
It's probably worth checking what slows down OpenSatelliteIterDataPipe, but 1 second still feels way too slow.
0.7 seconds, still seems slow, but better than 5
Yea, 0.7 seconds is not very good for loading one example; a batch will typically be 32 examples, so that adds up quickly, to 22.4 seconds per batch.
There seems to be a difference with
dataset = xr.open_zarr(store=zarr_path)
and
dataset = xr.open_dataset(leonardo_path, engine="zarr", chunks="auto")
The first is much quicker
I'll raise a PR, as this feels like a big enough difference and it's slowing down training. Further speed-ups I'm sure can be made.
I've been creating various test zarrs, all of 3 days (864 timesteps), on Leonardo under /mnt/storage_c/, going through options of chunking along time, spatial chunk size, type of compression, number of channels included, and more. So we should have a better idea soon which one works best for us. They are all being created from the JPEG-XL compressed single timesteps saved on Leonardo, not the raw files, to make it quicker to create them right now.
That sounds great, yeah we should see what size they are, and maybe have a tiny bit of code to see how long one example takes to load. Thanks for doing this @jacobbieker
Yeah, I am getting the sizes now, and was planning on copying the code above with a few changes to see how it affects loading times. But just to compare it to the current ones.
For sizes so far, compressing HRV files, the clear winner is JPEG-XL: it's beating out bz2, zstd, and zlib by quite a lot, without needing to convert the data to ints. It is still being compared to the BitRound, Quantize, and ZFPY compressions too, which are all somewhat lossy, like JPEG-XL. But compared to the lossless ones, for 3 days (864 timesteps) of data, JPEG-XL takes 818MB of space while BZ2 takes 39GB, so it's only 2.1% the size of the worst compression. That would put 1 year of HRV data at ~100GB with JPEG-XL vs 4.75TB, and the whole 2014-2022 dataset at around 38TB for BZ2 vs 800GB for JPEG-XL. Still need to see how the 11 other non-HRV channels differ, and more chunk sizes, but size-wise it is a lot smaller. The best compression in size other than JPEG-XL so far is bz2 with the data as int8s and 12 timesteps per chunk, at 2GB for the 3 days of data, or twice the size of JPEG-XL.
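The extrapolations above can be checked with some quick arithmetic:

```python
# Extrapolating the 3-day (864-timestep) HRV test sizes quoted above.
days_per_year = 365

jpeg_xl_3d_mb = 818      # JPEG-XL, 3 days of HRV
bz2_3d_gb = 39           # BZ2, 3 days of HRV

jpeg_xl_year_gb = jpeg_xl_3d_mb / 1000 * days_per_year / 3   # ~100 GB
bz2_year_tb = bz2_3d_gb / 1000 * days_per_year / 3           # ~4.75 TB

# Full 2014-2022 archive, roughly 8 years:
print(round(jpeg_xl_year_gb * 8), "GB for JPEG-XL")          # ~800 GB
print(round(bz2_year_tb * 8, 1), "TB for BZ2")               # ~38 TB

# Ratio quoted in the thread: JPEG-XL is ~2.1% the size of BZ2.
print(round(jpeg_xl_3d_mb / (bz2_3d_gb * 1000) * 100, 1), "%")
```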
Compressions | Size | Type | Speed |
---|---|---|---|
Current | TODO | TODO | TODO |
JPEG-XL | 0.8 GB | float32 | TODO |
BZ2 | 2 GB | int8 | TODO |
Worth filling this out @jacobbieker
Thanks, I'll fill it out as I go
I'm looking at https://github.com/observingClouds/xbitinfo for the amount of "real information" in the satellite imagery. For keeping 99% of the information, we can BitRound down to 7 bits; for 99.9%/99.99%, we need up to 11 bits. I believe 99% is probably good enough, and I'll try that out.
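For illustration, here's a rough pure-NumPy version of what bitrounding does (the real numcodecs/xbitinfo implementations differ in details such as tie-breaking; this sketch only handles finite float32 input):

```python
import numpy as np

def bitround(data: np.ndarray, keepbits: int) -> np.ndarray:
    """Zero out all but `keepbits` mantissa bits of float32 data,
    rounding to nearest (similar in spirit to numcodecs' BitRound)."""
    assert data.dtype == np.float32 and 0 < keepbits < 23
    drop = 23 - keepbits                    # float32 has 23 mantissa bits
    bits = data.view(np.uint32).copy()
    bits += np.uint32(1 << (drop - 1))      # round to nearest...
    bits &= np.uint32((0xFFFFFFFF << drop) & 0xFFFFFFFF)  # ...then truncate
    return bits.view(np.float32)

# Keeping 7 bits leaves the value close, but with most mantissa bits zeroed
# so it compresses far better:
print(bitround(np.array([3.14159], dtype=np.float32), 7))   # → [3.140625]
```

The zeroed trailing bits are what let a downstream lossless compressor like zstd shrink the data so much.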
The fastest 10 loading times from Leonardo onto Donatello, out of 378 combinations of HRV data. If Bitround is 0.0, no rounding was done. Precision of 8 means int8, 16 is half precision, and 32 is full precision.
For the tradeoff between speed and size, the two winners seem to be JPEG-XL, or ZFP with timestep chunks of 4. ZFP is easier to use and doesn't have JPEG-XL's installation issues (I can't seem to install the JPEG-XL library on Leonardo, as it's not in the Ubuntu or Debian repos). ZFP does require 4x4x4 blocks though, so it is very inefficient for chunks that are not multiples of 4: it pads each such chunk up to a multiple of 4.
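The cost of that 4x4x4 padding is easy to sketch (`zfp_padded_shape` is a hypothetical helper for illustration, not part of ZFP's API):

```python
import math

def zfp_padded_shape(shape, block=4):
    """Pad each dimension up to the next multiple of ZFP's block size."""
    return tuple(math.ceil(s / block) * block for s in shape)

# The 1044 x 1392 spatial chunks used here are already multiples of 4,
# so ZFP adds no padding:
print(zfp_padded_shape((4, 1044, 1392)))   # (4, 1044, 1392)

# A chunk that is not a multiple of 4 gets padded, wasting space:
shape = (5, 1043, 1390)
padded = zfp_padded_shape(shape)
overhead = (math.prod(padded) / math.prod(shape) - 1) * 100
print(padded, f"{overhead:.0f}% padding overhead")
```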
Note on the bitrounding, for the satellite data testing, 9-10 bits keeps 99% of the data and 11 bits keeps 99.99% of the data, so the JPEG-XL version keeps slightly more data than the zfp or zstd compressions.
Speed (sec) | Size | Precision | Algo | Bitround | Timestep chunk | Effort |
---|---|---|---|---|---|---|
0.1533 | 2.4G | 8 | zstd | 0.0 | 12 | 8 |
0.1812 | 2.5G | 8 | zstd | 0.0 | 12 | 4 |
0.1826 | 51M | 32 | jpeg-xl | 11.0 | 1 | 8 |
0.1948 | 51M | 16 | zfp | 10.0 | 4 | 8 |
0.1979 | 51M | 16 | zstd | 10.0 | 4 | 8 |
0.2372 | 2.5G | 8 | zlib | 0.0 | 12 | 8 |
0.2412 | 2.0G | 8 | bz2 | 0.0 | 12 | 8 |
0.2493 | 51M | 16 | zfp | 7.0 | 1 | 8 |
0.2642 | 51M | 16 | zfp | 10.0 | 1 | 8 |
0.2651 | 51M | 16 | jpeg-xl | 7.0 | 1 | 8 |

All rows use a Y chunk of 1044 and an X chunk of 1392.
Based off of this, I would probably recommend that we go with either ZFP or ZSTD for the compression algorithm. ZSTD is what we are using for the NWP data, doesn't need another install like ZFP does, and is not as particular about the 4x4x4 blocks. I would go with bitrounding to 10 bits, using fp16, and timestep chunks of 4 (~20 minutes each), with these larger spatial chunks. Smaller chunking in x and y was also tested, but these larger chunks allow for greater compression and are apparently easier to load.
Thanks for doing all this, really useful to have it written down. Is it right we should go for the first one or the third one? The first one is the fastest, but the third is the smallest? (Amazing difference in size btw.) I'm not quite sure on the trade-offs of ~15% slower to load but a 98% reduction in size.
I think the 3rd one, because it's so much smaller; with the larger one we'd have to transfer so much more data over the network.
For the 8 years of satellite data, this corresponds to about 1.9TB for the first one, and 40GB for the third one
So actually not that big for the larger one; might try just making both? 40GB is incredibly small, and 2TB is pretty small too, relatively.
A downside of the first one as well is that it's int8, so theoretically it's losing more information than the other ones.
40GB sounds ideal - good for sharing too / running local stuff
Yeah, I'll run some more tests and do some plotting of it, to make sure it's not just all rounded down to 0s or something, but then start on that.
Hmmm, I just remade the best-performing ones according to this, and they are much larger: 12GB instead of 51MB. I'm thinking this is because, to speed up testing a bit, for most of these I created one 3-day Zarr with JPEG-XL, then created the other options based off that Zarr. Which means the data was compressed once with JPEG-XL, then compressed again with the bitrounding, etc. But I still need to look into it some more.
The fastest ones still all have the same spatial chunking, and either 4 or 12 timesteps, so I'm just trying it again. But still on track to make the int8 version, which is the fastest anyway, and whose size makes more sense, as that's been more consistent since the beginning.
Ah, it seems the very small ones are nearly all NaNs, which explains how small they get; I'm assuming that's from the multiple lossy compressions. The int8 one should still be fine at least, but no 40GB file.
Also now trying Blosc2, in this little wrapper:
```python
from numcodecs.registry import register_codec
from numcodecs.abc import Codec
from numcodecs.compat import ensure_contiguous_ndarray
import blosc2


class Blosc2(Codec):
    """Codec providing compression using the Blosc2 meta-compressor.

    Parameters
    ----------
    cname : string, optional
        A string naming one of the compression algorithms available
        within Blosc2; only 'zstd' and 'blosc2' (BloscLZ) are mapped here.
    clevel : integer, optional
        An integer between 0 and 9 specifying the compression level.

    See Also
    --------
    numcodecs.zstd.Zstd, numcodecs.lz4.LZ4
    """

    codec_id = 'blosc2'
    max_buffer_size = 2**31 - 1

    def __init__(self, cname='blosc2', clevel=5):
        self.cname = cname
        if cname == "zstd":
            self._codec = blosc2.Codec.ZSTD
        elif cname == "blosc2":
            self._codec = blosc2.Codec.BLOSCLZ
        else:
            # Fail early rather than leaving self._codec unset.
            raise ValueError(f"Unsupported cname: {cname!r}")
        self.clevel = clevel

    def encode(self, buf):
        buf = ensure_contiguous_ndarray(buf, self.max_buffer_size)
        return blosc2.compress(buf, codec=self._codec, clevel=self.clevel)

    def decode(self, buf, out=None):
        buf = ensure_contiguous_ndarray(buf, self.max_buffer_size)
        return blosc2.decompress(buf, out)

    def __repr__(self):
        return '%s(cname=%r, clevel=%r)' % (
            type(self).__name__, self.cname, self.clevel)


register_codec(Blosc2)
```
With Blosc2, the better option of the two codecs above is Zstd. It gives, for 1 day of data, a 3.9GB file with FP16, which works out to around 10TB of HRV data for the last 8 years. If saved as int8, then it's ~2TB for the last 8 years of HRV.
These are some of the best and so far fastest loading ones. They are all still larger than the JPEG-XL version, but are easier to use and quicker to decode and encode.
I think the way forward for this is to create 3 versions of each of HRV and non HRV for now. One would be the int8 version that is smaller and faster to load, one that is JPEG-XL for keeping as much of the original data as possible and easier to share size-wise, and the FP16 one for greater precision, while still being faster to load than JPEG-XL, although much larger. Sound good @peterdudfield @devsjc? The FP16 one might use up a lot of storage_c, but yeah.
The ones other than the JPEG-XL one would chunk in 12 timesteps at a time in time, and in the 1392x1044 spatial chunks, nonHRV would be also chunked all 11 channels together. This should cut down a lot on the number of requests our models need to read from disk.
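Rough arithmetic for the request-count saving (a sketch; it assumes one year of 5-minutely data and one spatial chunk covering the full frame):

```python
import math

def n_chunks(shape, chunks):
    """Number of chunk files a zarr array of `shape` splits into."""
    return math.prod(math.ceil(s / c) for s, c in zip(shape, chunks))

# One year of 5-minutely non-HRV data: ~105120 timesteps, 11 channels.
shape = (105120, 11, 1044, 1392)

together = n_chunks(shape, (12, 11, 1044, 1392))   # all channels in one chunk
separate = n_chunks(shape, (12, 1, 1044, 1392))    # one chunk per channel

print(together, separate)   # 8760 vs 96360 chunk reads
```

Reading a full frame of all 11 channels then touches 1 chunk instead of 11, an 11x reduction in requests.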
Although for some reason, the compression seems to do worse if chunking all 11 channels together, instead of separately.
Could you give the size and load times, like you did above, for these different methods?
Although for some reason, the compression seems to do worse if chunking all 11 channels together, instead of separately.
how does this affect size and loading speed?
For size, with the blosc2 zstd fp32, it's 19.2GB with the 11 channels chunked together, and 13.9GB for individual channels; 10.5GB vs 7.5GB after bitrounding down to 13 bits, and 4.2GB vs 2.8GB for uint8, so between 30 and 50% larger. As for loading speed, I'll add it in with the other ones in a bit, going to do stuff.
Here are some speed outputs from the decoding tests: 8 means uint8, 16 is fp16, 32 is fp32.
Speed: 0.07168388366699219 sec
Size: 4.0G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/hrv_uint16_blosc2_zstd_xbitinfo_t12_round0
Speed: 0.07281303405761719 sec
Size: 813M
Name: /home/jacob/Development/Satip/scripts/test_zarrs/hrv_8_zstd_xbitinfo_t4_c1
Speed: 0.07943344116210938 sec
Size: 3.9G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/hrv_fp16_blosc2_zstd_xbitinfo_t12_round0
Speed: 0.10940337181091309 sec
Size: 6.5G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/hrv_32_zstd_xbitinfo_round_13
Speed: 0.11301827430725098 sec
Size: 6.1G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/hrv_32_blosc2_zstd_xbitinfo_t4_round13
Speed: 0.11818528175354004 sec
Size: 547M
Name: /home/jacob/Development/Satip/scripts/test_zarrs/hrv_jpeg_xl_xbitinfo_y
Speed: 0.12987041473388672 sec
Size: 547M
Name: /home/jacob/Development/Satip/scripts/test_zarrs/hrv_jpeg_xl_xbitinfo_x
Speed: 0.13512945175170898 sec
Size: 2.8G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/hrv_32_zstd_xbitinfo_round_7
Speed: 0.14370512962341309 sec
Size: 8.9G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/hrv_32_blosc2_xbitinfo_t4_round13
Speed: 0.15413403511047363 sec
Size: 6.1G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/hrv_32_blosc2_zstd_xbitinfo_t12_round13
Speed: 0.16738128662109375 sec
Size: 4.0G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/hrv_uint16_blosc2_zstd_xbitinfo_t4_round0
Speed: 0.18398380279541016 sec
Size: 354M
Name: /home/jacob/Development/Satip/scripts/test_zarrs/hrv_32_zfp_tol2_xbitinfo_round7
Speed: 0.18632292747497559 sec
Size: 4.0G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/hrv_fp16_blosc2_zstd_xbitinfo_t4_round0
Speed: 0.19820499420166016 sec
Size: 8.9G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/hrv_32_blosc2_xbitinfo_t12_round13
Speed: 0.22356891632080078 sec
Size: 354M
Name: /home/jacob/Development/Satip/scripts/test_zarrs/hrv_32_zfp_tol2_xbitinfo_round11
Speed: 0.4651339054107666 sec
Size: 63M
Name: /home/jacob/Development/Satip/scripts/test_zarrs/nonhrv_16_jpeg_xl_xbitinfo_variable_spaced_5
Speed: 0.5068938732147217 sec
Size: 4.4G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/nonhrv_8_zstd_xbitinfo_t4_c11
Speed: 0.6310272216796875 sec
Size: 2.8G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/nonhrv_8_zstd_xbitinfo_t12_c1
Speed: 0.9104156494140625 sec
Size: 95M
Name: /home/jacob/Development/Satip/scripts/test_zarrs/nonhrv_jpeg_xl_xbitinfo_variable
Speed: 0.9182889461517334 sec
Size: 364M
Name: /home/jacob/Development/Satip/scripts/test_zarrs/nonhrv_jpeg_xl_xbitinfo_time_spaced_5
Speed: 1.0254340171813965 sec
Size: 5.5G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/nonhrv_16_zstd_xbitinfo_round_7
Speed: 1.143749713897705 sec
Size: 503M
Name: /home/jacob/Development/Satip/scripts/test_zarrs/nonhrv_jpeg_xl_xbitinfo_x_spaced_5
Speed: 2.5826869010925293 sec
Size: 4.4G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/nonhrv_8_zstd_xbitinfo_t12_c11
Speed: 2.6254141330718994 sec
Size: 2.5G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/hrv_32_jpeg-xl_xbitinfo_round_13
Speed: 2.673764944076538 sec
Size: 2.8G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/nonhrv_8_zstd_xbitinfo_t4_c1
Speed: 2.770055055618286 sec
Size: 2.0G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/hrv_32_jpeg-xl_xbitinfo_round_11
Speed: 3.2815258502960205 sec
Size: 1.4G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/hrv_32_jpeg-xl_xbitinfo_round7
Speed: 3.2975013256073 sec
Size: 5.8G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/nonhrv_32_zstd_xbitinfo_round_7
Speed: 3.4220352172851562 sec
Size: 2.9G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/hrv_32_jpeg-xl_xbitinfo_round11
Speed: 3.773158073425293 sec
Size: 4.0G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/hrv_32_jpeg-xl_xbitinfo_round13
Speed: 4.655216217041016 sec
Size: 13G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/nonhrv_32_zstd_xbitinfo_round_13
Speed: 10.474880695343018 sec
Size: 1018M
Name: /home/jacob/Development/Satip/scripts/test_zarrs/hrv_16_jpeg-xl_xbitinfo_round_7
Speed: 29.518927812576294 sec
Size: 5.7G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/nonhrv_32_jpeg-xl_xbitinfo_round_13
Speed: 30.123006105422974 sec
Size: 4.3G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/nonhrv_32_jpeg-xl_xbitinfo_round_11
Speed: 45.913793325424194 sec
Size: 1.9G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/nonhrv_16_jpeg-xl_xbitinfo_round_7
Speed: 48.57991552352905 sec
Size: 3.4G
Name: /home/jacob/Development/Satip/scripts/test_zarrs/nonhrv_16_jpeg-xl_xbitinfo_round_10
The non-HRV is also being created again on Donatello, as the lossless data is quite large, so the non-HRV values might change a bit (probably a bit larger and slower to access, going by the differences between the ones created from lossless HRV and the lossy one).
I'm struggling a bit with lots of numbers here. Could we get something that summarizes the difference in speed and size with the different methods? Something like this for me would be very useful:
Compressions | Size | Type | Speed |
---|---|---|---|
Current | TODO | TODO | TODO |
JPEG-XL | 0.8 GB | TODO | TODO |
BZ2 | 2 GB | int8 | TODO |
Yep, can do that
feel free to add other rows if needed
Here it is for the HRV ones; seems like I might need to try ZFP a bit more. All of them are faster than the current compression at least.
Compression | Speed | Size | Type | Bitround | Timestep Chunk | Channel Chunk |
---|---|---|---|---|---|---|
JPEG-XL | 3.28152 | 1.4GB | FP32 | 7 | 1 | 1 |
JPEG-XL | 3.42203 | 2.9GB | FP32 | 11 | 1 | 1 |
JPEG-XL | 3.77315 | 4.0GB | FP32 | 13 | 1 | 1 |
Blosc2 ZSTD | 0.07168 | 4GB | Uint16 | None | 12 | 1 |
Blosc2 ZSTD | 0.07943 | 3.9GB | FP16 | None | 12 | 1 |
ZSTD | 0.10940 | 6.5GB | FP32 | 13 | 12 | 1 |
Blosc2 ZSTD | 0.11301 | 6.1GB | FP32 | 13 | 4 | 1 |
ZSTD | 0.13513 | 2.8GB | FP32 | 7 | 4 | 1 |
Blosc2 | 0.143705 | 8.9GB | FP32 | 13 | 4 | 1 |
Blosc2 | 0.154134 | 6.1GB | FP32 | 13 | 12 | 1 |
Blosc2 ZSTD | 0.167381 | 4.0GB | Uint16 | None | 4 | 1 |
Blosc2 ZSTD | 0.186322 | 4.0GB | FP16 | None | 4 | 1 |
ZFP Tolerance=2 | 0.223568 | 354MB | FP32 | 11 | 4 | 1 |
Current | 4.399742 | - | int16 | None | 1 | 1 |
The ZFP is the most lossy, so I would need to check more how much error there is, but these ones did all return non-NaN results, so they actually did return data.
ZFP actually seems to be making everything the same value, not really sure why, so I'm disregarding that for now. In lossless mode, ZFP creates huge (>30GB) files for this single day, so space-wise it isn't really feasible. Edit: I think it's the tolerance; it can round things to the same value, I guess?
More results for just the Blosc2 ZSTD compression:
Compression | Speed | Size | Type | Bitround | Timestep Chunk | clevel |
---|---|---|---|---|---|---|
Blosc Zstd | 0.072362 | 825MB | Uint8 | None | 12 | 5 |
Blosc Zstd | 0.18395 | 13GB | FP32 | None | 4 | 5 |
Blosc Zstd | 0.273139 | 12GB | FP32 | None | 4 | 9 |
Blosc Zstd | 0.336419 | 5.9GB | FP32 | 13 | 4 | 9 |
Blosc Zstd | 0.398397 | 4.0GB | FP16 | None | 12 | 5 |
Blosc Zstd | 0.59300 | 827MB | Uint8 | None | 4 | 5 |
Blosc Zstd | 0.81063 | 4.1GB | Uint16 | None | 12 | 9 |
Blosc Zstd | 1.2086 | 6.2GB | FP32 | 13 | 4 | 5 |
Blosc Zstd | 1.240698 | 6.2GB | FP32 | 13 | 12 | 5 |
Blosc Zstd | 1.69135 | 4.1GB | Uint16 | None | 4 | 9 |
Blosc Zstd | 1.86772 | 12GB | FP32 | None | 12 | 9 |
Blosc Zstd | 1.970765 | 4.0GB | FP16 | None | 4 | 5 |
Blosc Zstd | 2.0007 | 5.9GB | FP32 | 13 | 12 | 9 |
Blosc Zstd | 2.53900 | 13GB | FP32 | None | 12 | 5 |
Another set of outputs from the Blosc2 ZSTD testing, this time removing any results where all the values are the same (indicating the compression is too aggressive and makes the data useless) or where any NaNs are in the data (as the loaded data shouldn't have any NaNs):
Compression | Speed | Size | Type | Bitround | Timestep Chunk | clevel |
---|---|---|---|---|---|---|
Blosc Zstd | 0.079500 | 4.0GB | FP16 | None | 12 | 5 |
Blosc Zstd | 0.087661 | 3.9GB | Uint16 | None | 12 | 9 |
Blosc Zstd | 0.091952 | 3.7GB | FP16 | None | 12 | 9 |
Blosc Zstd | 0.13554 | 5.0GB | FP32 | 11 | 4 | 5 |
Blosc Zstd | 0.136107 | 4.8GB | FP32 | 11 | 4 | 8 |
Blosc Zstd | 0.145195 | 13GB | FP32 | None | 4 | 5 |
Blosc Zstd | 0.146209 | 12GB | FP32 | None | 4 | 9 |
Blosc Zstd | 0.160394 | 4.8GB | FP32 | 11 | 12 | 8 |
Blosc Zstd | 0.167418 | 4.5GB | FP32 | 11 | 12 | 9 |
Blosc Zstd | 0.16927 | 4.0GB | FP16 | None | 4 | 5 |
Blosc Zstd | 0.18170 | 6.2GB | FP32 | 13 | 12 | 5 |
Blosc Zstd | 0.20448 | 12GB | FP32 | None | 12 | 9 |
Blosc Zstd | 0.20723 | 5.0GB | FP32 | 11 | 12 | 5 |
Blosc Zstd | 0.21482 | 3.7GB | FP16 | None | 4 | 9 |
I think the best option from this is the FP16, 12-timestep-chunk compression. clevel=5 would result in a 10.5TB HRV 8-year dataset, while clevel=9 would give a 9.75TB dataset. clevel=5 is a lot faster to write and roughly 14% faster to read, so the small tradeoff in size is probably worth it. The space difference might be more pronounced in the non-HRV data though, so I do need to check that when I can.
The HRV and non-HRV Zarrs are being created now, using Donatello to create them on Leonardo. They are being built using this script: https://github.com/openclimatefix/Satip/blob/main/scripts/read_and_combine_satellite.py saving them out as FP16, clevel 5, 12 timesteps per chunk. Each chunk is between 70-140MB on disk for the non-HRV. Each Zarr will cover a single year of data.
Because there are missing timesteps, the most likely next step after creating these ones is to download and append the missing timesteps to the zarrs. This might mean, though, that the 12 timestep chunks contain timesteps that are not near each other, and so might require one more creation pass where, after the missing data is added and the timesteps sorted in order, a new zarr is created.
Describe the bug
Loading one example of the satellite data is slow.
To Reproduce
Expected behavior
Should happen in < 1 second.
Additional context
Chunk size is