Closed simonpf closed 10 months ago
This is not a big issue, but it is worth documenting: if Zarr is chosen as the output format and a file is re-processed and saved to the same path, the following is observed when the same command is run twice:
$ ccic process /mnt/data_copper2/ccic/models/ccic_v2.pckl cpcir /home/amell/tmp 2014-01-16T02:30:00 --roi 128.68 -12.45 132.46 -9.83 --targets tiwp tiwp_fpavg tiwc cloud_type cloud_prob_2d cloud_prob_3d --n_processes 1 --output_format zarr
Standard output (first run):
/home/amell/pansat_ccic/pansat/pansat/time.py:19: UserWarning: Discarding nonzero nanoseconds in conversion.
return pd.Timestamp(time).to_pydatetime()
$ ccic process /mnt/data_copper2/ccic/models/ccic_v2.pckl cpcir /home/amell/tmp 2014-01-16T02:30:00 --roi 128.68 -12.45 132.46 -9.83 --targets tiwp tiwp_fpavg tiwc cloud_type cloud_prob_2d cloud_prob_3d --n_processes 1 --output_format zarr
Standard output (second run, same command and output path):
/home/amell/pansat_ccic/pansat/pansat/time.py:19: UserWarning: Discarding nonzero nanoseconds in conversion.
return pd.Timestamp(time).to_pydatetime()
/home/amell/ccic/ccic/bin/process.py (ERROR ) :: path '' contains a group
Traceback (most recent call last):
  File "/home/amell/ccic/ccic/bin/process.py", line 269, in write_output
    data.to_zarr(output_path / output_file, encoding=encodings)
  File "/home/amell/miniconda3/envs/ccic/lib/python3.8/site-packages/xarray/core/dataset.py", line 2068, in to_zarr
    return to_zarr(  # type: ignore
  File "/home/amell/miniconda3/envs/ccic/lib/python3.8/site-packages/xarray/backends/api.py", line 1613, in to_zarr
    zstore = backends.ZarrStore.open_group(
  File "/home/amell/miniconda3/envs/ccic/lib/python3.8/site-packages/xarray/backends/zarr.py", line 409, in open_group
    zarr_group = zarr.open_group(store, **open_kwargs)
  File "/home/amell/miniconda3/envs/ccic/lib/python3.8/site-packages/zarr/hierarchy.py", line 1458, in open_group
    raise ContainsGroupError(path)
zarr.errors.ContainsGroupError: path '' contains a group
This does not happen when choosing `--output_format netcdf`.
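A possible workaround, sketched here under assumptions: the traceback shows `data.to_zarr(output_path / output_file, encoding=encodings)` being called without a `mode` argument, so xarray refuses to write into a store that already exists. One option is to pass `mode="w"` to `to_zarr`, which overwrites an existing store. Another is to remove the old store before writing. The helper below illustrates the second option using only the standard library; the name `clear_existing_store` is hypothetical and not part of `ccic`.

```python
import shutil
from pathlib import Path


def clear_existing_store(store_path) -> bool:
    """Remove a pre-existing output at store_path, if any.

    A Zarr store on disk is a directory, while a NetCDF output is a
    single file, so both cases are handled.

    Returns True if something was removed, False otherwise.
    """
    store_path = Path(store_path)
    if store_path.is_dir():
        # Zarr directory store: delete the whole tree.
        shutil.rmtree(store_path)
        return True
    if store_path.exists():
        # Regular file (e.g. a NetCDF output).
        store_path.unlink()
        return True
    return False
```

In `write_output`, this could be called as `clear_existing_store(output_path / output_file)` right before `data.to_zarr(...)`. Whether silently overwriting is the desired behaviour, or whether it should be opt-in via a command-line flag, is a design choice left open here.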
I started the processing for global retrievals using the targets `tiwp` and `cloud_prob_2d` as well as the argument `--inpainted_mask`. Using the latter increases the file size by only about 1% for one full year. The years 2013, 2014, ~~2010~~, and 2015 will be processed in this order.
Produce IWP retrievals. Time ranges where validation data is available should have priority.
Variables: `tiwp`, `cloud_prob_2d`, `inpainted`