i4Ds / Karabo-Pipeline

The Karabo Pipeline can be used as Digital Twin for SKA
https://i4ds.github.io/Karabo-Pipeline/
MIT License

allow xarray > 2023.2 #542

Open Lukas113 opened 8 months ago

Lukas113 commented 8 months ago

While running source_detection.ipynb for the imaging part of Karabo, I encountered the following error with xarray 2023.11 (see traceback). After I downgraded xarray to 2023.2 (the release before the blame commit that raises the ValueError), the error disappeared and everything worked fine. Because the error is raised inside RASCIL and ska_sdp_func_python (see traceback), I think fixing it is out of my control. If my assessment is correct, we should handle it as follows:

First, just constrain xarray to be <= 2023.2. However, this also constrains pandas to be < 2 (according to the conda resolver). Because both libraries are widely used, we should address this issue properly, which might require some effort. RASCIL hasn't constrained its ska_sdp_func_python & ska_sdp_func versions, so I'm concerned that the latest RASCIL release, 1.1.0 (April 2023), might not be compatible with the (hopefully fixed) major release of ska-sdp-func 1.0 (November 2023). In addition, we would also need to build the corresponding conda-wheels after testing.

Long story short: as a hotfix, I'll constrain xarray <= 2023.2 (and accept the pandas downgrade for the moment). The best-case scenario would be a new RASCIL release soon, in which we could fix the corresponding ska-sdp deps. I don't think that building a dirty conda-wheel of RASCIL (10 months after its last release) is a good solution. If there's a better suggestion, I'd like to hear it.
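
Until the constraint is resolved upstream, the same upper bound could also be checked at runtime. The following is only a minimal sketch using the standard library: `MAX_SUPPORTED_XARRAY`, `parse_release` and `xarray_is_supported` are hypothetical names that are not part of Karabo, and the comparison only looks at the leading numeric components of the version string.

```python
from itertools import zip_longest

# Upper bound taken from the hotfix described above (xarray <= 2023.2).
MAX_SUPPORTED_XARRAY = "2023.2"


def parse_release(version: str) -> tuple:
    """Turn a version such as '2023.11.0' into (2023, 11, 0).

    Parsing stops at the first non-numeric component, so suffixes such as
    'dev0' or 'post1' are ignored (a simplification made for this sketch).
    """
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def xarray_is_supported(installed: str, maximum: str = MAX_SUPPORTED_XARRAY) -> bool:
    """Return True if `installed` is not newer than `maximum`."""
    # Missing components are padded with 0, so '2023.2' equals '2023.2.0'.
    pairs = zip_longest(parse_release(installed), parse_release(maximum), fillvalue=0)
    for have, limit in pairs:
        if have != limit:
            return have < limit
    return True


if __name__ == "__main__":
    from importlib.metadata import PackageNotFoundError, version

    try:
        installed = version("xarray")
    except PackageNotFoundError:
        print("xarray is not installed")
    else:
        print(installed, "supported:", xarray_is_supported(installed))
```

Such a check would only give an earlier and clearer failure than the ValueError in the traceback below; it is not a replacement for the constraint in the package metadata.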

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[14], line 6
      2 imager_askap.ingest_vis_nchan = 16
      4 # Try differnet algorithm
      5 # More sources
----> 6 deconvolved, restored, residual = imager_askap.imaging_rascil(
      7     clean_nmajor=1,
      8     clean_algorithm="mmclean",
      9     clean_scales=[10, 30, 60],
     10     clean_fractional_threshold=0.3,
     11     clean_threshold=0.12e-3,
     12     clean_nmoment=5,
     13     clean_psf_support=640,
     14     clean_restored_output="integrated",
     15     use_dask=True,
     16 )

File ~/i4ds/ska/Karabo-Pipeline/karabo/imaging/imager.py:432, in Imager.imaging_rascil(self, deconvolved_fits_path, restored_fits_path, residual_fits_path, client, use_dask, n_threads, use_cuda, img_context, clean_algorithm, clean_beam, clean_scales, clean_nmoment, clean_nmajor, clean_niter, clean_psf_support, clean_gain, clean_threshold, clean_component_threshold, clean_component_method, clean_fractional_threshold, clean_facets, clean_overlap, clean_taper, clean_restore_facets, clean_restore_overlap, clean_restore_taper, clean_restored_output)
    388 models = [
    389     rsexecute.execute(create_image_from_visibility)(
    390         bvis,
   (...)
    397     for bvis in blockviss
    398 ]
    400 result = continuum_imaging_skymodel_list_rsexecute_workflow(
    401     vis_list=blockviss,
    402     model_imagelist=models,
   (...)
    429     imaging_uvmin=self.imaging_uvmin,
    430 )
--> 432 result = rsexecute.compute(result, sync=True)
    434 residual, restored, skymodel = result
    436 deconvolved = [sm.image for sm in skymodel]

File ~/miniconda3/envs/karabo_dev_env/lib/python3.9/site-packages/rascil/workflows/rsexecute/execution_support/rsexecute.py:311, in _rsexecutebase.compute(self, value, sync)
    309 except:
    310     pass
--> 311 future = self.client.compute(value, sync=sync)
    312 wait(future)
    313 if self._verbose:

File ~/miniconda3/envs/karabo_dev_env/lib/python3.9/site-packages/distributed/client.py:3495, in Client.compute(self, collections, sync, optimize_graph, workers, allow_other_workers, resources, retries, priority, fifo_timeout, actors, traverse, **kwargs)
   3492         futures.append(arg)
   3494 if sync:
-> 3495     result = self.gather(futures)
   3496 else:
   3497     result = futures

File ~/miniconda3/envs/karabo_dev_env/lib/python3.9/site-packages/distributed/client.py:2383, in Client.gather(self, futures, errors, direct, asynchronous)
   2380     local_worker = None
   2382 with shorten_traceback():
-> 2383     return self.sync(
   2384         self._gather,
   2385         futures,
   2386         errors=errors,
   2387         direct=direct,
   2388         local_worker=local_worker,
   2389         asynchronous=asynchronous,
   2390     )

File ~/miniconda3/envs/karabo_dev_env/lib/python3.9/site-packages/rascil/workflows/rsexecute/skymodel/skymodel_rsexecute.py:113, in skymodel_restore()
    112 def skymodel_restore(s, res, cb):
--> 113     res_image = restore_cube(s.image, residual=res, clean_beam=cb)
    114     return restore_skycomponent(res_image, s.components, cb)

File ~/miniconda3/envs/karabo_dev_env/lib/python3.9/site-packages/ska_sdp_func_python/image/deconvolution.py:1179, in restore_cube()
   1163 def restore_cube(
   1164     model: Image, psf=None, residual=None, clean_beam=None
   1165 ) -> Image:
   1166     """Restore the model image to the residuals.
   1167 
   1168     The clean beam can be specified as a dictionary with
   (...)
   1177 
   1178     """
-> 1179     model_list = image_scatter_channels(model)
   1180     residual_list = image_scatter_channels(residual)
   1181     psf_list = image_scatter_channels(psf)

File ~/miniconda3/envs/karabo_dev_env/lib/python3.9/site-packages/ska_sdp_func_python/image/gather_scatter.py:184, in image_scatter_channels()
    179 if im is None:
    180     return None
    182 return [
    183     r[1]
--> 184     for r in im.groupby_bins("frequency", bins=subimages, squeeze=False)
    185 ]

File ~/miniconda3/envs/karabo_dev_env/lib/python3.9/site-packages/xarray/core/dataset.py:10279, in groupby_bins()
  10271 from xarray.core.groupby import (
  10272     BinGrouper,
  10273     DatasetGroupBy,
  10274     ResolvedBinGrouper,
  10275     _validate_groupby_squeeze,
  10276 )
  10278 _validate_groupby_squeeze(squeeze)
> 10279 grouper = BinGrouper(
  10280     bins=bins,
  10281     cut_kwargs={
  10282         "right": right,
  10283         "labels": labels,
  10284         "precision": precision,
  10285         "include_lowest": include_lowest,
  10286     },
  10287 )
  10288 rgrouper = ResolvedBinGrouper(grouper, group, self)
  10290 return DatasetGroupBy(
  10291     self,
  10292     (rgrouper,),
  10293     squeeze=squeeze,
  10294     restore_coord_dims=restore_coord_dims,
  10295 )

File <string>:5, in __init__()

File ~/miniconda3/envs/karabo_dev_env/lib/python3.9/site-packages/xarray/core/groupby.py:602, in __post_init__()
    600 def __post_init__(self) -> None:
    601     if duck_array_ops.isnull(self.bins).all():
--> 602         raise ValueError("All bin edges are NaN.")

ValueError: All bin edges are NaN.

Lukas113 commented 8 months ago

A minor update: I set the xarray constraint on the feedstock because this is an issue in ska-sdp-func-python, not in Karabo. In addition, I removed the build-number pins to enable constraint updates. Otherwise, every legacy Karabo installation would be corrupted whenever a feedstock-wheel needs an update.

The hotfix will be complete as soon as an updated build of ska-sdp-func-python with a higher build number is available. I'll still try to create a reproducible test case to catch this error.