pangeo-data / pangeo-astro-examples

Binder for astronomy stuff on pangeo

Update for Dask Gateway #4

Closed TomAugspurger closed 3 years ago

TomAugspurger commented 4 years ago

I've updated the notebooks / binder env to work with our new setup.

@guillaumeeb or @wtbarnes are you able to help with some usage things?

  1. The _repr_html_() of sunpy.map.sources.sdo.AIAMap invokes an expensive computation; a plain repr doesn't. My guess is that the handling of NaN / inf values is to blame, but I haven't checked:
        # Handle bad values (infinite and NaN) in the data array
        finite_data = self.data[np.isfinite(self.data)]
        count_nan = np.isnan(self.data).sum()
        count_inf = np.isinf(self.data).sum()

(By itself that doesn't trigger computation, but something like if count_nan > 0 will.)
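
To illustrate the point about laziness, here is a minimal, library-free sketch. LazyScalar is a hypothetical stand-in for a lazy dask-style scalar, not dask's or sunpy's actual API. It only shows that building a reduction or a comparison stays lazy, while using the result in an if statement forces the computation through __bool__.

```python
import math


class LazyScalar:
    """Toy stand-in for a lazy scalar result (hypothetical, not the dask API)."""

    def __init__(self, func):
        self._func = func

    def compute(self):
        return self._func()

    def __gt__(self, other):
        # Comparing builds another lazy object; nothing is evaluated yet.
        return LazyScalar(lambda: self.compute() > other)

    def __bool__(self):
        # `if lazy_value:` lands here, which forces the whole chain to run.
        return bool(self.compute())


calls = {"scans": 0}  # counts how often the data is actually scanned


def count_nan_eagerly(values):
    calls["scans"] += 1
    return sum(1 for v in values if math.isnan(v))


data = [1.0, float("nan"), 3.0]
count_nan = LazyScalar(lambda: count_nan_eagerly(data))  # lazy reduction
has_nan = count_nan > 0                                  # still lazy
scans_before_if = calls["scans"]

message = "no NaN found"
if has_nan:  # truth-testing triggers the scan
    message = "data contains NaN"
scans_after_if = calls["scans"]
```

In this toy model the data is only scanned once the if statement runs, which matches the behaviour described above for count_nan.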

  2. Exception in AIACube.__init__. There's a check that the shapes of all the submaps match, and after the submap call they differ. I haven't looked into why.
cubes = [AIACube([Map(m.data/m.meta['exptime'], m.meta) for m in c.maps[::10]]) for c in cubes]
cubes2 = [c.submap(blc, trc) for c in cubes]  # this raises
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-24-6344e4fc4886> in <module>
      1 cubes2 = []
      2 for c in cubes:
----> 3     cubes2.append(c.submap(blc, trc))

<ipython-input-2-01e4eec2dee6> in submap(self, *args, **kwargs)
    100 
    101     def submap(self, *args, **kwargs):
--> 102         return AIACube([m.submap(*args, **kwargs) for m in self.maps])
    103 
    104 

<ipython-input-2-01e4eec2dee6> in __init__(self, maps)
     34     def __init__(self, maps):
     35         if not all([m.data.shape == maps[0].data.shape for m in maps]):
---> 36             raise ValueError('All maps must have same dimensions')
     37         if not all([m.data.dtype == maps[0].data.dtype for m in maps]):
     38             raise ValueError('All maps must have same dtype')

ValueError: All maps must have same dimensions

Looking at cubes[0] the shapes after the submap are

>>> set([x.submap(blc, trc).data.shape for x in cubes[0].maps])
{(667, 667), (667, 668), (668, 666), (668, 667), (668, 668)}
  3. Warnings from SunPy about keyword-only arguments.
WARNING:sunpy:SunpyDeprecationWarning: Pass top_right=<SkyCoord (Helioprojective: obstime=2012-09-23T03:00:11.000, rsun=696000000.0 m, observer=<HeliographicStonyhurst Coordinate (obstime=2012-09-23T03:00:11.000): (lon, lat, radius) in (deg, deg, m)
    (-0.1327114, 7.03427967, 1.50102758e+11)>): (Tx, Ty) in arcsec
    (-200., 250.)> as keyword args. From version 2.1 passing these as positional arguments will result in an error.

As a side-note this submap operation seems to take a while for all the .maps. I wonder if it could be done in parallel on this cluster?
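
On the side note about parallelism: each map's submap is independent of the others, so the per-map calls could in principle be fanned out. Below is a minimal local sketch of that pattern, using the standard library's ThreadPoolExecutor as a stand-in for the cluster. crop_one and the nested-list "maps" are hypothetical placeholders, not sunpy objects. With a dask.distributed Client, client.map followed by client.gather has a similar shape, though that variant is untested here.

```python
from concurrent.futures import ThreadPoolExecutor


def crop_one(image, rows, cols):
    """Hypothetical stand-in for m.submap(...): crop a 2-D nested list."""
    return [row[cols[0]:cols[1]] for row in image[rows[0]:rows[1]]]


# Ten fake 8x8 "maps"; each is filled with its own index so results are easy to check.
maps = [[[i] * 8 for _ in range(8)] for i in range(10)]

with ThreadPoolExecutor(max_workers=4) as pool:
    # Executor.map preserves input order, so cropped[i] corresponds to maps[i].
    cropped = list(pool.map(lambda m: crop_one(m, (2, 6), (1, 5)), maps))

# Every crop used the same pixel ranges, so all shapes agree.
shapes = {(len(c), len(c[0])) for c in cropped}
```

Whether this helps in practice depends on how much of the submap cost is real computation versus loading data, which I haven't measured.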

TomAugspurger commented 4 years ago

Sorry, I forgot to include a Binder URL: https://binder.pangeo.io/v2/gh/TomAugspurger/pangeo-astro-examples/gateway

guillaumeeb commented 4 years ago

Hey, thanks @TomAugspurger! That's great!

Unfortunately I won't be able to help on the usage questions...

wtbarnes commented 4 years ago

Thanks very much for reviving this @TomAugspurger! I'm happy to help with this.

At the moment, I'm having issues getting the notebook to start. Binder seems to be able to create the image and pops me into a JupyterLab session, but there's something strange going on with the actual notebook kernel. Executing any cell just hangs indefinitely.

Either way, I'll try to respond to the above points now, and hopefully follow up with more concrete answers once I can successfully run the notebook. I think nearly all of these issues arise because this notebook was developed quite some time ago against a different major release of sunpy, so there are some rough edges now that we've released v2.0.1.

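  1. _repr_html_ (https://github.com/sunpy/sunpy/blob/b39723b60145f35ba57923301a7acc4bf0d7333c/sunpy/map/mapbase.py#L260) does some fancy stuff to give you a JS widget of the map in a couple of different views. That this is slow for out-of-core data is not too surprising. I'll investigate this more.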
  2. Since v2.0, we've slightly changed the logic of cropping with coordinates. These off-by-one differences are not too surprising. I think this can just be solved by cropping in pixel coordinates rather than world coordinates.
  3. Another post-2.0 fix. This just means the submap call should be m.submap(blc, top_right=trc). Doing the submap operations in parallel sounds great!

I will try to play around with this soon and offer some more concrete fixes. Thanks again!
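
As a rough illustration of point 2, here is a toy one-dimensional model. It is not sunpy's actual cropping logic. It assumes each map has a slightly different reference coordinate (ref_world, a hypothetical name) and that world-coordinate bounds are rounded to the nearest pixel. Under those assumptions the same world-coordinate box can cover a different number of pixels per map, while a fixed pixel range always yields the same size.

```python
def world_crop_width(lo, hi, ref_world, scale):
    """Pixels covered by [lo, hi] after rounding each edge (toy model)."""
    lo_pix = round((lo - ref_world) / scale)
    hi_pix = round((hi - ref_world) / scale)
    return hi_pix - lo_pix


def pixel_crop_width(lo_pix, hi_pix):
    """A fixed pixel range covers the same number of pixels for every map."""
    return hi_pix - lo_pix


scale = 0.6               # arcsec per pixel (assumed value)
ref_offsets = [0.0, 0.2]  # per-map pointing offsets in arcsec (assumed values)

# Same world-coordinate box applied to maps with different offsets.
world_widths = {world_crop_width(-600.0, -200.0, r, scale) for r in ref_offsets}

# Same pixel range applied to every map.
pixel_widths = {pixel_crop_width(100, 767) for _ in ref_offsets}
```

In this model world_widths ends up with more than one value while pixel_widths has exactly one, which is the kind of off-by-one spread seen in the shapes reported above.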

TomAugspurger commented 4 years ago

Thanks! It's strange that you're having issues executing cells. I just got a new session and things seem fine.

Sometimes when this happens, restarting the notebook kernel and then re-selecting the Python 3 kernel can fix it.


TomAugspurger commented 3 years ago

I updated this PR to just include the necessary updates for dask-gateway.

There are a few warnings and I had to pin astropy and sunpy to older versions. It'd be nice to get this into pangeo-gallery, but at least it's running again.

TomAugspurger commented 3 years ago

Ah, I apparently don't have write access here. @rabernat or @jhamman could you do the honors?

guillaumeeb commented 3 years ago

Thanks so much Tom!

wtbarnes commented 3 years ago

Thanks very much for updating this!