glue-viz / glue

Linked Data Visualizations Across Multiple Files
http://glueviz.org

In some images, full-size masked subset can cause IndexError #2474

Open pllim opened 6 months ago

pllim commented 6 months ago

This is a follow-up of:

The offending example was removed in this PR but you can still reproduce it by using the removed example:

Minimal reproducible example using Jdaviz in a notebook; split it into separate cells:

import tempfile

import numpy as np
from astropy import units as u
from astroquery.mast import Observations

from jdaviz import Imviz

imviz = Imviz()

data_dir = tempfile.gettempdir()

# JWST images with GWCS.
# They have vastly different pixel dimensions but point to roughly (but not exactly) the same region of the sky.
# The raw GWCS has a strict bounding_box that would return NaNs, but Imviz disables that.
files = ['jw02727-o002_t062_nircam_clear-f090w_i2d.fits',  # (4718, 4735)
         'jw02727-o002_t062_nircam_clear-f277w_i2d.fits',  # (2265, 2269)
]

for fn in files:
    uri = f"mast:JWST/product/{fn}"
    result = Observations.download_file(uri, local_path=f'{data_dir}/{fn}')
    imviz.load_data(f'{data_dir}/{fn}')

imviz.show()

# Only happens when linked by WCS.
# And this does NOT happen for our other example notebook with HST images using FITS WCS.
imviz.link_data(link_type='wcs')

data = imviz.get_data('jw02727-o002_t062_nircam_clear-f090w_i2d[DATA]')

# Numpy mask
idx = (np.array([350, 350, 350, 350, 350, 350, 351, 351, 351, 351, 352, 352, 352,
                 352, 352, 352, 352, 352, 352, 352, 353, 353, 353, 353, 353, 353,
                 353, 353, 353, 353, 353, 353, 354, 354, 354, 354, 354, 354, 354,
                 354, 355, 355, 355, 355, 355, 355, 355, 355, 356, 356, 356, 356,
                 356, 356, 356, 357, 357, 358, 358]),
       np.array([353, 354, 355, 356, 357, 358, 350, 352, 359, 361, 350, 352, 353,
                 354, 355, 356, 357, 358, 359, 361, 350, 351, 352, 353, 354, 355,
                 356, 357, 358, 359, 360, 361, 351, 352, 354, 355, 356, 357, 359,
                 360, 352, 353, 354, 355, 356, 357, 358, 359, 352, 353, 354, 355,
                 356, 357, 358, 353, 358, 352, 359]))
my_mask = np.zeros(data.data.shape, dtype=np.bool_)
my_mask[idx] = True

imviz.load_regions(my_mask)

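# Assumption: "viewer" is the default Imviz viewer; it was not defined in the snippet as posted.
viewer = imviz.default_viewer
viewer.offset_by(0.5 * u.arcsec, -1.5 * u.arcsec)  # IndexError as reported in spacetelescope/jdaviz#2638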

When I peek into the values at this line, using the debugging patch below:

https://github.com/glue-viz/glue/blob/5f38baf8d22aef572b3269149208089199c4512d/glue/core/subset.py#L1286

--- a/glue/core/subset.py
+++ b/glue/core/subset.py
@@ -1283,7 +1282,16 @@ class MaskSubsetState(SubsetState):

         # locate each element of data in the coordinate system of the mask
         vals = [data[c, view].astype(int) for c in self.cids]
-        result = self.mask[tuple(vals)]
+
+        try:
+            result = self.mask[tuple(vals)]
+        except Exception:
+            ooops = tuple(vals)
+            print(self.mask.shape)
+            print(data.label)
+            print(f"{ooops[0].max()}")
+            print(f"{ooops[1].max()}")
+            raise

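I see the following output (reference data shape is (4718, 4735)). The maximum index along axis 1 is 4735, which is one past the last valid index (4734):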

(4718, 4735)
jw02727-o002_t062_nircam_clear-f277w_i2d[DATA]
4329
4735
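
For reference, those printed values are enough to trigger the error on their own, independent of glue. Here is a minimal NumPy sketch; the index arrays are made up for illustration, and only their maxima (4329 and 4735) and the mask shape come from the output above:

```python
import numpy as np

# Mask with the same shape as the reference data.
mask = np.zeros((4718, 4735), dtype=bool)

# Illustrative index arrays (not the real ones from glue); only the
# maxima match the values printed by the debugging patch.
rows = np.array([0, 4329])
cols = np.array([0, 4735])  # 4735 is out of bounds: valid range is 0..4734

try:
    mask[rows, cols]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)
```

So a single pixel coordinate that lands on or beyond 4735 along axis 1 is sufficient to raise the IndexError.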

I tried to weed out the offending vals, but it is surprisingly hard to reshape things into a form that glue accepts: when I did, glue started to complain about a dimension mismatch somewhere else.
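
One way to drop the out-of-bounds entries without changing the output shape (which may be what triggers the dimension mismatch) is to start from an all-False result and fill in only the in-bounds positions. The following is a rough sketch in plain NumPy, not a tested glue patch; `safe_mask_lookup` is a hypothetical helper, and it assumes all arrays in `vals` share the same shape:

```python
import numpy as np


def safe_mask_lookup(mask, vals):
    """Hypothetical helper: look up mask[vals], treating any
    out-of-bounds position as False instead of raising IndexError.

    `vals` is a sequence of integer index arrays, one per mask axis,
    all assumed to have the same shape.
    """
    vals = [np.asarray(v) for v in vals]

    # Flag positions whose index is valid along every axis.
    in_bounds = np.ones(vals[0].shape, dtype=bool)
    for v, size in zip(vals, mask.shape):
        in_bounds &= (v >= 0) & (v < size)

    # The result keeps the shape of the index arrays.
    result = np.zeros(vals[0].shape, dtype=bool)
    result[in_bounds] = mask[tuple(v[in_bounds] for v in vals)]
    return result


# Small usage example with a 3x4 mask and some out-of-bounds indices.
mask = np.zeros((3, 4), dtype=bool)
mask[1, 2] = True
vals = [np.array([[1, 2], [1, 3]]),
        np.array([[2, 4], [-1, 0]])]
result = safe_mask_lookup(mask, vals)
```

Negative indices are treated as out of bounds here, since NumPy would otherwise silently wrap them around to the other side of the mask.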

I think the int casting of pixel values causes the coordinate transformation to drift and fail to round-trip, but I am unable to reproduce this exact error using simple code snippets. There are small rounding errors when round-tripping with GWCS alone, without glue, but they are negligible, so the significant drift must be happening somewhere in glue.
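
To illustrate what the cast itself does (this does not reproduce the drift in glue): `astype(int)` truncates toward zero rather than rounding to the nearest pixel. The pixel values below are invented for illustration only:

```python
import numpy as np

# Invented pixel coordinates near the edge of an axis of size 4735.
pix = np.array([4734.4, 4734.6, 4735.2, -0.4])

# astype(int) truncates toward zero.
truncated = pix.astype(int)

# Rounding to the nearest integer first gives different indices.
rounded = np.round(pix).astype(int)

print(truncated.tolist())  # [4734, 4734, 4735, 0]
print(rounded.tolist())    # [4734, 4735, 4735, 0]
```

In both cases a coordinate that has drifted to 4735.2 ends up as index 4735, which is out of bounds for an axis of size 4735, while a slightly negative coordinate such as -0.4 is silently mapped to index 0.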
