tobac-project / tobac

Tracking and object-based analysis of clouds
BSD 3-Clause "New" or "Revised" License
93 stars 51 forks source link

Floating point error in feature detection breaks interpn #433

Open travis-j-hahn opened 1 week ago

travis-j-hahn commented 1 week ago

It looks like there is a chance during the feature detection step for the identification of cells on boundary edges which result in hdim1 or hdim2 values slightly outside of the maximum boundary. I.e. the feature detection can create a cell with ix value of 218.00000003 when the maximum index is 218, and this throws an error in tobac/general.py:107 with coordinate_points = interpn(points, values, xi) raising ValueError: One of the requested xi is out of bounds in dimension 0

Full Error:

Traceback (most recent call last):

  Cell In[282], line 1
    wrf_features = wrf_tobac_feature_id(wrf_cube, 'IC', CONFIG)

  File ~/CoMET/CoMET/wrf_tobac.py:40 in wrf_tobac_feature_id
    wrf_radar_features = tobac.feature_detection.feature_detection_multithreshold(cube, dxy=dxy, **CONFIG['wrf']['tobac']['feature_id'])

  File ~/.conda/envs/CoMET/lib/python3.11/site-packages/tobac/utils/decorators.py:271 in wrapper
    output = func(*args, **kwargs)

  File ~/.conda/envs/CoMET/lib/python3.11/site-packages/tobac/feature_detection.py:1409 in feature_detection_multithreshold
    features = add_coordinates(features, field_in)

  File ~/.conda/envs/CoMET/lib/python3.11/site-packages/tobac/utils/general.py:109 in add_coordinates
    coordinate_points = interpn(points, values, xi)

  File ~/.conda/envs/CoMET/lib/python3.11/site-packages/scipy/interpolate/_rgi.py:740 in interpn
    raise ValueError("One of the requested xi is out of bounds "

ValueError: One of the requested xi is out of bounds in dimension 0

This can be fixed by using np.clip to ensure there is no overflow:


elif variable_cube.coord(coord).ndim == 2:
            if variable_cube.coord_dims(coord) == (hdim_1, hdim_2):
                points = (dimvec_1, dimvec_2)
                values = variable_cube.coord(coord).points
                temp_hdim1 = np.clip(t['hdim_1'].values, 0, points[0].max())
                temp_hdim2 = np.clip(t['hdim_2'].values, 0, points[1].max())
                xi = np.column_stack((temp_hdim1, temp_hdim2))
                coordinate_points = interpn(points, values, xi)
w-k-jones commented 1 week ago

Hi @travis-j-hahn , thanks for raising this issue. Do you have a minimal example that can recreate this error? (e.g. a subset of the data you were using). I'll have a look into this

travis-j-hahn commented 1 week ago

wrf_tb_tobac_test.zip Hi @w-k-jones, I've attached a zip with a small iris cube here containing the problematic values; it is only 2mb. Here are the feature detection parameters I used to see if you can recreate the issue:


threshold: [250,225,200,175,150]
target: 'minimum'
position_threshold: 'weighted_diff'
sigma_threshold: 0.5
n_min_threshold: 4

Then I used the snippet:

wrf_radar_features = tobac.feature_detection.feature_detection_multithreshold(test[0], dxy=1000, **params) where test[0] is the loaded iris cube. Hope this helps!

freemansw1 commented 1 week ago

Anecdotally, I have also seen this issue, and I agree with the proposed solution. Thanks much @travis-j-hahn

w-k-jones commented 1 week ago

Thanks for sending the test data, I've successfully recreated the error. One thing to watch out for with the fix is that with periodic boundary conditions there are expected situations where the feature point lies outside the range of the coordinate axis (e.g. a feature a 359.5 longitude when the coord ranges 0-359), which we should ensure is handled correctly

w-k-jones commented 1 week ago

@freemansw1 we will also have to update the new get_coordinates functions in #354

freemansw1 commented 1 week ago

@freemansw1 we will also have to update the new get_coordinates functions in #354

Yes those need fixing anyway. Good note.

freemansw1 commented 1 week ago

Thanks for sending the test data, I've successfully recreated the error. One thing to watch out for with the fix is that with periodic boundary conditions there are expected situations where the feature point lies outside the range of the coordinate axis (e.g. a feature a 359.5 longitude when the coord ranges 0-359), which we should ensure is handled correctly

Yes, I believe our expected out would be 359.5 there, but I'm not sure how add_coordinates handles this with PBCs, not the least of which including the changes in #354.

w-k-jones commented 1 week ago

Add coordinates doesn't currently handle PBCs at all 😬 I have no idea how we've avoided these errors before now...