openradar / xradar

A tool to work in weather radar data in xarray
https://docs.openradarscience.org/projects/xradar
MIT License
85 stars 17 forks source link

possible bug with angle reindexing (?) #112

Closed aladinor closed 1 year ago

aladinor commented 1 year ago

Hi everyone,

I am trying to apply the angle reindexing function to my dataset as is explained here. the radar data I am analyzing is 360 degrees in azimuth as follows

import fsspec
import xradar as xd
import xarray as xr
from pandas import to_datetime
from datetime import datetime

def create_query(date, radar_site):
    """
    Creates a string for quering the IDEAM radar files stored in AWS bucket
    :param date: date to be queried. e.g datetime(2021, 10, 3, 12). Datetime python object
    :param radar_site: radar site e.g. Guaviare
    :return: string with a IDEAM radar bucket format
    """
    if (date.hour != 0) and (date.hour != 0):
        return f'l2_data/{date:%Y}/{date:%m}/{date:%d}/{radar_site}/{radar_site[:3].upper()}{date:%y%m%d%H}'
    elif (date.hour != 0) and (date.hour == 0):
        return f'l2_data/{date:%Y}/{date:%m}/{date:%d}/{radar_site}/{radar_site[:3].upper()}{date:%y%m%d}'
    else:
        return f'l2_data/{date:%Y}/{date:%m}/{date:%d}/{radar_site}/{radar_site[:3].upper()}{date:%y%m%d}'

def data_accessor(file):
    """
    Open AWS S3 file(s), which can be resolved locally by file caching
    """
    return fsspec.open_local(f'simplecache::s3://{file}', s3={'anon': True}, filecache={'cache_storage': '/tmp/radar/'})

date_query = datetime(2023, 4, 7, 3)
radar_name = "Barrancabermeja" 
query = create_query(date=date_query, radar_site=radar_name)
str_bucket = 's3://s3-radaresideam/'
fs = fsspec.filesystem("s3", anon=True)
radar_files = sorted(fs.glob(f"{str_bucket}{query}*"))
radar_dt = xd.io.open_iris_datatree(data_accessor(radar_files[1])).xradar.georeference()
print(radar_dt.sweep_0)
DataTree('sweep_0', parent="root")
    Dimensions:            (azimuth: 360, range: 747)
    Coordinates:
      * azimuth            (azimuth) float64 0.05493 1.057 2.038 ... 358.1 359.0
        elevation          (azimuth) float64 1.472 1.472 1.472 ... 1.472 1.472 1.472
        time               (azimuth) datetime64

The datatree object is 360 angles in azimuth. Then, I try to reindex the azimuth angles using the fix_angle function

def fix_angle(ds):
    angle_dict = xd.util.extract_angle_parameters(ds)
    display(angle_dict)
    start_ang = angle_dict["start_angle"]
    stop_ang = angle_dict["stop_angle"]
    angle_res = angle_dict["angle_res"]
    direction = angle_dict["direction"]

    # first find exact duplicates and remove
    ds = xd.util.remove_duplicate_rays(ds)

    # second reindex according to retrieved parameters
    ds = xd.util.reindex_angle(
        ds, start_ang, stop_ang, angle_res, direction, method="nearest"
    )
    return ds

The result looks like this,


ds2 = fix_angle(radar_dt.sweep_0)

{'first_angle': 'azimuth',
 'second_angle': 'elevation',
 'min_angle': array(0.05493164),
 'max_angle': array(359.04968262),
 'min_time': numpy.datetime64('2023-04-07T03:01:08.569000000'),
 'max_time': numpy.datetime64('2023-04-07T03:01:28.569000000'),
 'angles_are_unique': True,
 'times_are_unique': False,
 'a1gate_idx': array(247),
 'a1gate_val': array(247.03308105),
 'uniform_angle_spacing': False,
 'ascending': array(True),
 'direction': 1,
 'angle_res': array(1.01),
 'start_angle': 0,
 'stop_angle': 360,
 'expected_angle_span': 360,
 'missing_rays': array(False),
 'excess_rays': array(True),
 'expected_number_rays': 356}

Now our new xradar object has 356 angles in the azimuth dimension

DataTree('sweep_0', parent=None)
    Dimensions:            (azimuth: 356, range: 747)
    Coordinates:
      * azimuth            (azimuth) float64 0.505 1.515 2.525 ... 357.0 358.0 359.1
      * range              (range) float32 1e+03 1.3e+03 ... 2.245e+05 2.248e+05
        elevation          (azimuth) float64 1.472 1.472 1.472 ... 1.472 1.472 1.472
        time               (azimuth) datetime64[ns] 2023-04-07T03:01:14.569000 .....
        longitude          float64 -73.76
        latitude           float64 6.933
        altitude           float64 105.0

Digging into util.py seems like in the line #L291 the angle resolution is affected by the number of decimals of the np.round method.

'angle_res': array(1.01)

If we change the number of decimals to 1 the output will look like this:

{'first_angle': 'azimuth',
 'second_angle': 'elevation',
 'min_angle': array(0.05493164),
 'max_angle': array(359.04968262),
 'min_time': numpy.datetime64('2023-04-07T03:01:08.569000000'),
 'max_time': numpy.datetime64('2023-04-07T03:01:28.569000000'),
 'angles_are_unique': True,
 'times_are_unique': False,
 'a1gate_idx': array(247),
 'a1gate_val': array(247.03308105),
 'uniform_angle_spacing': False,
 'ascending': array(True),
 'direction': 1,
 'angle_res': array(1.),
 'start_angle': 0,
 'stop_angle': 360,
 'expected_angle_span': 360,
 'missing_rays': array(False),
 'excess_rays': array(False),
 'expected_number_rays': 360}

DataTree('sweep_0', parent=None)
    Dimensions:            (azimuth: 360, range: 747)
    Coordinates:
      * azimuth            (azimuth) float64 0.5 1.5 2.5 3.5 ... 357.5 358.5 359.5
      * range              (range) float32 1e+03 1.3e+03 ... 2.245e+05 2.248e+05
        elevation          (azimuth) float64

Please let me know your thoughts.

kmuehlbauer commented 1 year ago

@aladinor Yes, that's a bug. @JulianGiles found this behaviour on iris data, too. We will add some context the next days.

But you're already quite close. It's the jittering angles which makes the current retrieval of angle spacing less robust.

kmuehlbauer commented 1 year ago

So, there is no good solution to this issue. The only thing we can do is to remove/adapt the one quantile thing here:

https://github.com/openradar/xradar/blob/e464ac7d8de09bd70c2c7b719175c4c93648feb7/xradar/util.py#L275-L277

The reasoning behind the current setup was to remove glitches (double rays) with very small angle differences. But this is worse for datasets without these small differences angles, since it shifts the median to higher differences. I'm thinking about having a quantile-kwarg to remove differences below and above a certain treshold before calculation of median.

Nevertheless, if the angle differences are jittering too much, we can't do anything about it. Here I'm thinking about checking std/var to get an impression if the provided angles are spread to wide and issue a warning to the user.

@aladinor @mgrover1 I'd like to get your thoughts on that, too.

kmuehlbauer commented 1 year ago

Here is how I would do it now:

import numpy as np
import xarray as xr
import xradar as xd
seed = 12345
rng = np.random.default_rng(seed)

#noise = rng.normal(scale=0.11, size=360)
noise = rng.uniform(low=-0.25, high=0.25, size=360)
print(f"Noise added, min: {min(noise)}, max: {max(noise)}")

azimuth = np.arange(0.5, 360., 1.0) + noise
ds = xd.model.create_sweep_dataset(
    a1gate=0, direction=1, azimuth=azimuth
)
ds = ds.swap_dims({"time": "azimuth"})
ds = ds.sortby("azimuth")
ds = ds.assign_coords(sweep_mode="azimuth_surveillance")

# remove block of rays
ds_in = xr.concat(
        [
            ds.isel(azimuth=slice(None, 100)),
            ds.isel(azimuth=slice(101, None)),
        ],
        "azimuth",
        data_vars="minimal",
    )
print(ds)

# extract angle resolution
first_angle = "azimuth"
fdim = ds_in[first_angle]
diff = fdim.diff(first_angle)
#print(diff.values)

# calculate std/median
std = diff.std(skipna=True)
median = diff.median(skipna=True)
print(f"STD: {std.values}, MEDIAN: {median.values}")
# remove values above and below std (centered around median), calculate median
diff_cutted = diff.where((diff >= (median - std)) & (diff <= (median + std)))
print(diff_cutted.values)
median_diff = diff_cutted.median(skipna=True)
# angle res rounding to two decimals (should work for almost all cases)
angle_res = np.round(median_diff, decimals=2)                   
print(f"MEDIAN: {median_diff.values}, ANGLE_RES: {angle_res.values}")
kmuehlbauer commented 1 year ago

@aladinor I've added a fix in #118 which should work for most of the use cases. Only for very random angle data this is starting to fail.

aladinor commented 1 year ago

Hi @kmuehlbauer. Thanks for looking at this issue. I will test it when it is merged.