Add Italian GPM data + workflow

kmuehlbauer commented 4 days ago

We did not make it before ERAD2024, but I've received 3 volumes of Italian radar data (thanks to Gianfranco) to extend our GPM Matching/Calibration.

@mgrover1 I'll check the data the next days and will contact you to put it on the pythia bucket. @ghiggi It would be great if you could extend/add the processing for this data, once it's available.

ghiggi commented 4 days ago

Yes. I can do it as soon as the data are available. Which xradar reader should be used to read the Italian data? I can also provide a list of all GPM overpass over Italy for who want to extend the analysis. With @aladinor we might provide a tutorial for the Columbian radars too ;)

kmuehlbauer commented 4 days ago

Which xradar reader should be used to read the Italian data?

It's in the datamet-format which @wolfidan implemented a reader just before ERAD2024.

import xradar as xd
fname = "somename.tar.gz"
tree = xd.io.open_datamet_datatree(fname)

We'll have to wait until @mgrover1 is back in office. But you can already play with the data. Thanks @ghiggi.

ghiggi commented 3 days ago

@kmuehlbauer I had a quicklook this morning, but there is something strange in the GR data. This is the GPM DPR surface reflectivity measured at 06:30 UTC. The cross indicates the position of the GR

If I look at the GR data, I see something that correlates to the shape of the storm at sweep higher than 6, but with reflectivities below 10 dBZ.

There is something going wrong with the decoding/scaling of the GR data I think @wolfidan ! If the scale_factor is defined as 1/current_scale_factor in the reader I obtain this which looks more realistic to me ;)

ghiggi commented 3 days ago

@kmuehlbauer for the tutorial we will need to filter out the clutter and some additional preprocessing of GR data before doing the comparison. Do we use pyart or wradlib? Do you have already some default/template processing to use?

kmuehlbauer commented 3 days ago

@ghiggi The radar obviously operates in top-down mode. So higher elevations are scanned first.

You are referring to these lines of code?

I'm not sure what the correct setting of offset and slope is, but if I use a negated offset (and keep the scale_factor), the output looks even more realistic. Can you confirm that @ghiggi?

For preprocessing examples I usually use clutter detection from wradlib to remove clutter and make some interpolation afterwards to fill the holes. But we can also use pyart for preprocessing. I've no fixed preprocessor pipeline, always build it from scratch for new radars.

ghiggi commented 3 days ago

I assumed C band radar, empirical correction of Tan for Ku to C, beamwidth of 1° ... plot some stuffs and I don't think it's a good overpass for assessing GR bias. Most volumes are above 35 dBZ, ground clutter, beam blockage, NUBF, ... I played a bit with filtering criteria but ...

ghiggi commented 3 days ago

@kmuehlbauer I was looking more at these lines. I see scale_factor and add_offset added to the attrs, but I found these in the encoding once I open the datatree. This means xarray/cf decoding is doing something and moving from attrs to encoding. Are we applying scaling twice? Here and using CF decoding? I can't have a look at it for the rest of the day ... I might be able to give a look tomorrow ...

kmuehlbauer commented 3 days ago

@ghiggi

You can deal with this the following:

import xarray as xr
swp = xr.open_dataset(fname, engine="datamet", group="sweep_11", decode_cf=False)
swp.DBZH.attrs["add_offset"] = swp.DBZH.attrs["add_offset"] * -1
swp = xr.decode_cf(swp)

add_offset and scale_factor are moved to the attributes to make them available for automatic decoding in xarray. This is what happens, but it looks like the offset needs to be the other way round.

Take your time, no rush.

ghiggi commented 3 days ago

Quickly tried and this is not correct:

import xarray as xr
swp = xr.open_dataset(fname, engine="datamet", group="sweep_11", decode_cf=False)
swp.DBZH.attrs["add_offset"] = swp.DBZH.attrs["add_offset"] * -1
swp = xr.decode_cf(swp)

If we open with decode_cf=False we get directly: If we do

    ds_gr["DBZH"].attrs["scale_factor"] = 1/ds_gr["DBZH"].attrs["scale_factor"] 
    ds_gr = xr.decode_cf(ds_gr)

we get this:

I think we are applying decoding twice ! By just reading with decode_cf=False and doing anything else I get this compared to GPM DPR

kmuehlbauer commented 3 days ago

Thanks @ghiggi, I'll have a deeper look later today.

kmuehlbauer commented 3 days ago

True @ghiggi, your hunch is correct! :clap: scale offset is applied twice, once in the reader and second time in xarray decoding.

https://github.com/openradar/xradar/blob/34a9e8234189f921b1586001c84bdaf8f80f4c48/xradar/io/backends/datamet.py#L182-L183

kmuehlbauer commented 3 days ago

@ghiggi FYI: https://github.com/openradar/xradar/pull/209

mgrover1 commented 1 day ago

@kmuehlbauer - checking in here now that I am back in Chicago! Is there a link to the data to upload?

mgrover1 commented 21 hours ago

@ghiggi @kmuehlbauer - the data is now available in the pythia bucket under the /radar/erad2024/gpm_api directory:

H-000-VOL-SERANO-202405210625.tar.gz
H-000-VOL-SERANO-202405210630.tar.gz
H-000-VOL-SERANO-202405210635.tar.gz

kmuehlbauer commented 11 hours ago

@ghiggi I've released xradar v0.6.5 with the needed fix. When using the binder, we`d have to create an new PR to re-create the docker image with latest xradar.

openradar / erad2024

Add Italian GPM data + workflow #112