Moving platforms average maps do not make sense

ewhelan commented 3 years ago

Average maps for moving platforms such as scatterometers do not make sense. I suggest either removing the option or carrying out grid-averaging for moving platforms.

From https://hirlam.org/trac/wiki/Meetings/HarmonieWorkingWeek/UseofObsandALGO202010/Day2Diagnostics

isabelmonteiro commented 3 years ago

I think it makes sense, provided you use an appropriate grid for the averaging and a time series long enough to have enough match-ups in each grid point. We did this for o-f in HARMONIE (against o=ScatSat), see the attached figure, and the NWPSAF does this routinely for o-b for ECMWF and UKV, see this link: https://nwp-saf.eumetsat.int/site/monitoring/winds-quality-evaluation/scatterometer-mon/scatt-monthly-monitoring/

It would be very nice if we could have this in obsmon; the hard part is already done.

Attachment: WSMean_HA_OSCAT_U10m_F009_IBERIA_cy43_4DVAR7-22 txt

paulovcmedeiros commented 3 years ago

The averages are computed for each (lon, lat) coordinate separately. If a platform has position A at DTG X but moves to position B at DTG Y, then the average plot over DTGs X and Y will show observations for both positions A and B. Data for different positions are never mixed together.
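As an illustration of the behaviour described above, here is a minimal Python sketch (obsmon itself is written in R, so this is not its actual code). The function name, the record layout and the sample values are all hypothetical.

```python
from collections import defaultdict


def average_by_position(records):
    """Average values separately for each exact (lon, lat) pair.

    records: iterable of (lon, lat, value) tuples (hypothetical layout).
    """
    sums = defaultdict(lambda: [0.0, 0])  # (lon, lat) -> [running sum, count]
    for lon, lat, value in records:
        acc = sums[(lon, lat)]
        acc[0] += value
        acc[1] += 1
    return {pos: total / n for pos, (total, n) in sums.items()}


# Made-up data: a fixed station reports at DTGs X and Y, while a moving
# platform reports from position A at DTG X and from position B at DTG Y.
records = [
    (10.0, 60.0, 1.0),   # fixed station, DTG X
    (10.0, 60.0, 3.0),   # fixed station, DTG Y
    (12.0, 61.0, 0.5),   # moving platform, position A, DTG X
    (12.5, 61.2, -0.5),  # moving platform, position B, DTG Y
]
position_means = average_by_position(records)
```

The fixed station ends up with a genuine two-report average, while positions A and B each keep a single-report "average", which is why such maps say little for moving platforms.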

I don't think that removing these plots for moving platforms is a good way to go. Once it is clear what the plots do, my opinion is that it is up to the user to decide whether or not to request these for any given platform. It is much simpler for the user not to select the plot than for us to keep a list of which platforms may or may not move their positions. If we did this, we'd need, for instance, to remember to update such a list every time a new observation type is added, which is very easy to overlook and thus quite error-prone.

A grid solution, as both of you pointed out, is indeed a much better option. However, obsmon is currently not aware of any domain & grid information at all, and making it grid-aware will certainly require more than just a couple of extra lines of code. Do you know anyone who'd be willing to help with this? Of course I'll do this myself eventually if no one steps in, but I can't prioritise this right now, as I'm involved in other projects.

paulovcmedeiros commented 3 years ago

By the way, I'm quite happy we've finally managed to make the code open-source and moved it here. I ~~believe~~ hope it will make it much easier to get people to collaborate. Besides, the issue tracking system here, integrated with the repo itself, makes everything more transparent for collaborators and users alike. So, thanks for using the issue tracking system and discussing things here.

isabelmonteiro commented 3 years ago

> The averages are computed for each (lon, lat) coordinate separately. If a platform has position A at DTG X but moves to position B at DTG Y, then the average plot over DTGs X and Y will show observations for both positions A and B. Data for different positions are never mixed together.
>
> I don't think that removing these plots for moving platforms is a good way to go. Once it is clear what the plots do, my opinion is that it is up to the user to decide whether or not to request these for any given platform. It is much simpler for the user not to select the plot than for us to keep a list of which platforms may or may not move their positions. If we did this, we'd need, for instance, to remember to update such a list every time a new observation type is added, which is very easy to overlook and thus quite error-prone.
>
> A grid solution, as both of you pointed out, is indeed a much better option. However, obsmon is currently not aware of any domain & grid information at all, and making it grid-aware will certainly require more than just a couple of extra lines of code. Do you know anyone who'd be willing to help with this? Of course I'll do this myself eventually if no one steps in, but I can't prioritise this right now, as I'm involved in other projects.

We (=IPMA) can contribute, Paulo, if you think it would be useful. We already have some Python code to do this. The hard part in our coding was indeed going from scattered lat/lon files to gridded data. We are very new to obsmon, but willing to help.
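For readers unfamiliar with the step from scattered lat/lon reports to gridded data, the following is a rough Python sketch. It assumes a regular lon/lat grid with a cell size in degrees, which is a simplification made for illustration; it is neither the IPMA code nor what obsmon does, and all names and values are hypothetical.

```python
import math
from collections import defaultdict


def grid_average(observations, lon_min, lat_min, cell_deg):
    """Bin scattered (lon, lat, value) reports into regular lon/lat cells.

    Returns {(i, j): (mean, n_obs)}, where (i, j) are the cell indices
    counted from the south-west corner (lon_min, lat_min).
    """
    cells = defaultdict(list)
    for lon, lat, value in observations:
        i = math.floor((lon - lon_min) / cell_deg)
        j = math.floor((lat - lat_min) / cell_deg)
        cells[(i, j)].append(value)
    return {cell: (sum(vals) / len(vals), len(vals)) for cell, vals in cells.items()}


# Made-up departures near the Iberian coast, on a 0.5-degree grid.
obs = [(-9.9, 38.1, 1.0), (-9.7, 38.4, 2.0), (-9.2, 38.3, -1.0)]
gridded = grid_average(obs, lon_min=-10.0, lat_min=38.0, cell_deg=0.5)
```

Keeping the number of reports per cell next to the mean makes it easy to judge how representative each cell average is.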

paulovcmedeiros commented 3 years ago

Hello @isabelmonteiro,

I'm afraid the code needs to be in R. I actually also have code for domains & grids in Python (for another project), but porting it involves more than just translating. It would be great if you could fork the repo, work in your local fork, incorporate the new code there, and then submit a merge request.

paulovcmedeiros commented 3 years ago

Hi @ewhelan and @isabelmonteiro ,

I've started to look into this now, and I have two questions regarding the computation of the averages:

1. About grid averages: You may have multiple different platforms giving reports within the same grid cell. Should these reported values from different stations belonging to the same grid cell be mixed in the calculation of the averages?
2. Moving platforms may comprise stations located at a fixed (lon, lat) but at different altitude/pressure levels. At the moment, I preserve level information when calculating the averages (i.e., observations at different levels are not mixed up when calculating averages, even if they belong to the same (lon, lat) in 2D). In the light of this request, I suppose that, instead, observations at equal (lon, lat) should all enter the same average regardless of level. Is that correct?

ewhelan commented 3 years ago

> About grid averages: You may have multiple different platforms giving reports within the same grid cell. Should these reported values from different stations belonging to the same grid cell be mixed in the calculation of the averages?

Yes, I think so, for the same level. For BUOY/SHIP/SCATT this is straightforward: surface. For aircraft/satellite see answer 2.

> Moving platforms may comprise stations located at a fixed (lon, lat) but at different altitude/pressure levels. At the moment, I preserve level information when calculating the averages (i.e., observations at different levels are not mixed up when calculating averages, even if they belong to the same (lon, lat) in 2D). In the light of this request, I suppose that, instead, observations at equal (lon, lat) should all enter the same average regardless of level. Is that correct?

I would suggest initially following what ECMWF do:

Aircraft: the layers 0-400 hPa/400-700 hPa/700-1000 hPa could be used. See https://www.ecmwf.int/en/forecasts/charts/obstat/temp__geo_0001_plot_o_geo_temp?facets=undefined&time=2021082800&Datatype=AMDAR&Layer=0-400hPa&Data=FG%20departure

Satellite: split by satellite/channel https://www.ecmwf.int/en/forecasts/charts/obstat/atovs_amsua__geo_0001_plot_o_geo_atovs_amsua?facets=Instrument,AMSUA&time=2021082800&Satellite=METOP-C&Channel=9&Data=FG%20departure&Flag=Used

I know - easier said than done!
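To make the aircraft layering concrete, here is a small Python sketch that assigns a pressure to one of the three layers mentioned above. Which layer a report exactly on a boundary (e.g. 400 hPa) belongs to is not stated here, so the half-open intervals used below are an assumption, as are the function and constant names.

```python
# Layer bounds in hPa, as suggested above for aircraft observations.
AIRCRAFT_LAYERS_HPA = [(0, 400), (400, 700), (700, 1000)]


def layer_label(pressure_hpa):
    """Return the label of the layer containing pressure_hpa, or None.

    Assumption: each layer is [top, bottom), except the last one,
    which also includes its lower boundary (1000 hPa).
    """
    last = len(AIRCRAFT_LAYERS_HPA) - 1
    for idx, (top, bottom) in enumerate(AIRCRAFT_LAYERS_HPA):
        in_layer = top <= pressure_hpa < bottom
        on_last_boundary = idx == last and pressure_hpa == bottom
        if in_layer or on_last_boundary:
            return f"{top}-{bottom} hPa"
    return None  # outside all layers, e.g. above 1000 hPa


labels = [layer_label(p) for p in (250, 400, 850, 1000, 1013)]
```

Reports would then be averaged per grid cell and per layer label, rather than per individual level.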

Any opinions, @isabelmonteiro?

paulovcmedeiros commented 3 years ago

Thanks @ewhelan !

I think I'll code it the way I initially intended then, with all queried levels & stations included in the average at each (lon, lat). This is because:

- For aircraft (and other non-satellite obs), the user can already select which levels to include in the queried data. This means that, if a 0-400 hPa window (or any other selection) is required, then the corresponding levels can already be selected in the GUI prior to requesting the plot.
- For satellite, a similar situation applies: the user just needs to select the required satellite/channel combination.

I have already coded most of the backend stuff related to domain geometry and grid, and have a good idea of how to handle the data in the way I mentioned. I think the most challenging part at the moment will be to implement the visualisation of the gridded data itself, as it seems that the engine I currently use in obsmon for interactive plots doesn't natively support plotting a heatmap on top of a map. But this is just a technicality; there's always a way.

isabelmonteiro commented 3 years ago

Hi @paulovcmedeiros, @ewhelan. I was going to answer 1. the same way as Eoin :). I would just like to highlight that, for LEO satellites (Metops, NOAAs, etc), it is important that the grid averaging is done over areas larger than the observation footprint. They don't scan exactly the same area every day, so a large enough grid should be used. See this example for SCATT from NWPSAF, where you can find typical averaging grid sizes: https://nwp-saf.eumetsat.int/site/monitoring/winds-quality-evaluation/scatterometer-mon/scatt-monthly-monitoring/

paulovcmedeiros commented 3 years ago

Hi @isabelmonteiro ,

It will be up to the user to choose the grid. This is what the domain config will look like, for instance:

[domain]
    name = "My Domain"
    nlon = 900
    nlat = 960
    lonc = 16.763011639
    latc = 63.489212956
    lon0 = 15.0
    lat0 = 63.0
    gsize = 2500
    ezone = 11
    lmrt = false

I think that addresses your point.

isabelmonteiro commented 3 years ago

I would suggest using 4 times the satellite footprint size as the averaging grid, if this cannot be left as an option for the user. It is also important to inform the user of how many observations were used to compute each average. No, it is the grid you use in the averaging.

isabelmonteiro commented 3 years ago

ASCAT-coastal has a 12.5 km grid; your o-b statistics only make sense if you use at least a 0.5 grid.

isabelmonteiro commented 3 years ago

You cannot use the Harmonie grid resolution as guidance for the departure statistics; it is too fine.

paulovcmedeiros commented 3 years ago

I thought it would suffice for the user to define a grid in the config file and then use it for everything. I really don't want to hard-code all these parameters for all instruments. This would be a nightmare for maintenance in the long run, as there are many different types of observations and only one person maintaining the code...

If it is important to have different grids for different observation types, then I will try to find a way to make it possible to set grid parameters in the GUI instead, so that the user can tune these without the need to restart the code.

paulovcmedeiros commented 3 years ago

> You cannot use Harmonie grid resolution for guidance regarding the departures statistics it is too fine

Mind that this grid config is just an example. It is the user who defines whatever grid they want. What I've shown here is just an example of the format the user would employ to specify the desired grid. These are not hard-coded.

isabelmonteiro commented 3 years ago

You could do this just for scatt, such a nice observation ;), and you could set the averaging grid size to 50 km.

paulovcmedeiros commented 3 years ago

> You could do this just for scatt such a nice observation ;) and you could set the averaging grid size to 50 km.

I would really rather not hard-code these grid parameters. Leaving it up to the user means more freedom for them and less maintenance for me... If I set it to 50 km for scatt, then someone may, for whatever reason, ask me to set it to something else for some special case. And it would also leave the possibility open for everything else to be set by me as well...

isabelmonteiro commented 3 years ago

I think it is very useful information, Paulo. Having only global averaging can be misleading. For example, for the u10 component, o-b has a zonally dependent bias that is not detectable when you do global averaging: the average becomes close to 0 and people can live happily ever after with a nasty bias.
But of course I see your point and don't want to push you. This can always be done outside obsmon.
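The cancellation described above can be illustrated with a few lines of Python. The departures below are synthetic numbers chosen only to make the effect obvious, not real o-b statistics, and "zonal" is taken here to mean latitude bands, which is an assumption.

```python
from statistics import mean

# Synthetic (latitude, u10 o-b departure in m/s) pairs; NOT real data.
departures = [
    (-25.0, 0.5), (-15.0, 1.5), (-5.0, 1.0),   # southern band: bias of +1
    (5.0, -1.0), (15.0, -0.5), (25.0, -1.5),   # northern band: bias of -1
]


def band_means(data, band_deg):
    """Mean departure per latitude band, keyed by the band's southern edge."""
    bands = {}
    for lat, dep in data:
        start = (lat // band_deg) * band_deg
        bands.setdefault(start, []).append(dep)
    return {start: mean(vals) for start, vals in sorted(bands.items())}


global_mean = mean([dep for _, dep in departures])  # biases cancel out
zonal_means = band_means(departures, 30.0)          # biases remain visible
```

With these numbers the global mean is exactly zero, while the two 30-degree bands show mean departures of +1 and -1 m/s.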

paulovcmedeiros commented 2 years ago

This is now available in the devel branch for you to test. I'm attaching some screen prints so you know roughly what to expect.

The new "Domain Geometry & Grid" tab where the domain specs can be entered and/or modified by the user.

[Image: obsmon_domain_config_tab_example]

The (optional) defaults loaded into the inputs under this tab were specified by adding the following lines to the config file:

[domain]
    nlon = 90
    nlat = 96
    lonc = 16.763011639
    latc = 63.489212956
    lon0 = 15.0
    lat0 = 63.0
    gsize = 25000
    lmrt = false
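A quick arithmetic check relates these defaults to the example config posted earlier in this thread (nlon = 900, nlat = 960, gsize = 2500). The Python sketch below assumes that gsize is the grid spacing in metres and that nlon and nlat count grid points along each axis. Neither is stated in this thread, so treat both as assumptions; the helper name is hypothetical too.

```python
# Values copied from the two [domain] examples in this thread.
earlier_example = {"nlon": 900, "nlat": 960, "gsize": 2500}
devel_defaults = {"nlon": 90, "nlat": 96, "gsize": 25000}


def extent_km(domain):
    """Approximate domain extent (east-west, north-south) in km.

    Assumes gsize is in metres and ignores any extension zone.
    """
    return (
        domain["nlon"] * domain["gsize"] / 1000.0,
        domain["nlat"] * domain["gsize"] / 1000.0,
    )


earlier_extent = extent_km(earlier_example)
devel_extent = extent_km(devel_defaults)
```

Under those assumptions both configs span the same 2250 km by 2400 km area; the defaults above simply use cells ten times coarser (25 km instead of 2.5 km).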

An example of a plot of the "Average First Guess Departure Map" type. If domain & grid are specified, then it performs grid-averages, as requested.

[Image: obsmon_grid_average_example]

I think this covers what was requested here (and possibly even more, such as an old request by @ewhelan to be able to tweak min/max lon/lat in maps -- #6). If you think this looks OK (or if I don't hear anything by tomorrow night), then I'll go ahead and merge these changes into the master branch.

isabelmonteiro commented 2 years ago

Thank you very much @paulovcmedeiros! I think it looks very good. I assume that it is possible to perform these diagnostics for other observation types. Very useful!

paulovcmedeiros commented 2 years ago

Hi @isabelmonteiro !

Yes, you can select any observation type, variable, level/channel etc in the usual way.

It would be nice if you could test it a bit before I merge the changes into master, just to see if there's any bug I didn't catch. You and @ewhelan are certainly better suited than me to test real-life cases :)

isabelmonteiro commented 2 years ago

Great! Looking forward to it! Do you need feedback by tomorrow?

paulovcmedeiros commented 2 years ago

It would be good to have it by the end of the day today, actually. The thing is that I've got a lot of work in other projects right now, and no time allocated at all for obsmon in my budget anymore. So I need to wrap obsmon up for the year ASAP.

isabelmonteiro commented 2 years ago

@fabiolasouza and I quickly tested the Domain&Geometry functionality and it is working very well. It will be quite useful for the observation diagnostics in our experiments. We just have a few suggestions for when you find time:

1. The colour scale could be configurable by the user (perhaps a menu on the left-hand side?). The figures below are not really comparable, simply because the user cannot set the colour scale.
2. For scatterometers, average first guess departures near the coast are apparently being computed using 10-m wind over land, which is not ideal (see the figure below, where large departures are found apparently because of this). A solution, for scatterometers, might be to mask values within a given distance of the coastline (if possible).
3. We suggest that averages should only be computed if the number of matches is above a given threshold (e.g. N>30). This is clear in the figure below where, for aircraft observations, average FG departures over the ocean are very large only because the number of observations used to compute the statistics was very low (e.g. N=2).

Again, Thank you!!

paulovcmedeiros commented 2 years ago

Hi,

Thanks for this! I'll look into your suggestions. Just about the colour scale: it is already possible to change it. You need to click on the "cog" icon at the right-hand side to open a menu that allows you to change both colormap and limits. The icon doesn't show in your figure but it is present in the GUI (see the screen print I attached earlier).

I'll have a look at the other suggestions tomorrow.

isabelmonteiro commented 2 years ago

Thanks Paulo, we were not aware of this. GREAT!

isabelmonteiro commented 2 years ago

We can even select the colour scale! Genial ;)

paulovcmedeiros commented 2 years ago

Hi again @isabelmonteiro and @fabiolasouza ,

I have now addressed your third suggestion (that averages should only be computed if the number of matches is above a given threshold). Please run

git pull origin devel --rebase && ./install

to fetch and apply the changes. When you select a plot of the "Average Maps" type, and there's a valid domain specified, then a Gridded Average: Min № Obs per Grid Element numeric input should appear in the GUI, just under Type of Plot. You can then define the threshold you wish to apply.
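Conceptually, the threshold works along the lines of the Python sketch below. This is not the obsmon implementation (which is in R). The data layout and names are hypothetical, and treating the threshold as inclusive (n_obs >= min_obs) is an assumption based on the "Min" in the input label.

```python
def apply_min_obs(cell_stats, min_obs):
    """Keep only grid elements with at least min_obs observations.

    cell_stats: {cell_id: (mean_value, n_obs)} (hypothetical layout).
    """
    return {
        cell: (avg, n_obs)
        for cell, (avg, n_obs) in cell_stats.items()
        if n_obs >= min_obs
    }


# Made-up per-cell statistics: one well-sampled cell and one with N=2.
cell_stats = {(0, 0): (0.3, 45), (0, 1): (7.9, 2)}
filtered = apply_min_obs(cell_stats, min_obs=30)
```

Here the poorly sampled cell, with its large but unrepresentative average, is dropped from the map.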

For the second point (use of a land-sea mask), I would say I will probably not implement this, unfortunately. One reason for this is that the observations included in the averages really do belong to those grid elements. My suggestion would be that you choose a grid with a slightly finer resolution. This would reduce the number of grid elements located partially over land and partially over the sea.

Another reason why I'm not too keen on implementing such filters is that they can be a bit inaccurate depending on the specific region, and they also may increase the computation times considerably depending on the size of the datasets used.

Let me know if you have any other suggestions and/or observations.

paulovcmedeiros commented 2 years ago

This has now been merged into master. I'm closing the issue as I think there are no more actions to result from it.

Feel free to open other issue(s) should you have other suggestions.

Many thanks for these suggestions, feedback and for testing :)

Hirlam / obsmon

Moving platforms average maps do not make sense #1