dtcenter / MET

Model Evaluation Tools
https://dtcenter.org/community-code/model-evaluation-tools-met
Apache License 2.0

Consider enhancing the masking logic used by the MET statistics tools by adding sid_exc option #401

Closed dwfncar closed 10 years ago

dwfncar commented 10 years ago

This idea originated from a MET-Help question sent by Travis Wilson from UCLA:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=67028


He was trying to compute statistics over an area while excluding a single station known to contain bad data. The MesoScale modelling group faces a similar challenge when verifying over CONUS: they would like to remove certain stations known to contain bad data.


The immediate goal is to provide a way of excluding certain stations from Point-Stat and Ensemble-Stat. This could be done in a couple of ways...


(1) A quick and easy solution would be to add a new configuration option with a list of stations known to be bad. These stations would be excluded from all verification tasks for that run. Just add a sid_exclude option to the "mask" section of the config to specify the stations to be skipped:
   mask = {
      grid = [ "FULL" ];
      poly = [];
      sid = [];
      sid_exclude = [ "list_of_sid_to_exclude.txt" ];
   };



(2) A more robust solution would require many more code changes and would break existing configuration files. We could overhaul how masks are defined by making the "mask" section of the config file an array. Each entry in the array could contain settings for name, grid, poly, sid_include, sid_exclude, and a join flag, where the join flag specifies whether the intersection or union of the masks should be used. That would provide much finer control over the masking logic. Here's an example:
   mask = [
      { name = "mask1"; grid = ["FULL"]; sid_exclude = ["KDEN", "KATL"]; join=UNION; },
      { name = "mask2"; sid_include = ["georgia_stations.txt", "oregon_stations.txt"]; join=INTERSECTION; }
   ];

[MET-401] created by johnhg

dwfncar commented 10 years ago

We discussed this at the MET Development meeting on 5/15/2014 and decided to go with option 1: add the "sid_exclude" option for point_stat and ensemble_stat. The station IDs listed there will be excluded from all masks.
by johnhg

dwfncar commented 10 years ago

Talked with Michelle about this to brainstorm ideas. Add a new configuration entry named "sid_exc" - the "sid" part is consistent with the "mask.sid" entry, while the "exc" part is consistent with the tc_stat config file entries for "init_exc" and "valid_exc". Set sid_exc as an array where each entry is either the name of a group of station IDs or a single station ID. That's consistent with how the "sid" setting works. However, all the entries in sid_exc are merged into one long list of station IDs to be excluded.
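
As a rough sketch of such an array, assuming a group is referenced by the name of a file listing its station IDs (as in the earlier sid_exclude example), and with "bad_wind_stations.txt" as a purely hypothetical file name, the setting might look like this:
   // Hypothetical entries: one station group (a file of station IDs) and one single station ID.
   sid_exc = [ "bad_wind_stations.txt", "KDEN" ];
Under that assumption, the stations listed in the file and KDEN would all end up in the same exclusion list.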


Rather than adding "sid_exc" to the "mask" section of the config file, add it to the "obs.field" section. That way, stations can be excluded on a task-by-task basis. For example, if the winds are bad at some station, you might exclude that station only when verifying winds.
by johnhg
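
For illustration, here is a sketch of a per-task setting. It assumes sid_exc sits directly inside an "obs.field" entry, and the field names and levels are made up for the example:
   obs = {
      field = [
         // Temperature: no stations excluded (hypothetical field entry).
         { name = "TMP";  level = "Z2"; },
         // Winds: exclude one station for this task only (hypothetical field entry).
         { name = "UGRD"; level = "Z10"; sid_exc = [ "KDEN" ]; }
      ];
   };
With a layout like this, KDEN would still contribute to the temperature statistics but would be skipped when verifying winds.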

dwfncar commented 10 years ago

Add a sid_exc option for Point-Stat and Ensemble-Stat to specify which station IDs should be excluded on a task-by-task basis.
by johnhg