How to implement better/more useful tables?

DenoBeno commented 5 years ago

I'm putting this in emikat because I presume that we'll need new emikat views for this, but it is also relevant for the table component.

As of today, it seems that we have resolved the worst EMIKAT-related issues. Calculations are triggered, results are coming (not sure if for all the areas but at least for some), visualization of data in tables and maps works as designed...

However, the tables are pretty much useless as they are now. I thought about things that we could do to improve them and this comes to mind:

Limit the data shown by default to what is pertinent to our presets (currently we allow only one, in the future several), with an option to "see all".

@therter : ideally, this would be also done for non-EMIKAT resources, e.g. in the hazards table.

Generate summary tables. Ideally, the user should still be able to choose "show data per cell", but at least the default should IMO be a sum over all cells.

For the exposure table, I would like to see summary information per element at risk, e.g. total, average, minimum and maximum population (density) per grid cell. This should cover all the data resources that the user chooses to include in the report and/or all the resources that correspond to the presets, in both cases with an option to show them all.

For risk and impact, I would like to see some of these implemented:

summary information for the whole area (total, average, min/max per cell)
show only the data corresponding to preset(s), with option to show all
show data for all impacts (currently we have only one) or for those impacts that user includes in the "data" step (see https://github.com/clarity-h2020/clarity-theme/issues/7#issuecomment-537516564), with possibility to filter only some impacts.
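
A minimal sketch of such a summary, assuming the per-cell data arrives as a list of rows with one value per cell and element at risk. The field names (`element_at_risk`, `value`) and the numbers are hypothetical and not taken from any EMIKAT view:

```python
from collections import defaultdict


def summarize_cells(rows, group_key="element_at_risk", value_key="value"):
    """Collapse per-cell rows into one summary row per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])

    summary = {}
    for name, values in groups.items():
        summary[name] = {
            "total": sum(values),
            "average": sum(values) / len(values),
            "min": min(values),
            "max": max(values),
            "cells": len(values),
        }
    return summary


# Hypothetical per-cell exposure rows (made-up numbers, for illustration only).
cells = [
    {"cell_id": 1, "element_at_risk": "population", "value": 120.0},
    {"cell_id": 2, "element_at_risk": "population", "value": 80.0},
    {"cell_id": 3, "element_at_risk": "population", "value": 100.0},
    {"cell_id": 1, "element_at_risk": "buildings", "value": 10.0},
    {"cell_id": 2, "element_at_risk": "buildings", "value": 30.0},
]

summary = summarize_cells(cells)
print(summary["population"])
```

The per-cell rows stay available for a "show data per cell" option; only the default view would use the summary.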

WDYT?

DenoBeno commented 5 years ago

By the way, how to interpret the Hazard tables?

what is the "event descriptor"?
what is the temp95perz?

@humerh, @therter : this table has many columns, some of them even empty. It is not very useful as it is now.

humerh commented 5 years ago

You see the whole table that we get from ZAMG's statistics. Maybe we can hide some columns:

You can also query the name and description of the columns: https://service.emikat.at/EmiKatTst/api/table/tab.CLY_HAZARD_EVENTS_STUDY.2036

From the EMIKAT side, all columns are meaningful and necessary. Operationally, the columns "urban_area", "country" and "Temp95Perz" are not used, but I would keep them for plausibility checks. If you think that you can omit something on the Drupal side, that is your responsibility.

DenoBeno commented 5 years ago

For the record, here is the table definition:

// 20191009105715
// https://service.emikat.at/EmiKatTst/api/table/tab.CLY_HAZARD_EVENTS_STUDY.2036

{
  "id": 8501,
  "name": "Hazard Events for current Study",
  "featurename": "tab.CLY_HAZARD_EVENTS_STUDY.2036",
  "viewName": "CLY_HAZARD_EVENTS_STUDY",
  "dbName": "CLY_HAZARD_EVENTS_STUDY",
  "description": "Lists all statistically relevant Hazard Events for current Study Area.\nFrom the raster of Europe (Hazard Events from Europe) these lines are filtered, where the center of the study area matches the center of the statistics cell closest.",
  "columns": [
    {
      "name": "Oid",
      "label": "Oid",
      "description": "Internal unique ID",
      "type": "integer",
      "semanticId": 0,
      "unit": "noUnitDefined",
      "valueList": [

      ],
      "colorlist": null
    },
    {
      "name": "Latitude",
      "label": "Latitude",
      "description": "Latitude and Longitude are the units that represent the coordinates at geographic coordinate system.",
      "type": "float",
      "semanticId": 0,
      "unit": "noUnitDefined",
      "valueList": [

      ],
      "colorlist": null
    },
    {
      "name": "Longitude",
      "label": "Longitude",
      "description": "Latitude and Longitude are the units that represent the coordinates at geographic coordinate system.",
      "type": "float",
      "semanticId": 0,
      "unit": "noUnitDefined",
      "valueList": [

      ],
      "colorlist": null
    },
    {
      "name": "UrbanArea",
      "label": "Urban Area",
      "description": "Name of Urban Area (if available)",
      "type": "string",
      "semanticId": 0,
      "unit": "noUnitDefined",
      "valueList": [

      ],
      "colorlist": null
    },
    {
      "name": "Country",
      "label": "Country",
      "description": "Country code of the center of cell",
      "type": "string",
      "semanticId": 0,
      "unit": "noUnitDefined",
      "valueList": [

      ],
      "colorlist": null
    },
    {
      "name": "HazardEventTypeId",
      "label": "Hazard Event Type ID",
      "description": "Defines, for which category of event this definition is done.",
      "type": "integer",
      "semanticId": 0,
      "unit": "noUnitDefined",
      "valueList": [

      ],
      "colorlist": null
    },
    {
      "name": "HazardEventType",
      "label": "Hazard Event Type",
      "description": "Defines, for which category of event this definition is done. (Shortname)",
      "type": "string",
      "semanticId": 0,
      "unit": "noUnitDefined",
      "valueList": [

      ],
      "colorlist": null
    },
    {
      "name": "EMISSIONS_SCENARIO",
      "label": "EMISSIONS_SCENARIO",
      "description": "Code of Emissions scenario (Baseline/rcp26/rcp45/rcp85)",
      "type": "string",
      "semanticId": 0,
      "unit": "noUnitDefined",
      "valueList": [
        {
          "key": "0",
          "meaning": "0 [Baseline]"
        },
        {
          "key": "1",
          "meaning": "1 [rcp26]"
        },
        {
          "key": "2",
          "meaning": "2 [rcp45]"
        },
        {
          "key": "3",
          "meaning": "3 [rcp85]"
        }
      ],
      "colorlist": null
    },
    {
      "name": "TIME_PERIOD",
      "label": "TIME_PERIOD",
      "description": "Time period for which this event is representative:\n 19710101-20001231 (Baseline)\n 20110101-20401231\n 20410101-20701231\n 20710101-21001231",
      "type": "string",
      "semanticId": 0,
      "unit": "noUnitDefined",
      "valueList": [
        {
          "key": "0",
          "meaning": "0 [19710101-20001231]"
        },
        {
          "key": "1",
          "meaning": "1 [20110101-20401231]"
        },
        {
          "key": "2",
          "meaning": "2 [20410101-20701231]"
        },
        {
          "key": "3",
          "meaning": "3 [20710101-21001231]"
        }
      ],
      "colorlist": null
    },
    {
      "name": "EVENT_FREQUENCY",
      "label": "EVENT_FREQUENCY",
      "description": "Frequency of event (Rare,Occasionally, Frequent)",
      "type": "string",
      "semanticId": 0,
      "unit": "noUnitDefined",
      "valueList": [

      ],
      "colorlist": null
    },
    {
      "name": "EventDescriptor",
      "label": "Event Descriptor",
      "description": "Event Descriptor - Encoded paramters of event\ne.g.HW.38.5_2.600d represents 38.5°C and duration 2.600 days.",
      "type": "string",
      "semanticId": 0,
      "unit": "noUnitDefined",
      "valueList": [

      ],
      "colorlist": null
    },
    {
      "name": "Temp95Perz",
      "label": "Temp95Perz",
      "description": "95% Percentile of temperature (95% of temperature values are avove this given value)",
      "type": "float",
      "semanticId": 0,
      "unit": "noUnitDefined",
      "valueList": [

      ],
      "colorlist": null
    }
  ]
}
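
As a rough illustration of how the table component could use this definition to pick labels and hide columns, here is a sketch that works on an abbreviated excerpt of the JSON above. The `HIDDEN_COLUMNS` set and the function name are assumptions for illustration; which columns to hide is still to be decided:

```python
import json

# Abbreviated excerpt of the table definition quoted above (three columns only,
# descriptions shortened).
TABLE_DEFINITION = json.loads("""
{
  "name": "Hazard Events for current Study",
  "columns": [
    {"name": "UrbanArea", "label": "Urban Area",
     "description": "Name of Urban Area (if available)"},
    {"name": "EventDescriptor", "label": "Event Descriptor",
     "description": "Event Descriptor"},
    {"name": "Temp95Perz", "label": "Temp95Perz",
     "description": "95% Percentile of temperature"}
  ]
}
""")

# Assumption: the front end hides the columns that are not used operationally.
HIDDEN_COLUMNS = {"UrbanArea", "Country", "Temp95Perz"}


def visible_columns(definition, hidden=HIDDEN_COLUMNS):
    """Return (name, label) pairs for every column that is not hidden."""
    return [
        (column["name"], column["label"])
        for column in definition["columns"]
        if column["name"] not in hidden
    ]


print(visible_columns(TABLE_DEFINITION))
```

Keeping the hidden set on the display side leaves the EMIKAT view untouched, so the full table remains available for plausibility checks.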

DenoBeno commented 5 years ago

I don't quite understand why this data is served by EMIKAT. And since it is, why doesn't EMIKAT also serve the data for other indices, like "tropical nights"? It should, IMO - so that we can serve a table looking more like what we planned to show - see the mockup:

"Event descriptor" has the data that we need to show to the user, just not in the friendliest way:

Event Descriptor - Encoded parameters of event\ne.g.HW.38.5_2.600d represents 38.5°C and duration 2.600 days.
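
A sketch of how the descriptor could be decoded for display. The pattern is an assumption generalised from the single documented example `HW.38.5_2.600d`; other event types may well be encoded differently, so unmatched strings are returned as `None` rather than guessed at:

```python
import re

# Assumed layout, inferred from the one documented example "HW.38.5_2.600d":
# <event type>.<temperature in °C>_<duration in days>d
DESCRIPTOR_PATTERN = re.compile(
    r"^(?P<event_type>[A-Za-z]+)"
    r"\.(?P<temperature>\d+(?:\.\d+)?)"
    r"_(?P<duration>\d+(?:\.\d+)?)d$"
)


def parse_event_descriptor(descriptor):
    """Split an event descriptor into its parts, or return None if it does not match."""
    match = DESCRIPTOR_PATTERN.match(descriptor.strip())
    if match is None:
        return None
    return {
        "event_type": match.group("event_type"),
        "temperature_c": float(match.group("temperature")),
        "duration_days": float(match.group("duration")),
    }


def format_event(parsed):
    """Render the parsed descriptor in a more readable form."""
    return f"{parsed['temperature_c']:.1f} °C for {parsed['duration_days']:.1f} days"


parsed = parse_event_descriptor("HW.38.5_2.600d")
print(format_event(parsed))
```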

Furthermore, something is wrong with the "95% Percentile of temperature" column: it always shows the same number, which is probably wrong. But even if the value were correct, we already have the 75% value encoded in the event descriptor.

In fact, something is also wrong with the event descriptor: all durations are way below one day.
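
A quick plausibility check for the "always the same number" symptom could look like the sketch below. The rows are made up for illustration and are not actual EMIKAT output:

```python
def constant_columns(rows):
    """Return the names of columns that hold one single value across all rows."""
    if not rows:
        return []
    return [
        name
        for name in rows[0]
        if len({row.get(name) for row in rows}) == 1
    ]


# Made-up rows that mimic the symptom described above.
sample_rows = [
    {"EVENT_FREQUENCY": "Rare", "Temp95Perz": 27.3},
    {"EVENT_FREQUENCY": "Occasionally", "Temp95Perz": 27.3},
    {"EVENT_FREQUENCY": "Frequent", "Temp95Perz": 27.3},
]

print(constant_columns(sample_rows))
```

Note that with a single row every column is trivially constant, so the check is only meaningful on tables with several rows.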

DenoBeno commented 5 years ago

Based on these considerations, this is what I would propose to do:

Allow users to define several presets and only show the data corresponding to those presets, and replace the scenario/frequency/time period with the preset label.

This is similar in spirit to what our mockup suggested btw, except that the "scenarios" would be replaced by presets. See https://github.com/clarity-h2020/emikat/issues/23#issuecomment-539920640

In this way the labels will make sense to the users (==preset names) and there will never be too many combinations that need to be shown.
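
To make the idea concrete, here is a sketch of the filtering step. It assumes that a preset is one combination of emissions scenario, time period and event frequency, and that rows carry these values in the `EMISSIONS_SCENARIO`, `TIME_PERIOD` and `EVENT_FREQUENCY` columns as readable strings. The `Preset` class, the preset labels and the sample rows are hypothetical:

```python
from dataclasses import dataclass

PRESET_COLUMNS = ("EMISSIONS_SCENARIO", "TIME_PERIOD", "EVENT_FREQUENCY")


@dataclass(frozen=True)
class Preset:
    label: str
    emissions_scenario: str
    time_period: str
    event_frequency: str


def rows_for_presets(rows, presets):
    """Keep only rows matching a preset and replace the three columns by its label."""
    labels = {
        (p.emissions_scenario, p.time_period, p.event_frequency): p.label
        for p in presets
    }
    selected = []
    for row in rows:
        label = labels.get(tuple(row[column] for column in PRESET_COLUMNS))
        if label is None:
            continue  # row does not belong to any preset
        rest = {k: v for k, v in row.items() if k not in PRESET_COLUMNS}
        selected.append({"preset": label, **rest})
    return selected


# Hypothetical presets and rows; the descriptor is the documented example, used as a placeholder.
presets = [
    Preset("Baseline", "Baseline", "19710101-20001231", "Frequent"),
    Preset("Mid-century, rare events", "rcp45", "20410101-20701231", "Rare"),
]
rows = [
    {"EMISSIONS_SCENARIO": "Baseline", "TIME_PERIOD": "19710101-20001231",
     "EVENT_FREQUENCY": "Frequent", "EventDescriptor": "HW.38.5_2.600d"},
    {"EMISSIONS_SCENARIO": "rcp45", "TIME_PERIOD": "20410101-20701231",
     "EVENT_FREQUENCY": "Rare", "EventDescriptor": "HW.38.5_2.600d"},
    {"EMISSIONS_SCENARIO": "rcp85", "TIME_PERIOD": "20710101-21001231",
     "EVENT_FREQUENCY": "Occasionally", "EventDescriptor": "HW.38.5_2.600d"},
]

selected = rows_for_presets(rows, presets)
print([row["preset"] for row in selected])
```

A "see all" option would simply skip this step and show the unfiltered rows.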

@humerh : can we have values for other hazard indices in the EMIKAT table too? (how much work)

Always show the data for the whole project area, not for individual cells. E.g. average, standard deviation, maybe also min/max values.

2.1. Since the geo-information is then determined by the project area, there is no need to show it for each cell. Show it as meta-information above the table or not at all.

This is in line with the spirit of our mockups. If we can't decide what is medium/low/high, we could at least show the data in these cells and let the users decide. For the "number of consecutive days", we would have to show two values for each cell, like it's already done in "event descriptor" - just formatted in a nicer way.

Optional(?): add a filter to decide which indices are shown (either just one at a time or as many as the user wants to see at a time - tbd.). Alternatively, the user could choose which hazard indices they are interested in on the "data" tab and we only show those in other steps. Or just show it all.

Note that for this to work nicely, it would be beneficial to implement https://github.com/clarity-h2020/data-package/issues/47 first. Once that's done, there will be just one resource per hazard index in the data package, not 20 like we have now.

So, the table could look e.g. a bit like this one:

Except that such compound values are terrible for visualisation in any other form/diagrams. Ah wait! We don't need to state the temperatures, because they must be the same for all presets - the 75th percentile temperature is just a function of the past climate in a specific area. => for a specific area we only need to compare the number of days.
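
A sketch of the resulting layout, with one row per hazard index and one column per preset, showing only the number of days. The index names, preset labels and values are made up for illustration:

```python
def pivot_days(records):
    """Arrange (index, preset, days) records as {index: {preset: days}}."""
    presets = []
    table = {}
    for record in records:
        table.setdefault(record["index"], {})[record["preset"]] = record["days"]
        if record["preset"] not in presets:
            presets.append(record["preset"])
    return presets, table


def render(presets, table):
    """Render the pivoted data as a simple text table."""
    lines = [" | ".join(["Hazard index"] + presets)]
    for index, values in table.items():
        cells = [index] + [str(values.get(preset, "n/a")) for preset in presets]
        lines.append(" | ".join(cells))
    return "\n".join(lines)


# Made-up values, for illustration only.
records = [
    {"index": "Consecutive hot days", "preset": "Baseline", "days": 2.6},
    {"index": "Consecutive hot days", "preset": "Preset A", "days": 4.1},
    {"index": "Tropical nights", "preset": "Baseline", "days": 12},
]

presets, table = pivot_days(records)
print(render(presets, table))
```

Because each cell holds a single number, the same structure can feed a bar chart or another diagram without further decoding.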

p-a-s-c-a-l commented 4 years ago

IMO our current tables are not good, but good enough. Looking at the list of still unresolved and far more urgent issues, I don't think that we'll come back to this. So it is better to close the issue and stay focused on our still realistic goal of going live with the system that we have now.

p-a-s-c-a-l commented 4 years ago

Short-term implementable improvements are considered here and here.

clarity-h2020 / emikat

How to implement better/more useful tables? #23