Include "value-per-country" variables

esa-esdl / esdl-core

ESDL Cube Generation and Access API

GNU General Public License v3.0


Include "value-per-country" variables #13

Open forman opened 8 years ago

forman commented 8 years ago

Some source variables, especially socio-economic data, are given as (Excel/CSV) tables and provide a statistical value per country.

We could support such data by providing an auxiliary raster world map from which the country code can be looked up for a given lat/lon. With this country code we can then look up the associated value in the given source table.
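A minimal sketch of this lookup, assuming a regular global lat/lon grid with rows running north to south. The tiny `country_map`, the integer codes, the `values_by_code` table and both function names are made up for illustration and are not part of esdl-core:

```python
import numpy as np

# Hypothetical 4x8 country-code raster on a regular global grid
# (rows: north to south, columns: west to east); 0 means "no country".
country_map = np.array([
    [0, 1, 1, 1, 2, 2, 0, 0],
    [0, 1, 1, 2, 2, 2, 0, 0],
    [0, 0, 3, 3, 2, 0, 0, 0],
    [0, 0, 3, 3, 0, 0, 0, 0],
], dtype=np.int32)

# Hypothetical per-country values as they might be read from a source table
values_by_code = {1: 10.5, 2: 20.0, 3: 7.25}


def country_code_at(lat, lon, code_map):
    """Return the country code of the grid cell containing (lat, lon)."""
    n_rows, n_cols = code_map.shape
    row = int((90.0 - lat) / 180.0 * n_rows)
    col = int((lon + 180.0) / 360.0 * n_cols)
    # Clamp so that lat = -90 and lon = 180 fall into the last cell
    row = min(max(row, 0), n_rows - 1)
    col = min(max(col, 0), n_cols - 1)
    return int(code_map[row, col])


def grid_values(code_map, table, fill_value=np.nan):
    """Paint each country's table value onto every cell carrying its code."""
    out = np.full(code_map.shape, fill_value, dtype=np.float64)
    for code, value in table.items():
        out[code_map == code] = value
    return out


code = country_code_at(80.0, -100.0, country_map)
value = values_by_code.get(code)  # None if the cell has no country
gridded = grid_values(country_map, values_by_code)
```

Cells without a country (code 0) stay at the fill value, so missing data remains distinguishable from a real zero.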

Since this data is also usually sparse in time (i.e. resulting from annual reports), issue #12 also applies here.

gdkrmr commented 8 years ago

I see the following issues here:

Countries are not static in time (e.g. Palestine, South Sudan)
What to do with pixels that are only partially covered by a country, and with pixels that are covered by two countries?

forman commented 8 years ago

"Countries are not static in time": We may have annual country code maps. However, the maps we produce that way are associated with extraordinary uncertainties anyway, especially for those countries with changing border lines.
"Partially covered by countries": the same problem applies to all values that represent a classification. We may create a new separate issue for this. In CCI Landcover, we create primary and secondary classifications, so we actually obtain two variables from one.
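As an illustration of the primary/secondary idea, here is a rough sketch that assumes we already know, for every pixel, the fraction covered by each class (or country). The array layout, the codes and the function name are hypothetical; this is not how CCI Landcover is actually produced:

```python
import numpy as np

# Hypothetical coverage fractions, shape (n_classes, n_rows, n_cols)
fractions = np.array([
    [[0.7, 0.0], [0.2, 0.0]],  # class index 0
    [[0.3, 0.6], [0.5, 0.0]],  # class index 1
    [[0.0, 0.4], [0.3, 0.0]],  # class index 2
])
class_codes = np.array([10, 20, 30])  # made-up codes, one per class index
NO_DATA = 0


def primary_secondary(fractions, class_codes, no_data=NO_DATA):
    """Return two maps: the code with the largest and second-largest coverage."""
    # Sort class indices per pixel by descending coverage
    order = np.argsort(-fractions, axis=0, kind="stable")
    first, second = order[0], order[1]
    first_frac = np.take_along_axis(fractions, first[np.newaxis], axis=0)[0]
    second_frac = np.take_along_axis(fractions, second[np.newaxis], axis=0)[0]
    # A class only counts if it covers some part of the pixel
    primary = np.where(first_frac > 0, class_codes[first], no_data)
    secondary = np.where(second_frac > 0, class_codes[second], no_data)
    return primary, secondary


primary, secondary = primary_secondary(fractions, class_codes)
```

A pixel shared by two countries then carries both codes, one in each variable, and a pixel without any coverage is marked as no-data in both.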

meggart commented 8 years ago

I would support Norman's initial suggestions. If the data goes into the datacube per country we don't destroy any information a priori.

Then we provide tools that use the primary and secondary classification maps to bring this data onto the grid. However, if users want to do this step in a different way, they can do so, too. The main advantage I see here is that we can apply the same gridding approach to other classifications as well, such as PFT maps for PFT-specific properties or, as Norman mentioned, LandCover specifications.

Thinking about the API, we would have to decide what is returned if a user calls

cdata.get("GDP",time=datetime(2001,1,1))

Will this do the gridding internally and return a lon-lat map, or will a GDP value per country be returned? I would prefer to return the gridded maps by default and have special methods/options for requesting by-country or by-LandCover-type data.
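To make the two options concrete, here is a rough sketch of what "gridded by default, by-country on request" could look like. The class `CountryVariable`, its constructor and the `by_country` flag are invented for this illustration and do not describe the existing cube access API:

```python
from datetime import datetime

import numpy as np


class CountryVariable:
    """Hypothetical holder for one per-country variable, e.g. GDP."""

    def __init__(self, code_map, tables):
        self.code_map = code_map  # 2D array of country codes
        self.tables = tables      # {datetime: {country code: value}}

    def get(self, time, by_country=False):
        table = self.tables[time]
        if by_country:
            # Return the untouched per-country values
            return dict(table)
        # Default: paint the values onto the lon-lat grid
        grid = np.full(self.code_map.shape, np.nan)
        for code, value in table.items():
            grid[self.code_map == code] = value
        return grid


# Made-up 2x3 code map and values, for illustration only
code_map = np.array([[1, 1, 2], [0, 2, 2]])
gdp = CountryVariable(code_map, {datetime(2001, 1, 1): {1: 100.0, 2: 250.0}})

gdp_map = gdp.get(time=datetime(2001, 1, 1))
gdp_table = gdp.get(time=datetime(2001, 1, 1), by_country=True)
```

Keeping the table in the cube and gridding on access means the by-country values are never lost, and a user can still swap in a different gridding step.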

gdkrmr commented 8 years ago

Here are the World Bank and UN Country Codes: http://wits.worldbank.org/wits/wits/witshelp/Content/Codes/Country_Codes.htm

The World Bank country codes are NOT equal to the 3-letter ISO codes!!!

The 3-letter ISO Country Codes: https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3

The 2-letter ISO Country Codes: https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2

forman commented 8 years ago

There is already a Python package which can read the data: http://wbdata.readthedocs.io/en/latest/