GFDRR / rdl-standard

The Risk Data Library Standard (RDLS) is an open data standard to make it easier to work with disaster and climate risk data. It provides a common description of the data used and produced in risk assessments, including hazard, exposure, vulnerability, and modelled loss, or impact, data.
https://docs.riskdatalibrary.org/
Creative Commons Attribution Share Alike 4.0 International

[DATA] Access to global datasets from 3rd parties #31

Closed matamadio closed 11 months ago

matamadio commented 3 years ago

Overview

Global datasets are often the only source of data for country-scale risk analysis in developing countries. GFDRR gets a lot of requests for this kind of data to be used in projects. We need to decide whether and how these datasets will be accessed via RDL and its upcoming workflows.

Example data

Exposure:

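Hazard (see also sheet: https://worldbankgroup-my.sharepoint.com/:x:/g/personal/mamadio_worldbank_org1/EdmT0D0YD1BOqlBYOJNKiioBFsOmaC1mnzcGPPBSxzA5Zw?e=EdN7WZ)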

Use cases

Options

Options for the inclusion of data have balanced pros and cons.

1. Include in the catalogue as metadata; the download link points to the original source (download page), with links to an API (data access) and WMS (data view) whenever the source provides them (e.g. OSM).
   (+) No storage used
   (+) Always up to date (new versions, etc.)
   (-) Tied to the original data format, which may not be optimal (in particular for any pre-set analytical tool)
   (-) Becomes inaccessible if the source is discontinued

2. Include in the catalogue, pointing to a copy of the data in the RDL storage.
   (+) Can be reformatted as optimal for the workflow and aligned to the schema
   (+) Access granted independently of source changes
   (-) Storage used
   (-) Versions need manual updating
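
To make the difference between the two options concrete, here is a minimal Python sketch of how a catalogue record might carry either link. Everything in it is an assumption for illustration: `CatalogueEntry`, its field names, `resolve_download`, and the example.org URLs are hypothetical and are not taken from the RDLS schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CatalogueEntry:
    """Hypothetical catalogue record; field names are illustrative, not RDLS schema fields."""
    title: str
    risk_component: str                    # e.g. "hazard" or "exposure"
    source_url: str                        # download page of the original provider
    hosted_copy_url: Optional[str] = None  # only set when a copy is kept in RDL storage (option 2)


def resolve_download(entry: CatalogueEntry) -> str:
    """Return the hosted copy when one exists (option 2), else the original source (option 1)."""
    return entry.hosted_copy_url or entry.source_url


# Option 1: metadata only, download link points to the original source.
linked = CatalogueEntry(
    title="Example global flood hazard",
    risk_component="hazard",
    source_url="https://example.org/flood/download",
)

# Option 2: metadata plus a copy held in RDL storage.
hosted = CatalogueEntry(
    title="Example global population",
    risk_component="exposure",
    source_url="https://example.org/pop/download",
    hosted_copy_url="https://example.org/rdl-storage/pop.tif",
)

print(resolve_download(linked))
print(resolve_download(hosted))
```

Keeping `source_url` mandatory in both cases means a hosted copy can still be traced back to its provider, and a record can move from option 1 to option 2 later without changing its shape.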

stufraser1 commented 1 year ago

In the short term, let's create metadata in RDL format and provide the original URL. Longer term, we could look to bring some data across, but this is not urgent and would require the agreement of each data provider.


pzwsk commented 1 year ago

Agree that option 1, i.e. referencing and pointing to the original source through the Risk Data Library catalog, would already be a nice added value. I imagine we could communicate that the World Bank has improved its curation of global hazard layers thanks to the Risk Data Library Standard. Let's add this to the FY24 workplan.

For option 2, I would need a better understanding of the effort needed and the potential workflow, but I don't think our role is to become a data warehouse that transforms and maintains a copy of the datasets. Rather, we should seek collaboration with global data producers so that they provide the data according to our standards and/or transformation tools.