ScienceCore / climaterisk

Determining Climate Risks with NASA Earthdata Cloud
https://sciencecore.github.io/climaterisk/
Creative Commons Attribution Share Alike 4.0 International

Review: Geographic data formats #32

Closed. dhavide closed this issue 2 months ago

dhavide commented 5 months ago

The introductory materials need to include a brief notebook summarising commonly used geographic data formats.

In live teaching, this would probably take about 5-10 minutes to deliver (depending on the audience's prior experience).

kvenkman commented 5 months ago

For the contents of this tutorial, I think it would be sufficient to cover raster and vector data, in the vein of this blog post.

This article provides a good explanation of GeoTIFFs and the metadata contained in the header.
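To make the point about header metadata concrete, here is a minimal sketch, using only the Python standard library, that lists the tags in the first image file directory (IFD) of a classic TIFF and picks out the ones that make it a GeoTIFF. The tag IDs come from the GeoTIFF specification; the helper names (`read_tiff_tag_ids`, `build_demo_tiff`) and the header-only demo file are assumptions made up for illustration, not part of the tutorial. In practice a library such as GDAL or rasterio would do this for us.

```python
import struct

# Tag IDs that the GeoTIFF specification adds on top of baseline TIFF.
GEOTIFF_TAGS = {
    33550: "ModelPixelScaleTag",      # pixel size in map units
    33922: "ModelTiepointTag",        # ties a pixel location to a map coordinate
    34264: "ModelTransformationTag",  # full affine transform (alternative to the two above)
    34735: "GeoKeyDirectoryTag",      # index of the CRS/projection keys
    34736: "GeoDoubleParamsTag",
    34737: "GeoAsciiParamsTag",
}


def read_tiff_tag_ids(data):
    """Return the tag IDs found in the first IFD of a classic (non-Big) TIFF."""
    if data[:2] == b"II":
        endian = "<"  # little-endian
    elif data[:2] == b"MM":
        endian = ">"  # big-endian
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF uses magic number 43)")
    (n_entries,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    tag_ids = []
    for i in range(n_entries):
        start = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes long
        (tag_id,) = struct.unpack(endian + "H", data[start:start + 2])
        tag_ids.append(tag_id)
    return tag_ids


def build_demo_tiff(tag_ids):
    """Build a tiny header-only TIFF in memory (hypothetical demo data, no pixels)."""
    header = struct.pack("<2sHI", b"II", 42, 8)  # byte order, magic, offset of first IFD
    ifd = struct.pack("<H", len(tag_ids))
    for tag_id in tag_ids:
        ifd += struct.pack("<HHII", tag_id, 3, 1, 0)  # type 3 = SHORT, count 1, dummy value
    ifd += struct.pack("<I", 0)  # offset of next IFD; 0 means there is none
    return header + ifd


# 256/257 are the baseline ImageWidth/ImageLength tags; the rest are GeoTIFF tags.
demo = build_demo_tiff([256, 257, 33550, 33922, 34735])
geo_tags = [GEOTIFF_TAGS[t] for t in read_tiff_tag_ids(demo) if t in GEOTIFF_TAGS]
print(geo_tags)
```

The takeaway for participants is that a GeoTIFF is an ordinary TIFF whose header carries a few extra tags describing pixel size, georeferencing, and the coordinate reference system.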

Lastly, I think this is a good article to go over on two prominent vector data formats.

dhavide commented 5 months ago

@marielaraj, I've assigned this issue to you to start.

A first PR need not be complicated; simply a Markdown document with relevant bullet items as a high-level overview. This can evolve later to be converted to slides and have a script for the speakers, but start with this. You can parse the links cited in this thread to get started.

@kvenkman, can you please mention in this thread which specific data formats (no more than 4 or 5) would be most beneficial to cover in this section (GeoTIFF, GeoJSON, etc.)? That is, in the notebook examples covered in, say, #36, #37, & #38, which specific data formats are explicitly used that users would need some familiarity with to understand those notebooks?

kvenkman commented 5 months ago

OPERA DSWx raster data is specified as GeoTIFFs, so covering this data format is a must. For vector data, covering the GeoJSON and SHP (shapefile) formats will suffice.
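As a rough illustration of the vector side, here is a minimal sketch that builds a GeoJSON `FeatureCollection` for a rectangular area of interest using only the standard library. The function name `bbox_to_feature`, the coordinates, and the `name` property are hypothetical and chosen just for the example. GeoJSON (RFC 7946) stores positions in longitude, latitude order and requires polygon rings to be closed.

```python
import json


def bbox_to_feature(min_lon, min_lat, max_lon, max_lat, name="aoi"):
    """Return a GeoJSON Feature whose geometry is the bounding-box polygon."""
    ring = [
        [min_lon, min_lat],
        [max_lon, min_lat],
        [max_lon, max_lat],
        [min_lon, max_lat],
        [min_lon, min_lat],  # repeat the first position to close the ring
    ]
    return {
        "type": "Feature",
        "properties": {"name": name},
        # A Polygon is a list of rings; the first ring is the exterior boundary.
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }


# Hypothetical AOI, given as (longitude, latitude) corners in degrees.
feature = bbox_to_feature(-122.6, 37.6, -122.2, 37.9, name="hypothetical AOI")
collection = {"type": "FeatureCollection", "features": [feature]}

text = json.dumps(collection, indent=2)  # GeoJSON is plain JSON text
round_trip = json.loads(text)
print(text)
```

A shapefile holds the same kind of information (geometries plus attributes), but as a set of binary files (`.shp`, `.shx`, `.dbf`) rather than a single text file, so it is normally read through a library rather than by hand.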

dhavide commented 4 months ago

As a placeholder: we need to have a discussion about coordinate systems and projections somewhere (so that participants can understand relationships between, e.g., latitude/longitude & easting/northing as they may be extracted from granule metadata or for specifying an AOI).

I don't yet know where this discussion belongs; for now, I think it belongs here in the material on geographic data formats (because coordinates & projection parameters are likely embedded as metadata within geographic data formats).
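As a starting point for that discussion, here is a minimal sketch of how latitude/longitude relate to easting/northing in a UTM zone on the WGS84 ellipsoid. It uses the truncated Krüger series for the transverse Mercator projection and only the standard library. The function names are hypothetical, and the sketch ignores the special zone boundaries around Norway and Svalbard. For real work, a library such as pyproj would be the safer choice.

```python
import math

A_WGS84 = 6378137.0            # semi-major axis in metres
F_WGS84 = 1 / 298.257223563    # flattening
K0 = 0.9996                    # UTM scale factor on the central meridian


def utm_zone(lon_deg):
    """Standard 6-degree UTM zone number (1-60) for a longitude in degrees."""
    return int((lon_deg + 180.0) // 6) % 60 + 1


def latlon_to_utm(lat_deg, lon_deg):
    """Return (easting, northing, zone, epsg) for a WGS84 latitude/longitude."""
    zone = utm_zone(lon_deg)
    lon0 = math.radians(6 * zone - 183)  # central meridian of the zone
    phi = math.radians(lat_deg)
    dlam = math.radians(lon_deg) - lon0

    n = F_WGS84 / (2 - F_WGS84)  # third flattening
    big_a = A_WGS84 / (1 + n) * (1 + n**2 / 4 + n**4 / 64)
    alpha = (
        n / 2 - 2 * n**2 / 3 + 5 * n**3 / 16,
        13 * n**2 / 48 - 3 * n**3 / 5,
        61 * n**3 / 240,
    )

    ecc = 2 * math.sqrt(n) / (1 + n)  # first eccentricity of the ellipsoid
    t = math.sinh(math.atanh(math.sin(phi)) - ecc * math.atanh(ecc * math.sin(phi)))
    xi = math.atan2(t, math.cos(dlam))
    eta = math.atanh(math.sin(dlam) / math.sqrt(1 + t * t))

    east_sum = sum(a * math.cos(2 * j * xi) * math.sinh(2 * j * eta)
                   for j, a in enumerate(alpha, start=1))
    north_sum = sum(a * math.sin(2 * j * xi) * math.cosh(2 * j * eta)
                    for j, a in enumerate(alpha, start=1))

    easting = 500000.0 + K0 * big_a * (eta + east_sum)  # false easting of 500 km
    northing = K0 * big_a * (xi + north_sum)
    if lat_deg < 0:
        northing += 10000000.0  # false northing in the southern hemisphere

    # EPSG codes for WGS84 / UTM: 326xx in the north, 327xx in the south.
    epsg = (32600 if lat_deg >= 0 else 32700) + zone
    return easting, northing, zone, epsg


# Hypothetical point (roughly San Francisco) to show the two coordinate pairs.
easting, northing, zone, epsg = latlon_to_utm(37.77, -122.42)
print(f"zone {zone}, EPSG:{epsg}, easting {easting:.1f} m, northing {northing:.1f} m")
```

The same point is described either by angles on the ellipsoid (latitude/longitude) or by metres within a zone (easting/northing), and the EPSG code is what a raster's metadata would typically use to say which of these systems its coordinates are in.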

dhavide commented 4 months ago

> OPERA DSWx raster data is specified as GeoTIFFs, so covering this data format is a must. For vector data, covering the GeoJSON and SHP (shapefile) formats will suffice.

To make sure that I understand Karthik's answer, in this section of our tutorial, we must cover/explain the basic concepts of:

- GeoTIFF (raster)
- GeoJSON (vector)
- SHP, i.e., shapefiles (vector)

For the time being, @marielaraj, work on a brief introduction/explanation of these three data formats in a Markdown file or notebook. Explain in enough detail that someone with minimal familiarity with geospatial data can work out what kind of data is actually pulled down in this tutorial and how to use it.

@kvenkman: in the OPERA application examples that we are currently using in this tutorial, are there any raster data formats other than GeoTIFFs that we need to explain? If we draw on one of the other OPERA examples, would we conceivably need to explain any other raster format?

kvenkman commented 4 months ago

@dhavide Nope, we'll only be using GeoTIFFs in this tutorial. We can make a passing mention of cloud-optimized GeoTIFFs (COGs), but they won't be used here.