TerriaJS / terriajs

A library for building rich, web-based geospatial data platforms.
https://terria.io
Apache License 2.0

Csv-geo-au is growing beyond csv, geo and au #2111

Open RacingTadpole opened 8 years ago

RacingTadpole commented 8 years ago

We are now using csv-geo-au for things:

I wonder if we should pull out each of these pieces into their own standards, and define csv-geo-au in terms of them?

RacingTadpole commented 8 years ago

@LegoStormtroopr suggested in another channel:

What you are describing is recording all of the structural metadata around particular columns.

We could record all of that metadata around those Data Elements, and then with very minor tweaking NationalMap could then interpret that metadata to understand how to display data on NationalMap.


Let me see if I understand… do you mean we could format the wordy spec at https://github.com/TerriaJS/nationalmap/wiki/csv-geo-au into a machine-readable form, which TerriaJS can then read?

To which @LegoStormtroopr replied:

At the least, for the columns yes.

RacingTadpole commented 8 years ago

Cool! I’m keen to give it a try (with a side motivation to help me get my head around the concept of metadata). How do I start?

tobybellwood commented 8 years ago

This sort of approach came up in a conversation with ABS geography around the concepts of statistical geography yesterday.

I've always been keen to see the csv-geo-* get legs, but hadn't thought of csv-- !

I'm also interested in whether data packages could be used to carry this extended meaning (possibly generated by future dga?) http://frictionlessdata.io/data-packages/

stevage commented 8 years ago

Just fwiw, "geocsv" is some kind of de facto standard used by Mapbox and some others: https://github.com/mapbox/detect-geocsv

RacingTadpole commented 8 years ago

Looks like the crux of that is, first establish that it is csv, then check for these column names:

hasOne(lowerCaseNames, ['wkt', 'geom', 'geometry', 'geojson']) ||
        ((hasOne(lowerCaseNames, ['x', 'lon', 'lng', 'long']) ||
            hasOneThatContains(lowerCaseNames, 'longitude')) &&
        (hasOne(lowerCaseNames, ['y', 'lat']) ||
            hasOneThatContains(lowerCaseNames, 'latitude')));
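
For anyone who wants to try that check on a header row, here is a minimal runnable sketch. The quoted expression does not show how `hasOne` and `hasOneThatContains` are defined, so the two helpers below are assumptions about their behaviour, and `looksLikeGeoCsv` is a hypothetical wrapper name rather than anything exported by detect-geocsv.

```javascript
// Assumed behaviour: true if any header exactly equals one of the candidates.
function hasOne(names, candidates) {
  return names.some((name) => candidates.includes(name));
}

// Assumed behaviour: true if any header contains the given fragment.
function hasOneThatContains(names, fragment) {
  return names.some((name) => name.includes(fragment));
}

// Hypothetical wrapper: applies the quoted column-name test to a header row.
function looksLikeGeoCsv(headerNames) {
  const lowerCaseNames = headerNames.map((name) => String(name).toLowerCase());
  return (
    hasOne(lowerCaseNames, ['wkt', 'geom', 'geometry', 'geojson']) ||
    ((hasOne(lowerCaseNames, ['x', 'lon', 'lng', 'long']) ||
      hasOneThatContains(lowerCaseNames, 'longitude')) &&
      (hasOne(lowerCaseNames, ['y', 'lat']) ||
        hasOneThatContains(lowerCaseNames, 'latitude')))
  );
}
```

Under these assumptions, a header row needs either a single geometry column or both a longitude-like and a latitude-like column; a lone `lat` column would not pass.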

RacingTadpole commented 8 years ago

@tobybellwood thanks for the data-packages link. That could be useful as another way to tell terriajs how to interpret other files (like our custom "init" json-format does now, but easier to construct). Is that what you meant? I can't see how to use data-packages to define a standard though.

stevage commented 8 years ago

Ok @tobybellwood so I'll just recap how the Data Packages work for everyone's benefit:

All these structures are extensible, and it's ok to include extra attributes not explicitly defined in the spec.
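
To make the extensibility point concrete, here is a rough sketch of a descriptor with one extra attribute, written as a JavaScript object. The `name`, `resources`, `path` and `schema.fields` keys follow the Data Package layout as I understand it; the `concept` attribute on each field, the file name and the `conceptColumns` helper are all made up for illustration and are not part of any spec.

```javascript
// Example descriptor. The `concept` attribute is a hypothetical extension.
const descriptor = {
  name: 'example-package',
  resources: [
    {
      name: 'sites',
      path: 'sites.csv', // hypothetical file
      schema: {
        fields: [
          { name: 'site_name', type: 'string' },
          { name: 'lat', type: 'number', concept: 'latitude' },
          { name: 'lon', type: 'number', concept: 'longitude' }
        ]
      }
    }
  ]
};

// Hypothetical helper: collect { concept: columnName } across all resources,
// ignoring fields that carry no `concept` attribute.
function conceptColumns(pkg) {
  const result = {};
  for (const resource of pkg.resources || []) {
    const fields = (resource.schema && resource.schema.fields) || [];
    for (const field of fields) {
      if (field.concept) {
        result[field.concept] = field.name;
      }
    }
  }
  return result;
}
```

A consumer that does not know about `concept` can simply ignore it, which is what makes this kind of extension cheap to experiment with.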

So, what can we do with it?

LegoStormtroopr commented 8 years ago

I'd go a bit further and suggest that if data.gov.au v2.0 is going to have a metadata registry associated with it, defining and publishing all of the structural metadata would be beneficial for a number of reasons:

  1. They would help us demonstrate why metadata is useful to describe in a very granular way.
  2. Means geo-csv is less constrained by column headers - if a column maps to a data element which maps to the latitude concept, then NationalMap can utilise it. Likewise, if it is a state, statistical boundary, etc...
  3. We could publish (lossy) tabular JSON from the metadata.
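
As a rough illustration of point 2, the sketch below resolves column headers to concepts through a two-step lookup (column to data element, data element to concept), so the header text itself no longer matters. The registry contents, the identifiers such as `de:site-latitude`, and the `resolveConcepts` function are all hypothetical; a real metadata registry would presumably be queried over a service rather than held in an object.

```javascript
// Hypothetical registry: data element id -> the concept it represents.
const dataElementRegistry = {
  'de:site-latitude': { concept: 'latitude' },
  'de:site-longitude': { concept: 'longitude' },
  'de:state-code': { concept: 'state' }
};

// Hypothetical per-dataset metadata: column header -> data element id.
const columnToDataElement = {
  northing_deg: 'de:site-latitude',
  easting_deg: 'de:site-longitude',
  ste: 'de:state-code'
};

// Resolve each header to a concept via its data element.
// Headers with no metadata, or with an unknown element, are skipped.
function resolveConcepts(headers, columnMap, registry) {
  const resolved = {};
  for (const header of headers) {
    const element = registry[columnMap[header]];
    if (element) {
      resolved[element.concept] = header;
    }
  }
  return resolved;
}
```

With this shape, a column called `northing_deg` can still be treated as latitude, because the decision rests on the metadata rather than on the header name.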

What we'd need to accomplish this are services to: