Open nvkelso opened 7 years ago
I would only change this to be src:via, or an equivalent prefix + ":" + key pair, to be consistent with everything else.
Works for me :)
Seems like most of the above applies to the whosonfirst-data repo.
To give credit to our src:via sources, we'll also need to elevate some of the buried remarks (like those for Quattroshapes) so they are listed directly in the big sources README. That way there is one page with all the sources on it, which consumers of Who's On First data can link to from their apps to give credit where credit is due.
All of these need to be printed in a section under https://github.com/whosonfirst/whosonfirst-sources/blob/master/sources/README.md#quattroshapes
After the license bullet point, add a new paragraph with:
This source includes data from the following organizations:
With bullet points listed below it, alphabetically, e.g.:
That list needs to come from a new JSON list in the quattroshapes.json source.
Ideally it could contain HTML text with hyperlinks (?) since I think we had problems with Markdown before.
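A minimal sketch of how that README paragraph could be generated from such a list. It assumes the list lives under a "src:via" key as a list of objects with the source_name and source_link fields shown in the meso.json example in this thread; the function name render_via_section and the sample entries are hypothetical.

```python
import html


def render_via_section(source):
    """Return the 'includes data from' paragraph plus an alphabetical bullet list."""
    entries = source.get("src:via", [])
    if not entries:
        return ""
    lines = ["This source includes data from the following organizations:", ""]
    # Sort case-insensitively so the bullets come out alphabetically.
    for entry in sorted(entries, key=lambda e: e["source_name"].lower()):
        name = html.escape(entry["source_name"])
        link = entry.get("source_link", "")
        if link:
            # Emit an HTML anchor rather than Markdown link syntax.
            lines.append('* <a href="{}">{}</a>'.format(html.escape(link, quote=True), name))
        else:
            lines.append("* " + name)
    return "\n".join(lines)


# Placeholder entries, for illustration only.
sample = {
    "src:via": [
        {"source_name": "Tanzania National Bureau of Statistics (TNBS)", "source_link": ""},
        {"source_name": "Example Agency", "source_link": "https://example.com/"},
    ]
}
print(render_via_section(sample))
```

Entries without a source_link fall back to plain text, so a partially filled-in list still renders.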
The textual description part of this here in the sources repo is done.
Leaving this issue open as there is related work to follow up on.
For this county in Tanzania:
Let's pretend it has the following properties:
"src:geom" = "meso"
"src:geom_alt" = ["naturalearth","quattroshapes"]
"meso:source" = "TNBS"
"qs:source" = "statscan"
We want to track the sources of sources generically, in a predictable, machine-readable way that doesn't need constant reshuffling as default and alt geoms are swapped around, without adding more sources JSONs, and making use of the existing "src:via" properties in the sources JSON that we added recently. In this case Mesoshapes includes data from "TNBS", and let's pretend that Quattroshapes includes data from "statscan".
NOTE: This new property would only be added to WOF records whose source is itself composed of multiple sources (e.g. Mesoshapes, Quattroshapes, and other *shapes sources); in that case all sources would be listed out in the extended format. Otherwise nothing changes.
We propose to add a new "src_via" prefix that accepts the same property names as src, but stores its value as a list of lists (versus a string for geom and a list for geom_alt), because any one source can actually be composed of multiple sources:
"src_via:geom" = [["meso:tza_tnbs"],["naturalearth"],["quattroshapes:statscan"]]
Another example:
"src_via:population" = [["statoids:othercensus"],["uscensus"]]
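A rough sketch of how "src_via:geom" could be derived from the Tanzania example properties above. The two lookup tables are hypothetical: one maps a source to the property that records its upstream source, the other maps an upstream label such as "TNBS" to a code such as "tza_tnbs". Neither exists in the sources repo as far as this thread shows.

```python
# Hypothetical lookup tables, for illustration only.
UPSTREAM_PROPERTY = {"meso": "meso:source", "quattroshapes": "qs:source"}
SOURCE_CODES = {("meso", "TNBS"): "tza_tnbs"}


def build_src_via_geom(props):
    """Build the list-of-lists value for "src_via:geom" from a record's properties."""
    sources = [props["src:geom"]] + list(props.get("src:geom_alt", []))
    result = []
    for source in sources:
        prop_name = UPSTREAM_PROPERTY.get(source)
        upstream = props.get(prop_name) if prop_name else None
        if upstream is None:
            # A plain source with no upstream source is listed on its own.
            result.append([source])
            continue
        # Fall back to the raw upstream label when no code is registered.
        code = SOURCE_CODES.get((source, upstream), upstream)
        result.append([source + ":" + code])
    return result


props = {
    "src:geom": "meso",
    "src:geom_alt": ["naturalearth", "quattroshapes"],
    "meso:source": "TNBS",
    "qs:source": "statscan",
}
print(build_src_via_geom(props))
```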
Then in the sources repo (this repo), modify the meso.json:
From:
"src:via" : {
    "context": "Tanzania",
    "source_link": "",
    "source_name": "Tanzania National Bureau of Statistics (TNBS)",
    "source_note": ""
},
Add: "source_code": "tza_tnbs"
"src:via" : {
    "context": "Tanzania",
    "source_link": "",
    "source_name": "Tanzania National Bureau of Statistics (TNBS)",
    "source_code": "tza_tnbs",
    "source_note": ""
},
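With a source_code on each entry, a value like "meso:tza_tnbs" can be resolved back to a credit line. The sketch below assumes "src:via" holds a list of such entries and that the sources JSON files have been parsed into a dict keyed by source name; resolve_via and SOURCES are hypothetical names.

```python
def resolve_via(value, sources):
    """Split a 'source:code' value and look up the matching "src:via" entry.

    Returns (source, entry). The entry is None for unprefixed values such as
    "naturalearth", or when no entry carries that source_code.
    """
    source, _, code = value.partition(":")
    for entry in sources.get(source, {}).get("src:via", []):
        if entry.get("source_code") == code:
            return source, entry
    return source, None


# Hypothetical in-memory stand-in for the parsed sources JSON files.
SOURCES = {
    "meso": {
        "src:via": [
            {
                "context": "Tanzania",
                "source_link": "",
                "source_name": "Tanzania National Bureau of Statistics (TNBS)",
                "source_code": "tza_tnbs",
                "source_note": "",
            }
        ]
    }
}

source, entry = resolve_via("meso:tza_tnbs", SOURCES)
print(source, "via", entry["source_name"])
```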
Does this need to be a different structure?
"src_via:geom" = {
    "meso": [
        "tza_tnbs"
    ],
    "naturalearth": [
        "naturalearth"
    ],
    "quattroshapes": [
        "statscan"
    ]
}
And should we riff on "src:via" ala "src_via" instead of "src_src"? (updated to src_via).
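The two candidate structures carry the same information, so converting between them is mechanical. A small sketch, assuming the "source:code" convention from the list-of-lists example; via_lists_to_dict is a hypothetical helper name.

```python
def via_lists_to_dict(via):
    """Convert the list-of-lists form of src_via into the keyed-by-source form."""
    grouped = {}
    for group in via:
        for value in group:
            source, sep, code = value.partition(":")
            # Unprefixed sources (e.g. "naturalearth") are listed under themselves.
            grouped.setdefault(source, []).append(code if sep else source)
    return grouped


via = [["meso:tza_tnbs"], ["naturalearth"], ["quattroshapes:statscan"]]
print(via_lists_to_dict(via))
```

The keyed form makes lookups by source direct, while the list form keeps each value self-describing; which one is preferable is the open question in this comment.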
@nvkelso - the example in https://github.com/whosonfirst/whosonfirst-sources/issues/40#issuecomment-399602996 makes more sense.
Flagging @thisisaaronland for comments. We'd like to make this change next week.
With regards to the source_code key, I would change it to source_prefix, since that's what it is.
Likewise, I would consider changing all the source_* keys to be src:*, since the src prefix has historically been used as a pointer to "whosonfirst-sources".
src_via seems fine, but I am not sure I understand why some of the examples have lists of lists, like this:
"src_via:geom" = [["meso:tza_tnbs"],["naturalearth"],["quattroshapes:statscan"]]
Like why wouldn't it just be:
"src_via:geom" = ["meso:tza_tnbs","naturalearth","quattroshapes:statscan"]
> With regards to the source_code key I would change it to source_prefix since that's what it is.
👍
> src_via seems fine but I am not sure I understand why some of the examples have lists of lists, like this:
That's because some sources include multiple sources so they need to be lists of lists.
Okay.
> Likewise I would consider changing all the source_* keys to be src:* since the src prefix has historically been used as a pointer to "whosonfirst-sources".
@stepps00 ⏫
Right now we have data from Quattroshapes that actually originates from multiple different sources. Each source needs to be credited, so we need a consistent WOF property to deal with this.
I propose a new property like src:via (was src_via originally), where the src should state the original source, and then we should credit the data aggregator in src:via as well.
Examples:
"qs:source" value of "AUS Census" (should just be US Census, oops) and "src:geom" of quattroshapes. "src:geom" should be uscensus instead, with "src_via" set to quattroshapes.
"meso:source" value of "EDP", though no EDP.json file is currently in the sources folder. "src:geom" should be eep instead, with "src_via" set to meso.
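The rewrite described in these examples could be sketched roughly as follows. The mapping from an upstream label ("US Census") to a source name ("uscensus") is hypothetical, as is the function name; the via key is a parameter because the thread uses both src_via and src:via.

```python
# Hypothetical lookup tables, for illustration only.
UPSTREAM_PROPERTY = {"quattroshapes": "qs:source", "meso": "meso:source"}
UPSTREAM_TO_SOURCE = {"US Census": "uscensus"}


def credit_original_source(props, via_key="src_via"):
    """Point src:geom at the original source and record the aggregator under via_key.

    Records whose upstream source has no known mapping are returned unchanged.
    """
    aggregator = props.get("src:geom")
    prop_name = UPSTREAM_PROPERTY.get(aggregator)
    upstream = props.get(prop_name) if prop_name else None
    original = UPSTREAM_TO_SOURCE.get(upstream)
    updated = dict(props)  # never mutate the caller's record
    if original is not None:
        updated["src:geom"] = original
        updated[via_key] = aggregator
    return updated


record = {"src:geom": "quattroshapes", "qs:source": "US Census"}
print(credit_original_source(record))
```

Leaving unmapped records untouched covers cases like the EDP example, where no matching sources file exists yet.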