Closed ColmMassey closed 3 years ago
Adapting the existing ICA queries: there's a locality field, though I'm not sure it corresponds to Region.
Assuming so for now, this gets a distinct, ordered list of them:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX vcard: <http://www.w3.org/2006/vcard/ns#>
PREFIX essglobal: <https://w3id.solidarityeconomy.coop/essglobal/V2a/vocab/>
PREFIX : <https://dev.lod.coop/ica>
SELECT DISTINCT ?locality
WHERE {
?uri rdf:type essglobal:SSEInitiative .
?uri essglobal:hasAddress ?addr .
?addr vcard:locality ?locality .
OPTIONAL { ?addr vcard:country-name ?country . }
}
ORDER BY ?locality
These do seem a bit ad-hoc.
And for country:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX vcard: <http://www.w3.org/2006/vcard/ns#>
PREFIX essglobal: <https://w3id.solidarityeconomy.coop/essglobal/V2a/vocab/>
PREFIX : <https://dev.lod.coop/ica>
SELECT DISTINCT ?country
WHERE {
?uri rdf:type essglobal:SSEInitiative .
?uri essglobal:hasAddress ?addr .
?addr vcard:country-name ?country .
}
ORDER BY ?country
Likewise, I wonder how consistent the ICA countries are.
Region and Super-region will be similar I expect, but need the schema to be adapted to include this information.
The devil will be in the details of fixing inconsistencies.
I am pretty sure now that we are not correctly loading the required data into Virtuoso for us to be able to make the queries required to generate the list of supported terms for Typology, Structure or Economic activity. I had thought that the data for Structure might have been present, as it looked like organisational-structure.skos was being loaded into every graph. The following query works if you directly load the file organisational-structure.skos, which sits here:
https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/organisational-structure
prefix skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?s ?o
WHERE {
?s <http://www.w3.org/2004/02/skos/core#prefLabel> ?o
FILTER (lang(?o) = "EN")
}
ORDER BY ?o
The same query comes back empty-handed if attempted on the SPARQL server.
I think we can postpone changing what we load into Virtuoso for the moment. All we need is a JavaScript equivalent of the Python rdflib library. The following short script will list all the current labels for each VES.
import rdflib

def listTerms(sSKOSName, sLanguage):
    # Fetch the SKOS file for the named vocabulary and print each
    # preferred label that is tagged with the requested language.
    sSKOSURI = "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/" + sSKOSName
    sQuery = '''
        prefix skos: <http://www.w3.org/2004/02/skos/core#>
        SELECT DISTINCT ?s ?o
        WHERE
        {
            ?s <http://www.w3.org/2004/02/skos/core#prefLabel> ?o
            FILTER (lang(?o) = "''' + sLanguage + '''")
        }
        ORDER BY ?o
    '''
    g = rdflib.Graph()
    g.parse(sSKOSURI)
    qres = g.query(sQuery)
    for row in qres:
        print(row)
#end def
I had an idea about how we could encode which parts of ESSGlobal were of interest to the different projects. For example, we know ICA is only interested in a subset of the organisational-structure terms. We could tag each of these terms in the data, thus reducing the exception-handling JavaScript even further.
So listTerms would just have an extra optional parameter indicating if a client-specific subset was required.
So we would add something like the following to say that a specific term is used by the ICA.
<https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/organisational-structure/OS180> <http://schema.org/DefinedTerm> <https://dbpedia.org/page/International_Co-operative_Alliance>
or
<https://dbpedia.org/page/International_Co-operative_Alliance> <http://schema.org/hasDefinedTerm> <https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/organisational-structure/OS180>
One of these for each term used by the ICA
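A minimal sketch of that idea in Python, assuming the tagging triples have already been read out of the data as (client, term) pairs. The function name list_terms, its client parameter and the placeholder labels are all illustrative; none of them exist in our code yet.

```python
from typing import Optional

BASE = "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/organisational-structure/"
ICA = "https://dbpedia.org/page/International_Co-operative_Alliance"

# Placeholder labels, not the real ESSGlobal ones.
LABELS = {
    BASE + "OS10": "placeholder label for OS10",
    BASE + "OS180": "placeholder label for OS180",
}

# One (client, term) pair per tagging triple found in the data (assumed shape).
TAGS = {(ICA, BASE + "OS180")}

def list_terms(labels, tags, client: Optional[str] = None):
    """Return (term, label) pairs sorted by label.

    If `client` is given, keep only the terms tagged for that client.
    """
    wanted = list(labels.items())
    if client is not None:
        wanted = [(term, label) for term, label in wanted if (client, term) in tags]
    return sorted(wanted, key=lambda pair: pair[1])

print(list_terms(LABELS, TAGS))              # every term
print(list_terms(LABELS, TAGS, client=ICA))  # only the ICA subset
```

With no client the full vocabulary comes back, so existing callers would not need to change.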
@wu-lee. I recently said that we should do the querying of the ESSGlobal terms via our SPARQL endpoints, but from my recent testing we can't do that right now, and it is remarkably easy to do by getting single files, as I demonstrate above in Python. So I think it would be pretty easy for @King-Mob to identify a JavaScript library with the same functionality and port the script. Do you agree? (The endpoints are useful when we are doing multi-graph queries.)
@ColmMassey from my perspective, I could port that script assuming the right library exists
The easiest thing that would work from a JavaScript perspective is to get the sausage machine to write a JSON index of these things onto our static data server somewhere. No libraries needed at all then. Possibly JSON-LD, but I don't yet know much about that.
The easiest thing
I can't see how that would help us in the short term, but providing JSON-LD serialisations for everything we publish is on our roadmap.
In sea-map, loading definitions from JSON is essentially a one-liner, no extra cognitive overhead to research and install new libraries. The RDF munging would all be done in Ruby, as it is already. So definitely a shortcut IMO.
And yes, it aligns with future work.
Okay. Next step will be to look at JSON-LD serialisations of the ESSGlobal SKOS files to see what they look like and if they could be used as is by @King-Mob. Need to consider future compatibility with internationalisation.
Then need to look at options to restrict vocabularies to subsets of the full set for different users.
@King-Mob can you load this JSON file and see how easily you can query it to get a list of org structure strings, in English? I couldn't attach it as a JSON file for some reason, so I renamed the suffix from json to txt: organisational-structure.txt
@wu-lee suggests you should be able to do it without any rdf libraries.
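As a rough illustration of why no RDF library should be needed, here is a Python sketch that pulls labels for one language out of JSON-LD using only the standard json module. It assumes the file is in expanded JSON-LD form (a top-level list of nodes keyed by full IRIs); I haven't checked that the attached file has that shape, and the sample data and the example.org URIs below are made up.

```python
import json

PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"

# A hand-written stand-in for the attached file, in expanded JSON-LD form.
SAMPLE = '''
[
  {
    "@id": "https://example.org/vocab/t1",
    "http://www.w3.org/2004/02/skos/core#prefLabel": [
      {"@language": "en", "@value": "Example term"},
      {"@language": "fr", "@value": "Terme d'exemple"}
    ]
  },
  {
    "@id": "https://example.org/vocab/t2",
    "http://www.w3.org/2004/02/skos/core#prefLabel": [
      {"@language": "EN", "@value": "Another term"}
    ]
  }
]
'''

def labels_in(nodes, language):
    """Map each node's @id to its prefLabel in `language` (compared case-insensitively)."""
    found = {}
    for node in nodes:
        for label in node.get(PREF_LABEL, []):
            if label.get("@language", "").lower() == language.lower():
                found[node["@id"]] = label["@value"]
    return found

terms = labels_in(json.loads(SAMPLE), "en")
print(sorted(terms.values()))
```

The same loop translates almost line for line into JavaScript, since the data is plain nested objects and arrays.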
So I've deployed the ICA data to the dev Virtuoso server with all the relevant .skos files (I think). It copies the files from the vocabs.solidarityeconomy.coop website and imports them. Specifically, before it included only:
Now it also includes
I've been trying to query it and get something meaningful - essentially a list of terms for a category. The .skos files are RDF/XML, which makes my eyes bleed, so I've converted the activities.skos file into Turtle format. Here is a snippet, which defines assertions about the activity A01:
<http://purl.org/essglobal/standard/activities/a01> a skos:Concept ;
    skos:altLabel "A",
        "Agriculture, forestry and fishing" ;
    skos:inScheme <http://purl.org/essglobal/standard/activities/> ;
    skos:prefLabel "Agriculture and environment"@EN,
        "Agricultura y medio ambiente"@ES,
        "Agriculture et environnement"@FR,
        "Agricultura e ambiente"@PT ;
    skos:scopeNote "Integrates the sectors of agriculture, forestry and fishing"@EN,
        "Integra los sectores de la agricultura, la silvicultura y la pesca"@ES,
        "Intègre les secteurs de l'agriculture, la sylviculture et la pêche"@FR,
        "Integra os setores da agricultura, silvicultura e pesca"@PT .
So I was expecting something like this would work to get a list of activity terms:
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT * WHERE {
?a a skos:Concept .
?a skos:inScheme <http://purl.org/essglobal/standard/activities/> .
}
But it gets nothing. @ColmMassey, do you understand why this doesn't work?
Similarly, on the SKOS reference page it says:
The graph below states that <MyConcept> is a SKOS concept (i.e., an instance of skos:Concept).
<MyConcept> rdf:type skos:Concept .
However, this query gets nothing out of the database:
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT * WHERE {
?c rdf:type skos:Concept .
}
In practice, it seems I have to do this:
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT * WHERE {
?c rdfs:subClassOf skos:Concept .
}
so I've converted the activities.skos file into turtle format
Nice move. Much easier to read.
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT * WHERE { ?a a skos:Concept . ?a skos:inScheme <http://purl.org/essglobal/standard/activities/> . }
SELECT DISTINCT ?s ?p ?o
FROM <https://dev.lod.coop/ica>
WHERE
{
?s ?p ?o .
FILTER (regex(?o, "http://www.w3.org/2004/02/skos/core#Concept", "i"))
}
Returns the triples defining each vocabulary as subclasses of skos:Concept
SELECT ?s
FROM <https://dev.lod.coop/ica>
WHERE
{
?s <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://www.w3.org/2004/02/skos/core#Concept> .
}
lists the vocabs on their own
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX w3: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?s
FROM <https://dev.lod.coop/ica>
WHERE
{
?s w3:subClassOf skos:Concept .
}
Still works, but if you use rdf: instead of w3: you get nothing back
Still works, but if you use rdf: instead of w3: you get nothing back
I think that's because http://www.w3.org/1999/02/22-rdf-syntax-ns defines RDF (which doesn't include subClassOf), whereas http://www.w3.org/2000/01/rdf-schema defines the extension RDFS (which does).
I'd expect an error about this however - SPARQL seems to carry on regardless.
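That silence is expected behaviour rather than a bug: a triple pattern is only something to match against the stored triples, and SPARQL does not check whether an IRI is actually defined by the vocabulary its prefix points at. A predicate that appears nowhere in the data simply yields zero rows. A toy matcher in Python shows the same effect; it is not how Virtuoso is implemented, and the subject URI is made up.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# A single stored triple with a hypothetical subject.
TRIPLES = [
    ("https://example.org/vocab/Thing", RDFS + "subClassOf", SKOS + "Concept"),
]

def subjects(triples, predicate, obj):
    """Return every subject whose triple has exactly this predicate and object."""
    return [s for s, p, o in triples if p == predicate and o == obj]

print(subjects(TRIPLES, RDFS + "subClassOf", SKOS + "Concept"))  # one match
print(subjects(TRIPLES, RDF + "subClassOf", SKOS + "Concept"))   # no match, no error
```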
But it gets nothing. @ColmMassey, do you understand why this doesn't work?
For a start, if I run
prefix skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?s ?o
WHERE
{
?s <http://www.w3.org/2004/02/skos/core#prefLabel> ?o
FILTER (lang(?o) = "EN")
}
ORDER BY ?o
on a graph I get directly from
https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities
I get what I expect
('https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a14', 'Administration and management, tourism, rentals')
('https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a01', 'Agriculture and environment')
('https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a18', 'Arts, culture, recreation and sports')
('https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a06', 'Construction, public works and refurbishing')
('https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a03', 'Craftmanship and manufacturing')
('https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a16', 'Education and training')
('https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a04', 'Energy production and distribution')
('https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a11', 'Financial, insurance and related activities')
('https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a12', 'Habitat and housing')
('https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a09', 'Hospitality and food service activities')
('https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a20', 'Household activities, self-production, domestic work')
('https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a10', 'Information, communication and technologies')
('https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a21', 'International diplomacy and cooperation')
('https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a19', 'Membership activities, repairing and wellness')
('https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a02', 'Mining and quarrying')
('https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a13', 'Professional, scientific and technical activities')
('https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a15', 'Public administration and social security')
('https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a05', 'Recycling, waste treatment, water cycle and ecological restoration')
('https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a17', 'Social services, health and employment')
('https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a07', 'Trade and distribution')
('https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a08', 'Transport, logistics and storage')
But calling the same query on the Virtuoso server web front end returns nothing, so I think the RDF may not be being loaded completely?
Now it also includes
* activities.skos * activities-ica.sk
I did a string match on the entire dev triple store for the string "Administration and management, tourism" and found nothing, so I think the activities skos is not being loaded yet.
Hmm. You may be right, however, I can't see why not. I have re-imported to check it's not been overwritten by the sausage machine, and I still can't find that label immediately after.
Saying that, if I query:
select * where { ?s ?p ?o. } # with default graph https://dev.lod.coop/ica
And then search the resulting HTML table in the browser, I don't find some labels in organisational-structure.skos either ("OS10" is missing; "not-for-profit" is not there except in descriptions; nor is "community group").
I have to go now, will look again later, but this is quite frustrating.
Poking around in the Virtuoso forum / docs, I think we are doing the right thing when bulk loading these extra .skos files, but we are not checking for errors.
This is the documentation on bulk loading:
http://vos.openlinksw.com/owiki/wiki/VOS/VirtBulkRDFLoader#Checking%20bulk%20load%20status
This thread describes the suggested way to bulk load data. Notice the checks:
https://community.openlinksw.com/t/virtuoso-bulk-loader-and-database-problem/672/4
The immediate thing I'm trying is:
check the DB.DBA.load_list to confirm all data sets were loaded successfully. This is indicated by an ll_state value of 2 and an ll_error value of NULL...
select * from DB.DBA.LOAD_LIST where ll_error IS NOT NULL
However, when I queried for results there were so many that I had to kill the query (there are 80,000ish errors). After some more digging and RTFM (Virtuoso's SQL grammar is not like MySQL's or PostgreSQL's in that it doesn't support LIMIT, and the detail that you use TOP instead is buried so deeply that it took me some time to find), I have managed to query the errors from the bulk import I made the other day. These are from one import on the ICA dataset:
SQL> select top 100 * from DB.DBA.load_list where ll_started > stringdate('2021.2.12 15:00.00') and ll_graph like '%ica';
ll_file ll_graph ll_state ll_started ll_done ll_host ll_work_time ll_error
VARCHAR NOT NULL VARCHAR INTEGER TIMESTAMP TIMESTAMP INTEGER INTEGER VARCHAR
_______________________________________________________________________________
/var/tmp/virtuoso/BulkLoading/2021212182444//activities-ica.skos https://dev.lod.coop/ica 2 2021.2.12 18:24.50 0 2021.2.12 18:24.50 0 0 NULL 37000 SP029: TURTLE RDF loader, line 3: syntax error
/var/tmp/virtuoso/BulkLoading/2021212182444//activities-modified.skos https://dev.lod.coop/ica 2 2021.2.12 18:24.50 0 2021.2.12 18:24.50 0 0 NULL 37000 SP029: TURTLE RDF loader, line 3: syntax error
/var/tmp/virtuoso/BulkLoading/2021212182444//activities.skos https://dev.lod.coop/ica 2 2021.2.12 18:24.50 0 2021.2.12 18:24.50 0 0 NULL 37000 SP029: TURTLE RDF loader, line 4: syntax error
/var/tmp/virtuoso/BulkLoading/2021212182444//all.rdf https://dev.lod.coop/ica 1 2021.2.12 18:24.50 0 NULL 0 NULL NULL
/var/tmp/virtuoso/BulkLoading/2021212182444//base-membership-type.skos https://dev.lod.coop/ica 2 2021.2.12 18:24.50 0 2021.2.12 18:24.51 0 0 NULL 37000 SP029: TURTLE RDF loader, line 3: syntax error
/var/tmp/virtuoso/BulkLoading/2021212182444//essglobal_vocab.rdf https://dev.lod.coop/ica 1 2021.2.12 18:24.50 0 NULL 0 NULL NULL
/var/tmp/virtuoso/BulkLoading/2021212182444//organisational-structure.skos https://dev.lod.coop/ica 2 2021.2.12 18:24.50 0 2021.2.12 18:24.51 0 0 NULL 37000 SP029: TURTLE RDF loader, line 3: syntax error
/var/tmp/virtuoso/BulkLoading/2021212182444//qualifiers.skos https://dev.lod.coop/ica 2 2021.2.12 18:24.50 0 2021.2.12 18:24.51 0 0 NULL 37000 SP029: TURTLE RDF loader, line 2: syntax error
8 Rows. -- 68 msec.
So the only files which seem to load without error are all.rdf and essglobal_vocab.rdf (both show a NULL ll_error in the listing above).
And what this implies is that the organisational-structure.skos file hasn't been importing due to syntax errors for who knows how long (and I'd guess this is why there are so many results).
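To make this check routine rather than something we do by eye, the rows from DB.DBA.load_list could be sorted into buckets after each import. The sketch below is Python over rows that have already been fetched (how they get fetched is left out); the LoadRow type and summarise function are hypothetical names, and the three sample rows are abridged from the listing above.

```python
from typing import NamedTuple, Optional

class LoadRow(NamedTuple):
    ll_file: str
    ll_state: int            # 1 = incomplete, 2 = complete (possibly with errors)
    ll_error: Optional[str]  # None stands for SQL NULL

def summarise(rows):
    """Group file names into 'ok', 'failed' and 'incomplete'."""
    report = {"ok": [], "failed": [], "incomplete": []}
    for row in rows:
        if row.ll_error is not None:
            report["failed"].append(row.ll_file)
        elif row.ll_state == 2:
            report["ok"].append(row.ll_file)
        else:
            report["incomplete"].append(row.ll_file)
    return report

rows = [
    LoadRow("activities.skos", 2, "37000 SP029: TURTLE RDF loader, line 4: syntax error"),
    LoadRow("all.rdf", 1, None),
    LoadRow("essglobal_vocab.rdf", 1, None),
]
print(summarise(rows))
```

An import script could then fail loudly whenever the 'failed' bucket is non-empty.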
Checking organisational-structure.skos for errors by translating it to TTL using rdfpipe (as I did for activities.skos) finds nothing - the translation seems to be successful. As does validating it here:
So I suspect this must be something specific to Virtuoso and/or SKOS... and luckily my hunch that it's just the file suffix turns out to be right: if I change the sausage-machine uploader to write .rdf files instead of .skos files, they import without error.
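The "TURTLE RDF loader" wording in the errors suggests the .skos files were being handed to a Turtle parser rather than an RDF/XML one, which would fit the suffix being the trigger, though that is my inference from the log rather than something I've confirmed in the docs. The real change lives in the sausage machine's Ruby uploader; the Python sketch below only illustrates the renaming step, and stage_as_rdf is a made-up name.

```python
import shutil
import tempfile
from pathlib import Path

def stage_as_rdf(source_dir, staging_dir):
    """Copy every .skos file into staging_dir under an .rdf name; return the new paths."""
    staging = Path(staging_dir)
    staging.mkdir(parents=True, exist_ok=True)
    staged = []
    for src in sorted(Path(source_dir).glob("*.skos")):
        dest = staging / src.with_suffix(".rdf").name
        shutil.copyfile(src, dest)  # content is untouched, only the suffix changes
        staged.append(dest)
    return staged

# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "vocabs"
    source.mkdir()
    (source / "activities-ica.skos").write_text("<rdf/>")
    print([p.name for p in stage_as_rdf(source, Path(tmp) / "BulkLoading")])
```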
And bingo - this query now gets the results I expect:
SELECT DISTINCT *
FROM <https://dev.lod.coop/ica>
WHERE {
?x a skos:Concept .
?x skos:inScheme <https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/> .
}
Specifically (as CSV):
"x"
"https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a01"
"https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a02"
"https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a03"
"https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a04"
"https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a05"
"https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a06"
"https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a07"
"https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a08"
"https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a09"
"https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a10"
"https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a11"
"https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a12"
"https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a13"
"https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a14"
"https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a15"
"https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a16"
"https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a17"
"https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a18"
"https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a19"
"https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a20"
"https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a21"
And in JSON:
{ "head": { "link": [], "vars": ["x"] },
"results": { "distinct": false, "ordered": true, "bindings": [
{ "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a01" }},
{ "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a02" }},
{ "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a03" }},
{ "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a04" }},
{ "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a05" }},
{ "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a06" }},
{ "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a07" }},
{ "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a08" }},
{ "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a09" }},
{ "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a10" }},
{ "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a11" }},
{ "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a12" }},
{ "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a13" }},
{ "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a14" }},
{ "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a15" }},
{ "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a16" }},
{ "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a17" }},
{ "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a18" }},
{ "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a19" }},
{ "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a20" }},
{ "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a21" }} ] } }
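For whoever consumes that JSON, the interesting part is results.bindings: one object per row, keyed by variable name, each holding a type and a value. A short Python sketch of unpacking it follows (the JavaScript would be much the same). The binding_values helper is a made-up name and the sample is an abridged copy of the response above.

```python
import json

def binding_values(results_json, var):
    """Return the values bound to `var`, one per result row that binds it."""
    doc = json.loads(results_json)
    return [row[var]["value"] for row in doc["results"]["bindings"] if var in row]

# Abridged copy of the response above.
SAMPLE = '''{ "head": { "link": [], "vars": ["x"] },
  "results": { "distinct": false, "ordered": true, "bindings": [
    { "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a01" }},
    { "x": { "type": "uri", "value": "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities/a02" }} ] } }'''

# Keep just the final path segment of each URI as a short term id.
ids = [uri.rsplit("/", 1)[-1] for uri in binding_values(SAMPLE, "x")]
print(ids)
```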
I'm still a bit puzzled that some of these files seem to have ll_state 1 (incomplete) rather than 2 (complete, possibly with errors), which suggests the import isn't entirely done. My import of organisational-structure.rdf had an ll_state of 1 (incomplete), and yet if I use the query above adapted for the right vocab, the correct list of 25 terms is evidently now present in the database.
SELECT DISTINCT ?term ?label
FROM <https://dev.lod.coop/ica>
WHERE {
?term a skos:Concept .
?term skos:inScheme <https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/activities-ica/> .
?term skos:prefLabel ?label
FILTER (lang(?label) = "en")
}
then pulls in the labels and selects for a language.
Here's how we merge term lists using UNION in SPARQL:
SELECT DISTINCT ?term ?label
FROM <https://dev.lod.coop/ica>
WHERE {
{
?term a skos:Concept .
?term skos:inScheme <https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/organisational-structure/> .
?term skos:prefLabel ?label
FILTER (lang(?label) = "en")
}
UNION
{
<https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/qualifiers/q09> skos:prefLabel ?label .
?term skos:prefLabel ?label
FILTER (lang(?label) = "en")
}
}
but I still think it is better to construct ICA-specific vocabs by combining terms from standard vocabs, when we can.
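If we do end up merging whole schemes at query time, the UNION branches are regular enough to generate. The Python sketch below builds such a query from a list of scheme URIs; union_query is a hypothetical helper, it does no escaping of its inputs, and it only covers the whole-scheme case, not the single-term branch used for q09 above.

```python
def union_query(graph, schemes, language):
    """Build a SPARQL query listing terms and labels from several SKOS schemes."""
    if not schemes:
        raise ValueError("at least one scheme URI is required")
    branches = []
    for scheme in schemes:
        branches.append(
            "  {\n"
            "    ?term a skos:Concept .\n"
            f"    ?term skos:inScheme <{scheme}> .\n"
            "    ?term skos:prefLabel ?label\n"
            f'    FILTER (lang(?label) = "{language}")\n'
            "  }"
        )
    body = "\n  UNION\n".join(branches)
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT DISTINCT ?term ?label\n"
        f"FROM <{graph}>\n"
        "WHERE {\n" + body + "\n}"
    )

STANDARD = "https://w3id.solidarityeconomy.coop/essglobal/V2a/standard/"
print(union_query(
    "https://dev.lod.coop/ica",
    [STANDARD + "organisational-structure/", STANDARD + "activities-ica/"],
    "en",
))
```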
I've been having trouble knowing what parts of SPARQL Virtuoso supports, so I asked a question on their forum:
[Edit] More specifically, I'd like to know:
?term skos:inScheme <whatever> constraints. Possible answer here seems to use 1.1 features like VALUES, which Virtuoso 6.1 doesn't seem to support.
BIND lang(?label) as ?lang might do the job if it were supported, but again it seems not to be.
I plan to add questions for these too, and link to them here.
Here's a question about the language field:
I should add here for later that, on examination, I can see that localisation/languages in SPARQL could be a bit of a rabbit hole. For example:
So:
"Dog"@en is localised, as is "Chien"@fr, but "Pooch" is not localised.
SELECT * WHERE { ?whatever skos:prefLabel ?label . FILTER (lang(?label) = "EN")} won't find "Pooch".
FILTER (lang(?label) = "EN") will find "Dog"@en but not "Mutt"@en-gb.
FILTER (langMatches(lang(?label), "en")) will find both "Dog"@en and "Mutt"@en-gb, but not "Pooch".
OPTIONAL). COALESCE may help here.
All in all, if the data is well controlled we might not have to worry about this. It's only if we have nonuniform data that we might find it a bit of a headache to get a query which pragmatically returns a sensible set of strings.
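The matching rules are small enough to spell out. The Python sketch below mimics langMatches as I understand it (basic filtering from RFC 4647: case-insensitive, "*" matches any tagged literal, otherwise the tag must equal the range or extend it with a "-" subtag). The pick_label fallback to an untagged literal is just one possible policy for nonuniform data, not something we've agreed on.

```python
from typing import Optional

def lang_matches(tag: Optional[str], lang_range: str) -> bool:
    """Mimic SPARQL's langMatches(lang(?label), lang_range)."""
    if not tag:
        return False  # untagged literals never match, not even "*"
    tag, lang_range = tag.lower(), lang_range.lower()
    if lang_range == "*":
        return True
    return tag == lang_range or tag.startswith(lang_range + "-")

def pick_label(labels, lang_range):
    """Prefer a label matching lang_range; fall back to an untagged one, else None."""
    for text, tag in labels:
        if lang_matches(tag, lang_range):
            return text
    for text, tag in labels:
        if not tag:
            return text
    return None

labels = [("Chien", "fr"), ("Mutt", "en-gb"), ("Pooch", None)]
print(pick_label(labels, "en"))  # a regional English label is acceptable
print(pick_label(labels, "de"))  # no German label, so the untagged one is used
```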
Even the list of allowed language identifiers is a bit opaque - searching for one I got referred to BCP47 but it doesn't have a simple list. In fact the syntax seems to be extensible and by implication, non-finite... There's a registry, but it isn't referenced directly. Googling for "iana subtag registry" finds this:
https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry
But also this, which provides the data in JSON, rather than in record-jar format, which is "hard to parse".
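For what it's worth, the record-jar layout is manageable if we only need the basics: records are separated by a line containing %%, each field is "Name: value", and a line starting with whitespace continues the previous value. Here is a Python sketch under those assumptions; the sample is written in the registry's style but its values are illustrative, not copied from the live file.

```python
# Illustrative sample in the registry's layout; not real registry content.
SAMPLE = """\
File-Date: 2021-01-01
%%
Type: language
Subtag: en
Description: English
%%
Type: language
Subtag: fr
Description: French
Comments: an illustrative comment
  folded onto a second line
%%
Type: region
Subtag: GB
Description: United Kingdom
"""

def parse_record_jar(text):
    """Split record-jar text into records, each a dict of field name -> list of values."""
    records, current, last = [], {}, None
    for line in text.splitlines():
        if line.strip() == "%%":  # record separator
            if current:
                records.append(current)
            current, last = {}, None
        elif line[:1] in (" ", "\t") and last is not None:
            # Folded continuation of the previous field's value.
            current[last][-1] += " " + line.strip()
        elif ":" in line:
            name, _, body = line.partition(":")
            last = name.strip()
            # Fields may repeat within a record, so values are kept in a list.
            current.setdefault(last, []).append(body.strip())
    if current:
        records.append(current)
    return records

records = parse_record_jar(SAMPLE)
languages = [r["Subtag"][0] for r in records if r.get("Type") == ["language"]]
print(languages)
```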
Reading the Wikipedia page on the matter is enlightening, if only by hinting at the complexity of the standard.
All in all, if the data is well controlled we might not have to worry about this.
We will need to insist that all literals specify the language and that any language supported needs to be fully supported. I can see this could be a pain as we incrementally add terms and need to wait for translations to be done on the new terms.
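A rule like that is easy to check mechanically before publishing a vocabulary. The Python sketch below reports which required languages each term is missing. The a01 labels are taken from the Turtle snippet earlier in this thread, while the a99 term, the function name and the input shape (term mapped to a dict of language to label) are assumptions made for illustration.

```python
def missing_translations(terms, required_languages):
    """Return {term: sorted list of missing languages} for incompletely translated terms."""
    required = {lang.lower() for lang in required_languages}
    gaps = {}
    for term, labels in terms.items():
        # Untagged literals are stored under None and do not count for any language.
        present = {lang.lower() for lang in labels if lang}
        missing = sorted(required - present)
        if missing:
            gaps[term] = missing
    return gaps

terms = {
    "a01": {
        "EN": "Agriculture and environment",
        "ES": "Agricultura y medio ambiente",
        "FR": "Agriculture et environnement",
        "PT": "Agricultura e ambiente",
    },
    # Hypothetical newly added term that is still waiting for translations.
    "a99": {"EN": "Placeholder new term", None: "Untagged label"},
}
print(missing_translations(terms, ["en", "es", "fr", "pt"]))
```

Running this as part of the import would at least make the wait for translations visible, rather than leaving filters with silently missing options.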
@wu-lee closing this one
For Super region, Region, Typology, Structure and Economic activity, write the SPARQL queries which return a list of the filter options available for these fields. (Region & Super region need to be implemented.)