ashleysommer closed this 4 years ago
Use a header that looks like this:
@prefix loci: <http://linked.data.gov.au/def/loci#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix void: <http://rdfs.org/ns/void#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix vcard: <http://www.w3.org/2006/vcard/ns#> .
@prefix : <http://linked.data.gov.au/dataset/addrcatch/statement/> .
@prefix l: <http://linked.data.gov.au/dataset/addrcatch> .
@prefix s: <http://www.w3.org/1999/02/22-rdf-syntax-ns#subject> .
@prefix p: <http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate> .
@prefix o: <http://www.w3.org/1999/02/22-rdf-syntax-ns#object> .
@prefix m: <http://linked.data.gov.au/def/loci/hadGenerationMethod> .
@prefix g: <http://linked.data.gov.au/dataset/gnaf/address/> .
@prefix w: <http://www.opengis.net/ont/geosparql#sfWithin> .
@prefix i: <http://purl.org/dc/terms/isPartOf> .
@prefix c: <http://linked.data.gov.au/dataset/geofabric/contractedcatchment/> .
@prefix si: <http://linked.data.gov.au/dataset/addrcatch/SpatialIntersection> .
l: a loci:Linkset ;
dct:title "Addresses Contracted-Catchments Linkset" ;
dct:description """This LOC-I Project Linkset relates Address individuals in the G-NAF LOC-I Dataset to Contracted Catchment individuals in the Geofabric LOC-I Dataset. Every Address -> Catchment relation is geosparql:sfWithin, that is the Address is sfWithin the Catchment.
The Linkset triples (Address sfWithin Catchment) are reified so that each triple is contained within an RDF Statement class instance so that the triple is numbered and the method used to generate the triple is given by the loci:hadGenerationMethod.
The method used for all triples in this Linkset is the same and it is SpatialIntersection which is defined below.
The triples for the main data in this linkset - the Statements relating Addresses to Catchments - are valid RDF in the Turtle syntax but an unusual namespacing arrangement is used so all terms are indicated with as few letters as possible, mostly one letter then colon, e.g. s: for http://www.w3.org/1999/02/22-rdf-syntax-ns#subject, rather than the more common rdf:subject. This is simply to reduce file size."""@en ;
dct:publisher <http://catalogue.linked.data.gov.au/org/psma> ;
dcat:contactPoint _:jo ;
dct:issued "2019-01-30"^^xsd:date ;
dct:modified "2019-01-30"^^xsd:date ;
dct:contributor <http://orcid.org/0000-0002-8742-7730> , <http://orcid.org/0000-0003-0590-0131> ;
void:subjectsTarget <http://linked.data.gov.au/dataset/gnaf> ;
void:objectsTarget <http://linked.data.gov.au/dataset/geofabric> ;
void:linkPredicate w: ;
m: si: .
_:jo a vcard:Individual ;
vcard:fn "Joseph Abhayaratna" ;
vcard:hasEmail <mailto:joseph.abhayaratna@psma.com.au> .
si: a prov:Plan ;
rdfs:label "Spatial Intersection Method" ;
rdfs:comment "This method uses the G-NAF LDAPI to page through the register, obtain the GeoSPARQL geometry for the address point, and then uses a OGC Simple Features Contains filter on the GeoFabric WFS Service"@en ;
prov:value <https://github.com/jabhay/linkset_creator> ;
prov:wasAttributedTo _:jo ;
prov:generatedAtTime "2019-01-30"^^xsd:date .
#
# Statements
#
And a header that looks like this for the GNAF_2016_05 to CC linkset:
@prefix loci: <http://linked.data.gov.au/def/loci#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix void: <http://rdfs.org/ns/void#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix vcard: <http://www.w3.org/2006/vcard/ns#> .
@prefix : <http://linked.data.gov.au/dataset/addr201605catch/statement/> .
@prefix l: <http://linked.data.gov.au/dataset/addr201605catch> .
@prefix s: <http://www.w3.org/1999/02/22-rdf-syntax-ns#subject> .
@prefix p: <http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate> .
@prefix o: <http://www.w3.org/1999/02/22-rdf-syntax-ns#object> .
@prefix m: <http://linked.data.gov.au/def/loci/hadGenerationMethod> .
@prefix g: <http://linked.data.gov.au/dataset/gnaf-2016-05/address/> .
@prefix w: <http://www.opengis.net/ont/geosparql#sfWithin> .
@prefix i: <http://purl.org/dc/terms/isPartOf> .
@prefix c: <http://linked.data.gov.au/dataset/geofabric/contractedcatchment/> .
@prefix si: <http://linked.data.gov.au/dataset/addr201605catch/SpatialIntersection> .
l: a loci:Linkset ;
dct:title "Addresses Contracted-Catchments Linkset" ;
dct:description """This LOC-I Project Linkset relates Address individuals in the G-NAF LOC-I Dataset to Contracted Catchment individuals in the Geofabric LOC-I Dataset. Every Address -> Catchment relation is geosparql:sfWithin, that is the Address is sfWithin the Catchment.
The Linkset triples (Address sfWithin Catchment) are reified so that each triple is contained within an RDF Statement class instance so that the triple is numbered and the method used to generate the triple is given by the loci:hadGenerationMethod.
The method used for all triples in this Linkset is the same and it is SpatialIntersection which is defined below.
The triples for the main data in this linkset - the Statements relating Addresses to Catchments - are valid RDF in the Turtle syntax but an unusual namespacing arrangement is used so all terms are indicated with as few letters as possible, mostly one letter then colon, e.g. s: for http://www.w3.org/1999/02/22-rdf-syntax-ns#subject, rather than the more common rdf:subject. This is simply to reduce file size."""@en ;
dct:publisher <http://catalogue.linked.data.gov.au/org/psma> ;
dcat:contactPoint _:jo ;
dct:issued "2019-01-30"^^xsd:date ;
dct:modified "2019-01-30"^^xsd:date ;
dct:contributor <http://orcid.org/0000-0002-8742-7730> , <http://orcid.org/0000-0003-0590-0131> ;
void:subjectsTarget <http://linked.data.gov.au/dataset/gnaf-2016-05> ;
void:objectsTarget <http://linked.data.gov.au/dataset/geofabric> ;
void:linkPredicate w: ;
m: si: .
_:jo a vcard:Individual ;
vcard:fn "Joseph Abhayaratna" ;
vcard:hasEmail <mailto:joseph.abhayaratna@psma.com.au> .
si: a prov:Plan ;
rdfs:label "Spatial Intersection Method" ;
rdfs:comment "This method uses the G-NAF LDAPI to page through the register, obtain the GeoSPARQL geometry for the address point, and then uses a OGC Simple Features Contains filter on the GeoFabric WFS Service"@en ;
prov:value <https://github.com/jabhay/linkset_creator> ;
prov:wasAttributedTo _:jo ;
prov:generatedAtTime "2019-01-30"^^xsd:date .
#
# Statements
#
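The two headers above differ only in the dataset-specific IRIs: the `:`, `l:`, `g:` and `si:` prefixes, plus the `void:subjectsTarget` value. A minimal Python sketch of how an adapted script might parameterise those lines follows. The function names `dataset_prefixes` and `subjects_target` and their arguments are hypothetical, not taken from the existing script.

```python
from typing import List

# Base IRI shared by every dataset referenced in the headers above.
BASE = "http://linked.data.gov.au/dataset/"


def dataset_prefixes(linkset_slug: str, gnaf_slug: str) -> List[str]:
    """Return the @prefix lines that vary between the two headers.

    linkset_slug: e.g. "addrcatch" or "addr201605catch"
    gnaf_slug:    e.g. "gnaf" or "gnaf-2016-05"
    (Hypothetical helper; the remaining header lines are identical.)
    """
    linkset = BASE + linkset_slug
    return [
        f"@prefix : <{linkset}/statement/> .",
        f"@prefix l: <{linkset}> .",
        f"@prefix g: <{BASE}{gnaf_slug}/address/> .",
        f"@prefix si: <{linkset}/SpatialIntersection> .",
    ]


def subjects_target(gnaf_slug: str) -> str:
    """Return the void:subjectsTarget line for the chosen G-NAF release."""
    return f"void:subjectsTarget <{BASE}{gnaf_slug}> ;"


if __name__ == "__main__":
    for line in dataset_prefixes("addr201605catch", "gnaf-2016-05"):
        print(line)
    print(subjects_target("gnaf-2016-05"))
```

Keeping the varying lines in one place means the rest of the header can stay a fixed template shared by both linksets.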
Source CSVs can be found here: https://s3.console.aws.amazon.com/s3/buckets/loci-assets/source-data/gnaf201605-address-geofabric-cc-linkset_source/?region=ap-southeast-2&tab=overview
And pre-built linksets can be found here (for reference): https://s3.console.aws.amazon.com/s3/buckets/loci-assets/linksets/?region=ap-southeast-2&tab=overview
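For orientation, here is a rough Python sketch of how rows from a source CSV could be turned into the numbered, reified statements that follow the header. It rests on assumptions: the CSV layout (two columns, address ID then catchment ID, no header row) and the function name `statement_lines` are hypothetical, and the real linkset files may carry further properties per statement, so check a pre-built linkset before relying on this shape.

```python
import csv
import io
from typing import Iterable, Iterator, Sequence


def statement_lines(rows: Iterable[Sequence[str]], start: int = 1) -> Iterator[str]:
    """Yield one compact Turtle statement per (address_id, catchment_id) row.

    Uses the one-letter prefixes from the header: s:/p:/o: for
    rdf:subject/predicate/object, w: for geosparql:sfWithin, and
    m: si: for the generation method. (Assumed output shape.)
    """
    for n, (address_id, catchment_id) in enumerate(rows, start=start):
        yield f":{n} s: g:{address_id} ; p: w: ; o: c:{catchment_id} ; m: si: ."


if __name__ == "__main__":
    # Made-up IDs purely for illustration; not real G-NAF or Geofabric IDs.
    sample = io.StringIO("ADDR1,100\nADDR2,200\n")
    for line in statement_lines(csv.reader(sample)):
        print(line)
```

Writing the statements as a generator keeps memory use flat, which matters when the source CSVs contain millions of address rows.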
I moved the source files to https://s3.console.aws.amazon.com/s3/buckets/loci-assets/source-data/gnaf-geofab-linkset-source because of a naming inconsistency: the old folder held both the 1811 and 1605 versions but was labelled 1605.
I have integrated this script into a Docker-based workflow consistent with the other linkset creation tools.
Adapt this script: