Informasjonsforvaltning / datacatalogtordf

A library that maps a data catalog (incl. datasets, data services and other DCAT resources) to RDF
Apache License 2.0

Triples in distribution not added to dataset #6

Closed mikaello closed 3 years ago

mikaello commented 4 years ago

When creating a distribution, I set several properties on it, e.g.:

distribution = Distribution()
distribution.identifier = URI("http://example.com")
distribution.title = {"nb": "title"}
distribution.access_URL = URI("http://access.com")
distribution.formats.append(URI("https://www.iana.org/assignments/media-types/application/json"))

When I print this with to_rdf(), everything seems fine; all triples are included:

print(distribution.to_rdf())

# b'@prefix dcat: <http://www.w3.org/ns/dcat#> .\n@prefix dct: <http://purl.org/dc/terms/> .\n\n<http://example.com> a dcat:Distribution ;\n    dct:format <https://www.iana.org/assignments/media-types/application/json> ;\n    dct:title "title"@nb ;\n    dcat:accessURL <http://access.com> .\n\n'

But when I append this to a dataset, all triples except the identifier disappear:

dataset = Dataset()
dataset.title = {"nb": "test"}
dataset.identifier = URI("http://id.com")
dataset.description = {"nb": "Ingen beskrivelse"}
dataset.publisher = URI("http://publisher.com")
dataset.distributions.append(distribution)

print(dataset.to_rdf())

# b'@prefix dcat: <http://www.w3.org/ns/dcat#> .\n@prefix dct: <http://purl.org/dc/terms/> .\n\n<http://id.com> a dcat:Dataset ;\n    dct:description "Ingen beskrivelse"@nb ;\n    dct:publisher <http://publisher.com> ;\n    dct:title "test"@nb ;\n    dcat:distribution <http://example.com> .\n\n'

Have I misunderstood something, or is this a bug?

stigbd commented 4 years ago

I think your expectation is to get something like this:

    @prefix dct: <http://purl.org/dc/terms/> .
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix dcat: <http://www.w3.org/ns/dcat#> .
    @prefix prov: <http://www.w3.org/ns/prov#> .

    <http://example.com/datasets/1> a dcat:Dataset ;
        dcat:distribution   <http://example.com/distributions/1>,
                            <http://example.com/distributions/2>
        .
    <http://example.com/distributions/1> a dcat:Distribution ;
        dct:title   "API-distribution"@en, "API-distribusjon"@nb
        .
    <http://example.com/distributions/2> a dcat:Distribution ;
        dct:title   "Another API-distribution"@en, "En annen API-distribusjon"@nb
        .

Or

    @prefix dct: <http://purl.org/dc/terms/> .
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix dcat: <http://www.w3.org/ns/dcat#> .
    @prefix prov: <http://www.w3.org/ns/prov#> .

    <http://example.com/datasets/1> a dcat:Dataset ;
        dcat:distribution   [ a dcat:Distribution ;
                                     dct:title   "API-distribution"@en, "API-distribusjon"@nb
                                    ] ,
                            [ a dcat:Distribution ;
                                     dct:title   "Another API-distribution"@en, "En annen API-distribusjon"@nb
                                    ]
        .

The first approach conforms to https://www.w3.org/TR/vocab-dcat-2/#ex-dataset. To get that output, you will have to serialize each distribution yourself by adding the following lines:

for d in dataset.distributions:
    print(d.to_rdf())

Alternatively, I could add an option to dataset.to_rdf() for including the distributions, analogous to the one described at https://datacatalogtordf.readthedocs.io/en/latest/reference.html#datacatalogtordf.catalog.Catalog.to_rdf
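
Until such an option exists, the separate outputs can be stitched into one Turtle document by hand. Below is a minimal sketch of that idea using only the standard library. The helper name merge_turtle is hypothetical and not part of datacatalogtordf, and the two byte strings are the outputs quoted in the report above. It is a naive textual merge that assumes every document binds each prefix to the same namespace; an RDF library would be the more robust choice for anything beyond this simple case.

```python
def merge_turtle(*documents: bytes) -> str:
    """Naively merge Turtle documents (hypothetical helper, not a library API).

    Unique @prefix lines are hoisted to the top, followed by the statements
    of each document in the order given. Assumes consistent prefix bindings.
    """
    prefixes = []
    bodies = []
    for document in documents:
        body_lines = []
        for line in document.decode("utf-8").splitlines():
            if line.startswith("@prefix"):
                # Keep each prefix declaration only once.
                if line not in prefixes:
                    prefixes.append(line)
            else:
                body_lines.append(line)
        body = "\n".join(body_lines).strip()
        if body:
            bodies.append(body)
    return "\n".join(prefixes) + "\n\n" + "\n\n".join(bodies) + "\n"


# Output of dataset.to_rdf(), copied from the report above.
dataset_rdf = (
    b'@prefix dcat: <http://www.w3.org/ns/dcat#> .\n'
    b'@prefix dct: <http://purl.org/dc/terms/> .\n\n'
    b'<http://id.com> a dcat:Dataset ;\n'
    b'    dct:description "Ingen beskrivelse"@nb ;\n'
    b'    dct:publisher <http://publisher.com> ;\n'
    b'    dct:title "test"@nb ;\n'
    b'    dcat:distribution <http://example.com> .\n\n'
)

# Output of distribution.to_rdf(), copied from the report above.
distribution_rdf = (
    b'@prefix dcat: <http://www.w3.org/ns/dcat#> .\n'
    b'@prefix dct: <http://purl.org/dc/terms/> .\n\n'
    b'<http://example.com> a dcat:Distribution ;\n'
    b'    dct:format <https://www.iana.org/assignments/media-types/application/json> ;\n'
    b'    dct:title "title"@nb ;\n'
    b'    dcat:accessURL <http://access.com> .\n\n'
)

merged = merge_turtle(dataset_rdf, distribution_rdf)
print(merged)
```

With the real objects, the call would presumably look like merge_turtle(dataset.to_rdf(), *(d.to_rdf() for d in dataset.distributions)), given that to_rdf() returns bytes as shown in the report.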