GSS-Cogs / family-trade


DIT-UK-tariffs #114

Open Shannon95 opened 3 years ago

Shannon95 commented 3 years ago

https://github.com/GSS-Cogs/family-trade/tree/master/datasets/DIT-UK-tariffs

rossbowen commented 3 years ago

@AndrewtGSScogs

I think this is an alright method for getting column metadata in place. Tested it and it appears in the .ttl output. The other columns will need to be populated.

https://github.com/GSS-Cogs/family-trade/blob/master/datasets/DIT-UK-tariffs/observations.csv-metadata.json#L89-L94

    "rdfs:seeAlso": [
        {
            "@id": "http://gss-data.org.uk/data/gss_data/trade/DIT-UK-Tarriffs/commodity",
            "rdfs:label": "Commodity",
            "rdfs:description": "Commodity Name"
        },

AndrewtGSScogs commented 3 years ago

ahhh amazing!

thanks v much! I'll sort that out today!

Cheers, A

AndrewtGSScogs commented 3 years ago

Have uploaded this to PMD4 as a draft. The concepts appear to be working, unlike before, and I have dealt with the string/decimal issue.

Just needs a review to check whether anything further is required, please.

Cheers, A

AndrewtGSScogs commented 3 years ago

Although this was displaying correctly initially, a change has caused it to fail with a 500 error.

Will investigate and rebuild to see if that fixes the issue.

Robsteranium commented 3 years ago

I think the 500 was due to having multiple labels for some columns (possibly because there were two copies of this dataset). An as yet undeployed fix to PMD ought to resolve these issues.

Robsteranium commented 3 years ago

Great to see this reference data loaded. Thanks @AndrewtGSScogs @Shannon95.

In terms of the data:

We might want to begin thinking about how we will handle changes to the value of the tariffs themselves. It would be nice to keep track of historical values (e.g. adding an application from/to date to each tariff). It's important to think about this now (even if the upstream data is modified in place) as it could affect the URI scheme.
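To make the point about the URI scheme concrete, here is a minimal sketch of one way a tariff value could carry "applies from/to" dates and still get a stable URI. The base URI, the class name `TariffVersion`, and the choice to put the start date in the path are all assumptions for illustration, not a scheme agreed in this thread.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical base URI, used only for illustration.
BASE = "http://gss-data.org.uk/data/trade/tariffs-on-goods-imported-into-the-uk"


@dataclass(frozen=True)
class TariffVersion:
    commodity_code: str
    rate: str                          # kept as text, as published upstream
    applies_from: date
    applies_to: Optional[date] = None  # None means "still in force"

    def uri(self) -> str:
        # Putting the start date in the path keeps each historical value
        # addressable even if the upstream data is modified in place.
        return f"{BASE}/{self.commodity_code}/{self.applies_from.isoformat()}"

    def in_force_on(self, day: date) -> bool:
        # A version applies from its start date up to and including its end date.
        if day < self.applies_from:
            return False
        return self.applies_to is None or day <= self.applies_to
```

With a scheme like this, a changed tariff becomes a new resource with a new start date, and the previous one only gains an `applies_to` value, so existing URIs never change meaning.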

In terms of the UI/UX (cc @RicSwirrl re: the resource view of this tariffs ref data):

An obvious application of this data would be to compare with e.g. OTS to estimate the impact of tariff changes on trade flows (you could imagine a per commodity elasticity measure or a Brexit impact analysis). We could look at providing this analysis using SPARQL to extract the data.
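As a rough illustration of the per-commodity elasticity idea, the sketch below computes a midpoint (arc) elasticity of a trade flow with respect to a tariff rate. The function name and the choice of the midpoint formula are assumptions made here rather than anything decided in this thread. The inputs would come from the tariffs data and OTS once the two can be joined.

```python
def arc_elasticity(flow_before: float, flow_after: float,
                   tariff_before: float, tariff_after: float) -> float:
    """Midpoint elasticity of a trade flow with respect to a tariff rate."""
    flow_mid = (flow_before + flow_after) / 2
    tariff_mid = (tariff_before + tariff_after) / 2
    if flow_mid == 0 or tariff_mid == 0 or tariff_after == tariff_before:
        raise ValueError("elasticity is undefined for these inputs")
    # Percentage changes measured against the midpoint of each pair.
    pct_flow = (flow_after - flow_before) / flow_mid
    pct_tariff = (tariff_after - tariff_before) / tariff_mid
    return pct_flow / pct_tariff
```

A negative result means the flow fell as the tariff rose. The midpoint form is used so that the measure is symmetric in the direction of change, but other definitions would be equally defensible.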

Robsteranium commented 3 years ago

Spoke to Ric on Slack; he noted that CSV(W) download and filters for generic datasets are something we'd like to do eventually.

He also spotted that the page is offering views of the catalog metadata resource types when it shouldn't be (these are just for the catalogue, they're not supposed to be in the dataset). This appears to be due to problems with the metadata.

I've just run the pmd4 validation suite on this dataset (having downloaded it as ntriples):

clojure -M:pmd4:validate -e gss-data-org-uk-tariffs-on-goods-imported-into-the-uk/data.nt | less

This shows 6 failures.

The first two can be ignored. They're about skos:Concepts needing schemes and labels. As above, we should set the type of these resources to something else.

The two checks for the existence of dates fail because the dataset is declared in two graphs: http://gss-data.org.uk/graph/metadata/data/trade/tariffs-on-goods-imported-into-the-uk and http://gss-data.org.uk/graph/trade/tariffs-on-goods-imported-into-the-uk. The other failures (SELECT_DatasetContentsIsCorrectType.sparql and SELECT_RecordPrimaryTopicIsDataset.sparql) also arise because the dataset metadata isn't quite structured right.
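A quick way to spot the "declared in two graphs" condition outside the validation suite is sketched below. It is not the PMD check itself: it assumes the data is available as (subject, predicate, object, graph) tuples, and the `pmdcat` namespace URI and the example dataset URI are assumptions.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Assumed namespace for the pmdcat vocabulary.
PMDCAT_DATASET = "http://publishmydata.com/pmdcat#Dataset"


def declared_in_multiple_graphs(quads):
    """Return {dataset: graphs} for datasets typed in more than one graph."""
    found = defaultdict(set)
    for subject, predicate, obj, graph in quads:
        if predicate == RDF_TYPE and obj == PMDCAT_DATASET:
            found[subject].add(graph)
    return {s: graphs for s, graphs in found.items() if len(graphs) > 1}


# Example input: the two graph URIs are the ones named above; the dataset
# URI is illustrative.
G_META = "http://gss-data.org.uk/graph/metadata/data/trade/tariffs-on-goods-imported-into-the-uk"
G_DATA = "http://gss-data.org.uk/graph/trade/tariffs-on-goods-imported-into-the-uk"
DATASET = "http://gss-data.org.uk/data/trade/tariffs-on-goods-imported-into-the-uk-catalog-entry"

example_quads = [
    (DATASET, RDF_TYPE, PMDCAT_DATASET, G_META),
    (DATASET, RDF_TYPE, PMDCAT_DATASET, G_DATA),
]
duplicates = declared_in_multiple_graphs(example_quads)
```

An empty result would suggest each dataset is declared in exactly one graph, which is what the date checks appear to expect.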

The instructions for how this ought to work are in the PMD metadata instructions on the cogs-issues wiki although tbh they aren't very clear for the "dataset of arbitrary RDF" case we're working with here.

I think we want something like the following TriG. There's a metadata graph with the record and catalog entry, separate from the data graph. Note we're using the data graph URI as the dataset contents, describing this as a pmdcat:GraphDatasetContents (with the description also in the metadata graph):

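# Prefix declarations; the pmdcat namespace is assumed and should be checked against the PMD vocabulary.
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix pmdcat: <http://publishmydata.com/pmdcat#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://gss-data.org.uk/graph/trade/tariffs-on-goods-imported-into-the-uk-metadata> {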
  <http://gss-data.org.uk/data/trade/tariffs-on-goods-imported-into-the-uk-record> a dcat:CatalogRecord;
    foaf:primaryTopic <http://gss-data.org.uk/data/trade/tariffs-on-goods-imported-into-the-uk-catalog-entry>;
    pmdcat:metadataGraph <http://gss-data.org.uk/graph/trade/tariffs-on-goods-imported-into-the-uk-metadata>;
    rdfs:label "Tariffs on goods imported into the UK";
    dcterms:issued "2021-04-26T13:49:39.547Z"^^xsd:dateTime;
    dcterms:modified "2021-04-26T13:49:39.547Z"^^xsd:dateTime;
    .

  <http://gss-data.org.uk/data/trade/tariffs-on-goods-imported-into-the-uk-catalog-entry>
    a pmdcat:Dataset;
    pmdcat:datasetContents <http://gss-data.org.uk/graph/trade/tariffs-on-goods-imported-into-the-uk>;
    pmdcat:graph <http://gss-data.org.uk/graph/trade/tariffs-on-goods-imported-into-the-uk> ;
    rdfs:label "Tariffs on goods imported into the UK";
    dcterms:issued "2021-04-26T13:49:39.547Z"^^xsd:dateTime;
    dcterms:modified "2021-04-26T13:49:39.547Z"^^xsd:dateTime;
    rdfs:comment "Use this service to check the UK Global Tariff that will apply to goods you import from 1 January 2021. You can also check the difference between what you pay now and what you’ll pay from 1 January 2021.";
    .

  <http://gss-data.org.uk/graph/trade/tariffs-on-goods-imported-into-the-uk> a pmdcat:GraphDatasetContents .
}

<http://gss-data.org.uk/graph/trade/tariffs-on-goods-imported-into-the-uk> {
  # contents of the dataset
}

JasonHowell commented 3 years ago

This needs to be reviewed by the data managers to establish whether it is still an issue.

Robsteranium commented 2 years ago

I realise there's quite a lot of room for improvement here but I wonder if we could at least prioritise getting the commodity codes reconciled so that we might demo a comparison of tariffs and trade flows? I think this sort of thing is vital to proving the worth of the whole endeavour!
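For what reconciling the commodity codes might involve, here is a minimal sketch. It assumes, and this is not stated anywhere in the thread, that the tariff data uses 10-digit codes while the trade data uses 8-digit codes sharing the same leading digits. The function names are hypothetical.

```python
import re


def to_cn8(code: str) -> str:
    """Strip separators and keep the first 8 digits of a commodity code."""
    digits = re.sub(r"\D", "", code)
    if len(digits) < 8:
        raise ValueError(f"code too short to reconcile: {code!r}")
    return digits[:8]


def reconcile(tariff_codes, trade_codes):
    """Split tariff codes into those with and without a matching trade code."""
    trade = {to_cn8(c) for c in trade_codes}
    matched = {}
    unmatched = []
    for code in tariff_codes:
        key = to_cn8(code)
        if key in trade:
            matched[code] = key
        else:
            unmatched.append(code)
    return matched, unmatched
```

The unmatched list is probably the more useful output for a first pass, since it shows how much of the tariff schedule cannot yet be compared with trade flows.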