GSS-Cogs / family-trade


HMRC-trade-in-goods-by-business-characteristics #8

Open ajtucker opened 4 years ago

ajtucker commented 4 years ago
LPerryman commented 4 years ago
IeuanMan commented 4 years ago

@david-hull

https://docs.google.com/document/d/1lqBZg0z_N8dQoFoMeUZQx9RQAq9BO4TBk0HQhzDj88c/edit#heading=h.7783s8wimep8

BA Quality Assurance High Level Checks

Dataset: HMRC UK Trade in Goods Statistics by Business Characteristics 2015
Done by: David Hull
Date: 6-4-20

PMD V4
Is the dataset listed in PMD v4 staging? Yes
Does the dataset open? Yes
Is there descriptive metadata on the PMD v4 staging landing page? No
Does transformed info seem to match the original? Yes
Contents Modified Date: 2020-03-10

PMD V3
Is the dataset listed in PMD V3? Yes
Does the dataset open? Yes
Is there descriptive metadata on the PMD v3 landing page? No
Modified Date: 26 Mar 2020

General
Is the dataset title meaningful? Yes
Do the filters work & look sensible? Yes
Does the structure look sensible? Yes

Overview
General comments on the dataset

Not the latest release: the latest release is 2018, which is accessible from the website landing page. Title on PMDv4 and PMDv3 is “UK Trade in Goods by Business Characteristics - Experimental Statistics”. Normally wouldn’t expect a date in the title, i.e. 2015, rather a generic title to enable future updates.

Metadata
Is the Title, Publisher, Contact, Date Issued, description consistent with the source data?

There is no metadata available. Short title listed in “about” but doesn’t appear in “explore cube” or main PMD page. No landing page or other info, etc. Metadata is present in airtable.

Dimensions
Do the column titles look sensible? Do the items look sensible? Does it match the original data?

Overall, column titles and items look sensible and do seem to match the original data; however, I think the columns “Count of Businesses” and “Count of Employees” could be hidden, as there doesn’t seem to be any data displayed in them. Limited filters, i.e. no filter available on “age of business” or “Product”.

Observations
Is the PMD data consistent with the source data? Number of significant figures appropriate? Do they look sensible?

Overall, data seems consistent, to scale and looks sensible.

ajtucker commented 4 years ago

Reopening - Jenkins was closing these when the pipeline succeeded.

LPerryman commented 4 years ago

2018 data now available but was in a different format so pipeline had to be redeveloped. Left in column 'HMRC Trade Statistic Type' but could probably be removed.

Pipeline working in Jenkins but getting red screen of death on PMD3

JasonHowell commented 4 years ago

Can see this on PMDv4, is it ready to be reviewed by BA's?

JasonHowell commented 4 years ago

Needs a quick BA review. Then hopefully "good to go"

JasonHowell commented 4 years ago

BA's confirmed on today's stand-up "Good to go".

LPerryman commented 3 years ago

Published on PMD4

Robsteranium commented 3 years ago

Looks like the markers aren't quite right, e.g. this observation uses http://gss-data.org.uk/def/concept/cogs-markers/Suppressed instead of http://gss-data.org.uk/def/concept/cogs-markers/suppressed so the cube viewer shows a URI instead of a label.

This was missed by the PMD and QB validations as sdmxa:obsStatus is an attribute property. Perhaps we ought to add another validation? Something like "attribute values must have labels" would be too broad (they're not necessarily resources). We could check that "a coded-property's values must come from its codelist" (extending the similar dimension property validations to attributes). Even with the typo corrected, this would fail because cogs:suppressed isn't in the sdmx obsStatus codelist so we'd need to extend sdmxa:obsStatus qb:codeList <http://gss-data.org.uk/def/trade/concept-scheme/marker>, <http://gss-data.org.uk/def/concept-scheme/cogs-markers> or sub-property it.
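A minimal sketch of the proposed check ("a coded-property's values must come from its codelist", applied to an attribute) might look like the following. It uses plain Python dictionaries rather than a triple store; the observation URIs are hypothetical, and both the codelist declaration for sdmxa:obsStatus and the membership of the cogs-markers scheme are assumptions made for illustration, not the actual state of the data.

```python
# Sketch only: checks that each coded attribute value belongs to one of the
# codelists declared for its property. All data below is illustrative.

OBS_STATUS = "http://purl.org/linked-data/sdmx/2009/attribute#obsStatus"
COGS_MARKERS = "http://gss-data.org.uk/def/concept-scheme/cogs-markers"
MARKER = "http://gss-data.org.uk/def/concept/cogs-markers/"

# Assumption: obsStatus has been given the cogs-markers scheme via qb:codeList.
property_codelists = {OBS_STATUS: {COGS_MARKERS}}

# Assumption: the scheme contains the lower-case "suppressed" concept.
scheme_members = {COGS_MARKERS: {MARKER + "suppressed"}}

# Hypothetical observations: one with the mis-cased marker, one corrected.
observations = [
    {"uri": "http://example.org/obs/1", OBS_STATUS: MARKER + "Suppressed"},
    {"uri": "http://example.org/obs/2", OBS_STATUS: MARKER + "suppressed"},
]


def find_codelist_violations(observations, property_codelists, scheme_members):
    """Return (observation, property, value) for values outside the codelists."""
    violations = []
    for obs in observations:
        for prop, schemes in property_codelists.items():
            if prop not in obs:
                continue
            # Union of all concepts in every codelist declared for the property.
            allowed = set().union(*(scheme_members.get(s, set()) for s in schemes))
            if obs[prop] not in allowed:
                violations.append((obs["uri"], prop, obs[prop]))
    return violations


violations = find_codelist_violations(observations, property_codelists, scheme_members)
for obs_uri, prop, value in violations:
    print(f"{obs_uri}: {value} is not in a codelist declared for {prop}")
```

Because URIs are compared as exact strings, the mis-cased `Suppressed` marker is reported while the corrected one passes. If the cogs-markers scheme were not declared for the property, both observations would be reported, which matches the point above about needing to extend the codelist.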

I'd also suggest the "2019 - data tables" bit be removed from the title.

Robsteranium commented 3 years ago

That observation also has 2 undefined properties which appear to have URIs assigned by csvw (i.e. the columns didn't have csvw:propertyUrl defined):

Should these be measures in a multi-measure cube?
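As a rough illustration of how such columns could be spotted before publishing, the sketch below scans a CSVW metadata document for output columns with no `propertyUrl`. The metadata, column names and example URLs are all hypothetical, and the check is deliberately simplified: it ignores a `propertyUrl` inherited from the schema or table level.

```python
import json

# Hypothetical CSVW metadata; column names and example.org URLs are made up.
metadata_json = """
{
  "@context": "http://www.w3.org/ns/csvw",
  "url": "observations.csv",
  "tableSchema": {
    "columns": [
      {"name": "value", "titles": "Value",
       "propertyUrl": "http://example.org/def/measure/value"},
      {"name": "marker", "titles": "Marker",
       "propertyUrl": "http://purl.org/linked-data/sdmx/2009/attribute#obsStatus"},
      {"name": "extra_count", "titles": "Extra Count"},
      {"name": "notes", "titles": "Notes", "suppressOutput": true}
    ]
  }
}
"""


def columns_missing_property_url(metadata):
    """List names of output columns that define no propertyUrl of their own."""
    schema = metadata.get("tableSchema", {})
    missing = []
    for column in schema.get("columns", []):
        # Columns with suppressOutput produce no triples, so skip them.
        if column.get("suppressOutput", False):
            continue
        if "propertyUrl" not in column:
            missing.append(column.get("name"))
    return missing


missing = columns_missing_property_url(json.loads(metadata_json))
print(missing)
```

Any column reported here would be given a default predicate by the CSVW-to-RDF conversion, which is the likely source of undefined properties like those noted above.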

ajtucker commented 3 years ago

The workaround for attributes with literal values seems to be broken: https://ci.floop.org.uk/job/GSS_data/job/Trade/job/HMRC-trade-in-goods-by-business-characteristics/202/console#:~:text=%3Cipython-input-1-7cc4a609d99b%3E

GSS-Cogs/gss-utils#253 should fix this, at which point the workaround needs removing.

JasonHowell commented 3 years ago

Closing as this dataset has been published to Beta env.

Robsteranium commented 3 years ago

@JasonHowell several of the above points haven't yet been resolved/addressed and we don't appear to be tracking them elsewhere.

JasonHowell commented 3 years ago

@rossbowen based on the comments above from Robin, could you detail what actions are required to be taken forward?

Reopening as per Robin's comments; I had understood that the dataset had been published and the observations addressed.