Open ajtucker opened 4 years ago
@david-hull
BA Quality Assurance High Level Checks
Dataset: HMRC UK Trade in Goods Statistics by Business Characteristics 2015
Done by: David Hull
Date: 6-4-20

PMD V4
Is the dataset listed in PMD v4 staging? Yes
Does the dataset open? Yes
Is there descriptive metadata on the PMD v4 staging landing page? No
Does transformed info seem to match the original? Yes
Contents Modified Date: 2020-03-10

PMD V3
Is the dataset listed in PMD V3? Yes
Does the dataset open? Yes
Is there descriptive metadata on the PMD v3 landing page? No
Modified Date: 26 Mar 2020

General
Is the dataset title meaningful? Yes
Do the filters work & look sensible? Yes
Does the structure look sensible? Yes

Overview
General comments on the dataset
Not the latest release - the latest release is 2018, which is accessible from the website landing page. Title on PMDv4 and PMDv3 is “UK Trade in Goods by Business Characteristics - Experimental Statistics”. Normally wouldn’t expect a date in the title, i.e. 2015, rather a generic title to enable future updates.

Metadata
Is the Title, Publisher, Contact, Date Issued, description consistent with the source data?
There is no metadata available. Short title listed in “about” but doesn’t appear in “explore cube” or main PMD page. No landing page or other info, etc. Metadata is present in airtable.

Dimensions
Do the column titles look sensible? Do the items look sensible? Does it match the original data?
Overall, column titles and items look sensible and do seem to match the original data; however, I think the columns “Count of Businesses” and “Count of Employees” could be hidden, as there doesn’t seem to be any data displayed in them. Limited filters, i.e. no filter available on “age of business” or “Product”.

Observations
Is the PMD data consistent with the source data? Number of significant figures appropriate? Do they look sensible?
Overall, data seems consistent, to scale and looks sensible.
Reopening - Jenkins was closing these when the pipeline succeeded.
2018 data now available but was in a different format so pipeline had to be redeveloped. Left in column 'HMRC Trade Statistic Type' but could probably be removed.
Pipeline working in Jenkins but getting red screen of death on PMD3
Can see this on PMDv4, is it ready to be reviewed by BAs?
Needs a quick BA review. Then hopefully "good to go"
BAs confirmed on today's stand-up: "Good to go".
Published on PMD4
Looks like the markers aren't quite right, e.g. this observation uses http://gss-data.org.uk/def/concept/cogs-markers/Suppressed instead of http://gss-data.org.uk/def/concept/cogs-markers/suppressed so the cube viewer shows a URI instead of a label.
This was missed by the PMD and QB validations as sdmxa:obsStatus is an attribute property. Perhaps we ought to add another validation? Something like "attribute values must have labels" would be too broad (they're not necessarily resources). We could check that "a coded-property's values must come from its codelist" (extending the similar dimension property validations to attributes). Even with the typo corrected, this would fail because cogs:suppressed isn't in the sdmx obsStatus codelist, so we'd need to either extend the codelists (sdmxa:obsStatus qb:codeList <http://gss-data.org.uk/def/trade/concept-scheme/marker>, <http://gss-data.org.uk/def/concept-scheme/cogs-markers>) or sub-property it.
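For illustration, the proposed "a coded-property's values must come from its codelist" check could be sketched roughly as below. This is not the existing PMD or QB validation code: it models the cube as a plain set of string triples, the function name and the http://example.org/ observation URIs are hypothetical, and it assumes marker concepts declare skos:inScheme against the cogs-markers concept scheme.

```python
# Rough sketch of a codelist check for coded attribute properties.
# The cube is modelled as a set of (subject, predicate, object) strings;
# a real validation would more likely be a SPARQL query over the store.
QB_CODELIST = "http://purl.org/linked-data/cube#codeList"
SKOS_IN_SCHEME = "http://www.w3.org/2004/02/skos/core#inScheme"
OBS_STATUS = "http://purl.org/linked-data/sdmx/2009/attribute#obsStatus"


def values_outside_codelist(triples, coded_property):
    """Return sorted (observation, value) pairs whose value is not a
    member of any codelist declared for coded_property."""
    codelists = {o for s, p, o in triples
                 if s == coded_property and p == QB_CODELIST}
    members = {s for s, p, o in triples
               if p == SKOS_IN_SCHEME and o in codelists}
    return sorted((s, o) for s, p, o in triples
                  if p == coded_property and o not in members)


MARKERS = "http://gss-data.org.uk/def/concept-scheme/cogs-markers"
SUPPRESSED = "http://gss-data.org.uk/def/concept/cogs-markers/suppressed"
TYPO = "http://gss-data.org.uk/def/concept/cogs-markers/Suppressed"

triples = {
    # Assumes the codelist has already been extended as suggested above.
    (OBS_STATUS, QB_CODELIST, MARKERS),
    # Assumption: the marker concept is declared in the scheme.
    (SUPPRESSED, SKOS_IN_SCHEME, MARKERS),
    # Hypothetical observations: one correct, one with the capitalised typo.
    ("http://example.org/obs/1", OBS_STATUS, SUPPRESSED),
    ("http://example.org/obs/2", OBS_STATUS, TYPO),
}

failures = values_outside_codelist(triples, OBS_STATUS)
print(failures)
```

With the codelist extended, only the observation using the capitalised URI is reported; without the qb:codeList triple, both observations would fail, which matches the point about cogs:suppressed not being in the sdmx obsStatus codelist.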
I'd also suggest the "2019 - data tables" bit be removed from the title.
That observation also has 2 undefined properties which appear to have URIs assigned by csvw (i.e. the columns didn't have csvw:propertyUrl defined):
Should these be measures in a multi-measure cube?
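A quick way to spot columns like these before publishing might be to scan the CSVW metadata for columns with no explicit propertyUrl. The sketch below is only an assumption about how such a check could look: the function name, the column names extra_a and extra_b, and the example.org property URL are hypothetical, and it ignores propertyUrl values inherited from the table or schema level.

```python
import json


def columns_missing_property_url(metadata):
    """Return the names of output columns with no explicit propertyUrl.

    Accepts either a single-table description or a table group
    (a top-level "tables" array). Inherited propertyUrl is not considered.
    """
    tables = metadata.get("tables", [metadata])
    missing = []
    for table in tables:
        for column in table.get("tableSchema", {}).get("columns", []):
            # Suppressed columns produce no triples, so skip them.
            if column.get("suppressOutput"):
                continue
            if "propertyUrl" not in column:
                missing.append(column.get("name") or column.get("titles"))
    return missing


# Hypothetical metadata: two columns were left without a propertyUrl.
example = json.loads("""
{
  "@context": "http://www.w3.org/ns/csvw",
  "url": "observations.csv",
  "tableSchema": {
    "columns": [
      {"name": "value", "titles": "Value",
       "propertyUrl": "http://example.org/def/measure/value"},
      {"name": "extra_a", "titles": "Extra A"},
      {"name": "extra_b", "titles": "Extra B"},
      {"name": "notes", "titles": "Notes", "suppressOutput": true}
    ]
  }
}
""")

missing = columns_missing_property_url(example)
print(missing)
```

Any column reported here would get a default property URI derived from the table URL and column name, which is the sort of undefined property described above.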
The workaround for attributes with literal values seems to be broken: https://ci.floop.org.uk/job/GSS_data/job/Trade/job/HMRC-trade-in-goods-by-business-characteristics/202/console#:~:text=%3Cipython-input-1-7cc4a609d99b%3E
GSS-Cogs/gss-utils#253 should fix this, at which point the workaround needs removing.
Closing as this dataset has been published to Beta env.
@JasonHowell several of the above points haven't yet been resolved/addressed and we don't appear to be tracking them elsewhere.
@rossbowen based on the comments above from Robin, could you detail what actions are required to be taken forward?
Reopening as per Robin's comments as I understood the dataset had been published and observations addressed.
observations.csv-metadata.trig
CSVWMetadata('https://gss-cogs.github.io/family-trade/reference/')