GSS-Cogs / family-trade

ONS-UK-SA-Trade-in-goods #2

Closed ajtucker closed 3 years ago

ajtucker commented 4 years ago

Feedback from DIT:

some of the geography labels don't match the original spreadsheet. Most notably "Total EU 28" is labelled as "ACP" in the GSS data and "Total Extra EU 28 (Rest of World)" is labelled as "Extra EU-25 - world less EU-25 countries"

ajtucker commented 4 years ago
IeuanMan commented 4 years ago

@david-hull

https://docs.google.com/document/d/1YutxOExPSV5w4tSZ1jH20SXzOZLwQQ9GqUj-XWzSW7c/edit

BA Quality Assurance High Level Checks
Dataset: Trade in goods: all countries, seasonally adjusted
Done by: David Hull
Date: 6-4-20

PMD V4
Is the dataset listed in PMD v4 staging? Yes
Does the dataset open? Yes
Is there descriptive metadata on the PMD v4 staging landing page? Yes
Does transformed info seem to match the original? Yes
Contents Modified Date: 2020-03-10

PMD V3
Is the dataset listed in PMD V3? Yes
Does the dataset open? Yes
Is there descriptive metadata on the PMD v3 landing page? Yes
Modified Date: 26 Mar 2020

General
Is the dataset title meaningful? Yes
Do the filters work & look sensible? Mostly
Does the structure look sensible? Yes

Overview
General comments on the dataset

Website landing page only has the current release. Very large XLS spreadsheet but with simple structure and dimensions.

Metadata
Is the Title, Publisher, Contact, Date Issued, description consistent with the source data?

Metadata is present in the “about” page. No link available to the original website landing page. Only a generic email address in contact detail, whereas a specific contact name is present in airtable.

Dimensions
Do the column titles look sensible? Do the items look sensible? Does it match the original data?

No filter available for “Reference Period”. Looks odd to me to have a “GBP Total” column of its own with “GBP total” the main (if not only) entry in the “measure type”.

Observations
Is the PMD data consistent with the source data? Number of significant figures appropriate? Do they look sensible?

Overall, data seems consistent, to scale and looks sensible.

ajtucker commented 4 years ago

VCard style contact details need to be added to gss-utils GSS-Cogs/csvcubed#504.

Shannon95 commented 4 years ago

Comments

Shannon95 commented 4 years ago

Comments on issues fixed and what is remaining:

BA comments with response:

Only a generic email address in contact detail, whereas a specific contact name is present in airtable. : Current limitation, as the email is just scraped at the moment; from my understanding this is being looked into for future dev.

No link available to the original website landing page. : From my understanding this is currently not a feature.

All technical issues noted have been fixed on the DE side, and the filter issues on the Swirl side. Moving to Needs Sign off.

JasonHowell commented 4 years ago

BAs have confirmed it is good to go in its current state.

JasonHowell commented 4 years ago

Closing issue as it has now been deployed onto PMDv4.

canwaf commented 3 years ago

Rewritten and simplified the mechanism to convert string periods using pandas' to_datetime functionality. Added the datamarker which is applicable for the entire dataset. N/A values are left as is and not defined in the datamarker field, because the dataset doesn't say why they are not applicable.
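For anyone picking this up later, here is a minimal sketch of the kind of period conversion described above. It is not the code from the pipeline: the input shapes ("1998JAN", "2019Q4", "2019"), the output fragments ("month/1998-01" and so on) and the function name `period_to_fragment` are all assumptions made for illustration.

```python
import pandas as pd


def period_to_fragment(period: str) -> str:
    """Convert a period string into a hypothetical period fragment.

    Assumed input shapes (not confirmed against the source spreadsheet):
    "2019" (year), "2019Q4" (quarter), "1998JAN" (month).
    """
    cleaned = period.strip().upper().replace(" ", "")
    if len(cleaned) == 4 and cleaned.isdigit():
        return f"year/{cleaned}"
    if "Q" in cleaned:
        year, quarter = cleaned.split("Q")
        return f"quarter/{year}-Q{quarter}"
    # Title-case the month abbreviation ("1998JAN" -> "1998Jan") before
    # handing it to pandas with an explicit format.
    parsed = pd.to_datetime(cleaned.title(), format="%Y%b")
    return f"month/{parsed.strftime('%Y-%m')}"


if __name__ == "__main__":
    for example in ["2019", "2019Q4", "1998JAN"]:
        print(example, "->", period_to_fragment(example))
```

Passing an explicit `format` means an unexpected period string raises an error rather than being guessed at.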

LPerryman commented 3 years ago

Changed Marker to represent not-applicable for countries that data is not collected for and added more information to the metadata description. Published on PMD4 but probably needs more changes to the metadata.

LPerryman commented 3 years ago

PMD4 Trade in goods: all countries, seasonally adjusted https://staging.gss-data.org.uk/cube/explore?uri=http%3A%2F%2Fgss-data.org.uk%2Fdata%2Fgss_data%2Ftrade%2Fons-uk-sa-trade-in-goods-catalog-entry

rossbowen commented 3 years ago

Build failing atm.

mikeAdamss commented 3 years ago

was poking around the failed builds to make sure the databaker changes didn't break anything. thought I'd leave a note on this as it's a confusing one due to the previous issues with the ONS scraper.

previously, the scraper was calling latest=True but essentially returning a random distribution, if you look here: https://ci.floop.org.uk/job/GSS_data/job/Trade/job/ONS-UK-SA-Trade-in-goods/106/artifact/datasets/ONS-UK-SA-Trade-in-goods/out/trace.json (the build that actually worked) you can see from the source field that it was actually using the oldest (v1) distribution of the data.

so the good news - it's bringing in the latest distribution properly now. The bad news - it's broke because the transform needs to be updated to work with the newer (and more complex) distribution.
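To illustrate the difference between "a random distribution" and "the latest distribution" described above, a rough sketch follows. It does not use the scraper's real API: the `Distribution` class, its `issued` and `download_url` fields, and the `latest_distribution` helper are hypothetical stand-ins, and the URLs are placeholders.

```python
from dataclasses import dataclass
from datetime import date
from typing import Sequence


@dataclass(frozen=True)
class Distribution:
    """Hypothetical stand-in for a scraped distribution record."""
    issued: date
    download_url: str


def latest_distribution(distributions: Sequence[Distribution]) -> Distribution:
    """Pick the distribution with the most recent issued date.

    Selecting explicitly by date avoids depending on whatever order
    the distributions happened to be scraped in.
    """
    if not distributions:
        raise ValueError("no distributions found")
    return max(distributions, key=lambda d: d.issued)


if __name__ == "__main__":
    found = [
        Distribution(date(2020, 3, 10), "https://example.org/v2.xls"),
        Distribution(date(2019, 11, 11), "https://example.org/v1.xls"),
        Distribution(date(2021, 2, 12), "https://example.org/v3.xls"),
    ]
    print(latest_distribution(found).download_url)
```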

mikeAdamss commented 3 years ago

working now. Newest version of the data had data markers which the older script was unceremoniously dropping in the background (hence odd exceptions). The marker was N/A which I've taken as not-applicable but it should be easy to tweak if you're using something else these days.
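A small sketch of the marker handling described above, for reference. It is not the transform itself: the column names `Value` and `Marker`, the helper name `split_markers`, and the assumption that "N/A" is the only marker in the data are illustrative choices.

```python
import pandas as pd

# Assumed mapping from the marker text in the spreadsheet to the
# marker value used in the output; tweak if another term is in use.
MARKER_MAP = {"N/A": "not-applicable"}


def split_markers(raw: pd.Series) -> pd.DataFrame:
    """Separate numeric observations from data markers.

    Cells holding a known marker get an empty value and a marker label.
    Any other non-numeric cell raises, so unexpected markers are
    reported instead of being dropped silently.
    """
    text = raw.astype(str).str.strip().str.upper()
    is_marker = text.isin(list(MARKER_MAP))
    values = pd.to_numeric(raw.where(~is_marker))  # raises on unknown text
    markers = text.where(is_marker).map(MARKER_MAP).fillna("")
    return pd.DataFrame({"Value": values, "Marker": markers})


if __name__ == "__main__":
    print(split_markers(pd.Series(["1.5", "N/A", 3])))
```

Leaving `pd.to_numeric` on its default error handling is deliberate here: a new marker in a later release fails the build loudly instead of disappearing in the background.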