datagovuk / ckanext-dgu

CKAN extension for data.gov.uk
http://data.gov.uk/

Organograms - Data wiped from CKAN dataset #540

Closed davidread closed 7 years ago

davidread commented 7 years ago

The CKAN dataset for CO Organograms had all its resources deleted a couple of days ago (by the Drupal user, 29 Nov 2016 11:38). Needs investigating and restoring, please.

https://data.gov.uk/dataset/organogram-cabinet-office

davidread commented 7 years ago

Now FCO has had its data wiped, apart from the latest one. @ratajczak this really needs looking at.

https://data.gov.uk/dataset/organogram-foreign-and-commonwealth-office

ratajczak commented 7 years ago

I restored both of them and added extra logging.

davidread commented 7 years ago

I'll close it and reopen if it happens again

davidread commented 7 years ago

This has happened again: https://data.gov.uk/dataset/organogram-department-for-international-development

This is a serious issue now.

ratajczak commented 7 years ago

I've restored this dataset, but unfortunately I've found that the extra logging didn't work. I'll investigate this and make sure that it works next time. Are you able to identify the request which caused this wipe-out in the CKAN logs?

davidread commented 7 years ago

Yes, all the resources were wiped by frontend4 (Drupal's user) on 20 Dec 2016 14:11. And then all the 3 resources for the 30/09/2016 organogram were added in 3 commits 20 Dec 2016 14:12. This tallies with the upload step (and possibly the sign-off and publish steps too), so do see if that sheds any light.

davidread commented 7 years ago

This has happened now with DfT: https://data.gov.uk/dataset/organogram-department-for-transport This is very serious!

ratajczak commented 7 years ago

I'll restore this dataset tonight. Could you please tell me when it happened? I should be able to find a clue in the logs.

On 12 January 2017 at 15:53, David Read notifications@github.com wrote:

This has happened now with DfT: https://data.gov.uk/dataset/organogram-department-for-transport This is very serious!

davidread commented 7 years ago

I detailed the previous (DfID) one, but you can see for yourself using the history: http://guidance.data.gov.uk/history.html i.e. https://data.gov.uk/dataset/history/organogram-department-for-transport

davidread commented 7 years ago

This has happened again: https://data.gov.uk/dataset/organogram-cabinet-office Please restore

davidread commented 7 years ago

And this one: https://data.gov.uk/dataset/organogram-environment-agency

davidread commented 7 years ago

Another: https://data.gov.uk/organogram/manage/independent-police-complaints-commission Please can you review them all?

ratajczak commented 7 years ago

I haven't found exactly what's going on, but I'm able to reproduce this. It looks like these datasets have been corrupted by user 'script'. When I upload a new organogram to any dataset previously modified by 'script', its resources get wiped out. I just tested this here: https://test.data.gov.uk/dataset/organogram-department-for-communities-and-local-government

ratajczak commented 7 years ago

I believe that all datasets are affected by this. Until this is fixed, I have revoked permission to edit organograms on production from all users, to avoid more datasets being wiped out.

ratajczak commented 7 years ago

Just to clarify, I'll try to fix this over this weekend

ratajczak commented 7 years ago

This is fixed and deployed. It was related to a wrong date format. I've also fixed all datasets listed here and restored data publishers' permissions to edit organograms.

davidread commented 7 years ago

Thanks for finding the problem and doing the fix.

However there are plenty of datasets still affected. The script was run 18th October and all those that have published since are:

arts-and-humanities-research-council
attorney-generals-office
cabinet-office
care-quality-commission
centre-for-environment-fisheries-aquaculture-science
consumer-council-for-water
defence-science-and-technology-laboratory
department-for-education
department-for-environment-food-and-rural-affairs
department-for-international-development
department-for-transport
economic-and-social-research-council
engineering-and-physical-sciences-research-council
environment-agency
foreign-and-commonwealth-office
government-legal-department
her-majestys-revenue-and-customs
hm-crown-prosecution-service-inspectorate
human-fertilisation-and-embryology-authority
independent-police-complaints-commission
joint-nature-conservation-committee
medical-research-council
national-army-museum
natural-england
nhs-blood-and-transplant
office-of-rail-and-road
rural-payments-agency
single-source-regulations-office
student-loans-company-limited
the-british-library
transport-focus
united-kingdom-hydrographic-office
valuation-office-agency
veterinary-medicines-directorate
water-services-regulation-authority

csvsql ../ckan/organograms_public.csv --query "select publisher_name from organograms_public where strftime('%s', publish_date) > strftime('%s', '2016-10-17') group by publisher_name;"

So could you ensure they are all restored please?

Also, something strange is going on with the DCLG one, which needs restoring too: https://data.gov.uk/dataset/organogram-department-for-communities-and-local-government It looks like you did something to it on Saturday that made it all disappear - I think this must have been due to your testing because you mentioned the version on test.
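For anyone without csvkit to hand, the csvsql query above could be approximated in plain Python. This is only a sketch: the column names `publisher_name` and `publish_date` come from the query, but the function name, the sample rows and the assumption that `publish_date` is an ISO-style string (`YYYY-MM-DD`, optionally followed by a time) are mine, not taken from the real organograms_public.csv.

```python
import csv
import io
from datetime import datetime


def publishers_since(csv_file, cutoff):
    """Return the sorted, de-duplicated publisher names whose publish_date
    is strictly later than the cutoff (midnight on the cutoff date)."""
    cutoff_dt = datetime.fromisoformat(cutoff)
    names = set()
    for row in csv.DictReader(csv_file):
        try:
            published = datetime.fromisoformat(row["publish_date"].strip())
        except ValueError:
            # Unparseable or empty dates are skipped, much as SQLite's
            # strftime() yields NULL and the row fails the comparison.
            continue
        if published > cutoff_dt:
            names.add(row["publisher_name"])
    return sorted(names)


# Made-up sample rows, purely for illustration.
SAMPLE = """publisher_name,publish_date
cabinet-office,2016-11-29 11:38:00
cabinet-office,2016-03-31
home-office,2016-06-30
department-for-transport,2017-01-12
"""

if __name__ == "__main__":
    for name in publishers_since(io.StringIO(SAMPLE), "2016-10-17"):
        print(name)
```

To run it against the real file, open the CSV with `open(path, newline="")` and pass the file object in place of the `io.StringIO` sample.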

ratajczak commented 7 years ago

I'll restore the datasets listed above, but I'm afraid that the problem may be wider. The wrong date format is an error introduced on 3 Sept 2016 and is not related to modification by 'script'. It just happened that all the wiped-out datasets listed here were modified by this user, so this was my guess. The legacy resources added by 'script' also had a date format issue, but it would cause only those resources to disappear on dataset update; this is now fixed too. But I suspect that other datasets updated after 2 Sept are also affected. Can you get a list of these datasets from the CKAN database?
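As a general illustration of guarding against this kind of date-format mismatch (not the actual fix that was deployed, which isn't shown in this thread), dates can be normalised to a single format before they are compared or stored. The sketch below is hypothetical: the function name and the list of accepted formats are assumptions, chosen only because both styles (`30/09/2016` and `2016-10-17`) appear in this thread.

```python
from datetime import datetime

# Formats accepted by this sketch (an assumption, not the site's real list).
KNOWN_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def normalise_date(text):
    """Return the date as an ISO string (YYYY-MM-DD), trying each known format.

    Raises ValueError for anything unrecognised, so a bad date fails loudly
    instead of being silently treated as "no match".
    """
    cleaned = text.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError("unrecognised date format: %r" % text)


if __name__ == "__main__":
    print(normalise_date("30/09/2016"))
    print(normalise_date("2016-09-30"))
```

Raising on an unrecognised date, rather than skipping it, means a mismatch shows up as an error at upload time rather than as missing resources later.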

davidread commented 7 years ago

This is the list from 2nd Sept:

advisory-conciliation-and-arbitration-service
air-command
animal-and-plant-health-agency
animal-health-and-veterinary-laboratories-agency
appointments-commission
army-command
arts-and-humanities-research-council
arts-council-england
asset-protection-agency
attorney-generals-office
audit-commission
big-lottery-fund
biotechnology-and-biological-sciences-research-council
boundary-commission-for-england
boundary-commission-for-scotland
british-tourist-authority
british-transport-police-authority
cabinet-office
capital-for-enterprise-ltd
care-quality-commission
central-top-level-budget
centre-for-environment-fisheries-aquaculture-science
child-maintenance-and-enforcement-commission
children-and-family-court-advisory-and-support-service
childrens-workforce-development-council
civil-nuclear-police-authority
civil-service-learning
coal-authority
committee-on-climate-change
competition-commission
construction-industry-training-board
consumer-council-for-water
consumer-futures
council-for-healthcare-regulatory-excellence
criminal-cases-review-commission
criminal-injuries-compensation-authority
criminal-records-bureau
crown-prosecution-service
defence-equipment-and-support
defence-infrastructure-organisation
defence-science-and-technology-laboratory
defence-support-group
department-for-business-innovation-and-skills
department-for-communities-and-local-government
department-for-culture-media-and-sport
department-for-education
department-for-environment-food-and-rural-affairs
department-for-international-development
department-for-transport
department-for-work-and-pensions
department-of-education
department-of-energy-and-climate-change
department-of-health
driver-and-vehicle-licensing-agency
driver-vehicle-standards-agency
driving-standards-agency
economic-and-social-research-council
engineering-and-physical-sciences-research-council
engineering-construction-industry-training-board
english-heritage
environment-agency
equality-and-human-rights-commission
fco-services
fire-service-college
foreign-and-commonwealth-office
forestry-commission
gambling-commission
gangmasters-licensing-authority
general-social-care-council
government-actuarys-department
government-equalities-office
government-legal-department
government-procurement-service
greater-manchester-csu
head-office-and-corporate-services-mod
health-and-safety-executive
health-education-england
health-protection-agency
her-majestys-revenue-and-customs
high-speed-2
higher-education-funding-council-for-england
highways-agency
hm-crown-prosecution-service-inspectorate
hm-inspectorate-of-constabulary
hm-inspectorate-of-prisions
hm-inspectorate-of-probation
hm-passport-office
hm-treasury
home-office
homes-and-communities-agency
horniman-public-museum-public-park-trust
horserace-betting-levy-board
human-fertilisation-and-embryology-authority
human-tissue-authority
identity-passport-service
imperial-war-museum
independent-housing-ombudsman
independent-living-fund
independent-offices
independent-police-complaints-commission
independent-safeguarding-authority
information-commissioners-office
infrastructure-planning-commission
intellectual-property-office
joint-forces-command
joint-nature-conservation-committee
judicial-appointments-commission
judicial-office
land-registry
learning-and-skills-improvement-service
leasehold-advisory-service
legal-aid-agency
legal-services-board
legal-services-commission
london-thames-gateway-development-corporation
marine-management-organisation
maritime-and-coastguard-agency
medical-research-council
medicines-and-healthcare-products-regulatory-agency
met-office
ministry-of-defence
ministry-of-justice
monitor
museum-of-science-and-industry
national-army-museum
national-college-for-leadership-of-schools-and-childrens-services
national-college-for-school-leadership
national-fraud-authority
national-heritage-memorial-fund
national-institute-for-health-and-care-excellence
national-lottery-commission
national-maritime-museum
national-measurement-office
national-museum-of-science-and-industry
national-museums-liverpool
national-offender-management-service
national-patient-safety-agency
national-policing-improvement-agency
national-portrait-gallery
national-treatment-agency-for-substance-misuse
natural-england
natural-environment-research-council
natural-history-museum
navy-command
nhs-arden-and-greater-east-midlands-csu
nhs-arden-csu
nhs-blood-and-transplant
nhs-business-services-authority
nhs-central-southern-csu
nhs-cheshire-and-merseyside-commissioning-support-unit
nhs-digital
nhs-england
nhs-greater-east-midlands-csu
nhs-information-centre-for-health-and-social-care
nhs-litigation-authority
nhs-midlands-and-lancashire-csu
nhs-north-of-england-csu
nhs-north-yorkshire-and-humber-csu
nhs-south-csu
nhs-south-west-csu
nhs-trust-development-authority
northern-ireland-human-rights-commission
northern-ireland-office
nuclear-decommissioning-authority
office-for-budget-responsibility
office-for-fair-access
office-for-national-statistics
office-for-standards-in-education-childrens-services-and-skills
office-of-qualifications-and-examinations-regulation
office-of-rail-and-road
office-of-the-advocate-general-of-scotland
office-of-the-childrens-commissioner
office-of-the-immigration-services-commissioner
olympic-delivery-authority
ordnance-survey
parades-commission-for-northern-ireland
parole-board
partnerships-for-schools
pensions-ombudsman
permanent-joint-headquarters
planning-inspectorate
prisons-and-probation-ombudsman
public-health-england
public-lending-right
qualifications-and-curriculum-development-agency
queen-elizabeth-ii-conference-centre
royal-air-force-museum
royal-armouries
royal-museums-greenwich
rural-payments-agency
science-and-technology-facilities-council
science-museum-group
scotland-office
scottish-government
scottish-law-commission
security-industry-authority
serious-fraud-office
serious-organised-crime-agency
single-source-regulations-office
sir-john-soanes-museum
skills-funding-agency
south-central-and-west-csu
sport-england
standards-and-testing-agency
standards-board-for-england
student-loans-company-limited
tate
tenant-services-authority
the-british-library
the-british-museum
the-disclosure-and-barring-service
the-food-and-environment-research-agency
the-geffrye-museum
the-insolvency-service
the-national-archives
the-national-gallery
the-national-museum-of-the-royal-navy
the-northern-lighthouse-board
the-pensions-advisory-service
the-pensions-regulator
the-wallace-collection
thurrock-thames-gateway-development-corporation
training-and-development-agency-for-schools
transport-focus
treasury-solicitors-department
trinity-house-lighthouse-service
uk-anti-doping
uk-border-agency
uk-commission-for-employment-and-skills
uk-debt-management-office
uk-sport
uk-statistics-authority
uk-supreme-court
united-kingdom-atomic-energy-authority
united-kingdom-hydrographic-office
valuation-office-agency
valuation-tribunal-service
vehicle-and-operator-services-agency
vehicle-certification-agency
veterinary-medicines-directorate
victoria-albert-museum
visit-england
wales-office
water-services-regulation-authority
west-and-south-yorkshire-and-bassetlaw-commissioning-support-unit
west-northamptonshire-development-corporation
young-peoples-learning-agency
youth-justice-board

csvsql ../ckan/organograms_public.csv --query "select publisher_name from organograms_public where strftime('%s', publish_date) > strftime('%s', '2016-09-01') group by publisher_name;"

You can prepend 'organogram-' to get the dataset name.
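As a small sketch of that naming rule, the snippet below turns publisher names into dataset names and into the dataset and history URLs in the form used earlier in this thread. The function names are made up for illustration, and it assumes every publisher follows the same 'organogram-' pattern.

```python
DATASET_BASE = "https://data.gov.uk/dataset/"
HISTORY_BASE = "https://data.gov.uk/dataset/history/"


def dataset_name(publisher_name):
    """Prepend 'organogram-' to a publisher name to get the dataset name."""
    return "organogram-" + publisher_name


def dataset_url(publisher_name):
    """URL of the publisher's organogram dataset page."""
    return DATASET_BASE + dataset_name(publisher_name)


def history_url(publisher_name):
    """URL of the dataset's revision history page."""
    return HISTORY_BASE + dataset_name(publisher_name)


if __name__ == "__main__":
    for publisher in ["cabinet-office", "department-for-transport"]:
        print(dataset_name(publisher))
        print(dataset_url(publisher))
        print(history_url(publisher))
```

The history URL is the handy one when reviewing a dataset, since it shows when the resources were removed and by which user.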

ratajczak commented 7 years ago

Thanks for that. I just checked over a dozen datasets from the above list and they were all fine, so it looks like only those modified by 'script' are affected. I'll restore them.

davidread commented 7 years ago

Sounds good

ratajczak commented 7 years ago

I've reviewed and restored those published since 18 October, as well as department-for-communities-and-local-government.

davidread commented 7 years ago

I should have closed this when it was fixed on 21 Jan.