User Story
In order to be able to reset/restart a harvest source, data.gov admins want a clear function/API route.
Acceptance Criteria
[ ] GIVEN the /harvest/{id}/clear API route is created \
WHEN I call the harvest API at /harvest/{id}/clear \
THEN that harvest source's datasets are removed from CKAN \
AND that source's dataset records, errors, and jobs are removed from the harvest DB
Background
A clear route is very helpful for testing, and occasionally useful for resetting a harvest source that has become corrupted or out of sync. It is similar to CKAN's clear functionality.
Security Considerations (required)
The route should require authentication, but no additional security measures are required.
Sketch
Eventually the CKAN removal step may take so long (i.e. longer than the restart time [15 minutes?]) that we'll want to implement it as a subtask. For now, just use the API normally.
Simply try to run a CKAN dataset purge, then run the DB delete/clear commands. Ideally, if everything is synced correctly, removing the harvest jobs should cascade to delete the other objects linked by foreign keys, but this might require config changes or workarounds.
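
The two steps above could look roughly like the sketch below. It is only an illustration under stated assumptions: the table and column names (`harvest_job`, `harvest_record`, `harvest_error`, `ckan_id`), the `make_db` helper, and the `purge_dataset` callable are all hypothetical, and an in-memory SQLite database stands in for the real harvest DB. In a real deployment `purge_dataset` would presumably wrap CKAN's `dataset_purge` API action. The SQLite `PRAGMA` is one example of the kind of config change that cascading deletes can require.

```python
import sqlite3
from typing import Callable

# Hypothetical schema: records and errors hang off jobs via cascading FKs.
SCHEMA = """
CREATE TABLE harvest_job (
    id TEXT PRIMARY KEY,
    harvest_source_id TEXT NOT NULL
);
CREATE TABLE harvest_record (
    id TEXT PRIMARY KEY,
    harvest_job_id TEXT NOT NULL REFERENCES harvest_job(id) ON DELETE CASCADE,
    ckan_id TEXT
);
CREATE TABLE harvest_error (
    id INTEGER PRIMARY KEY,
    harvest_job_id TEXT NOT NULL REFERENCES harvest_job(id) ON DELETE CASCADE,
    message TEXT
);
"""


def make_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the harvest DB."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys (and therefore cascades) when enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def clear_harvest_source(
    conn: sqlite3.Connection,
    source_id: str,
    purge_dataset: Callable[[str], None],
) -> dict:
    """Purge a source's datasets from CKAN, then delete its harvest DB rows."""
    ckan_ids = [
        row[0]
        for row in conn.execute(
            "SELECT r.ckan_id FROM harvest_record r "
            "JOIN harvest_job j ON j.id = r.harvest_job_id "
            "WHERE j.harvest_source_id = ? AND r.ckan_id IS NOT NULL",
            (source_id,),
        )
    ]

    # Step 1: remove the datasets from CKAN (assumed to wrap dataset_purge).
    for ckan_id in ckan_ids:
        purge_dataset(ckan_id)

    # Step 2: delete the jobs; records and errors follow via ON DELETE CASCADE.
    with conn:
        cur = conn.execute(
            "DELETE FROM harvest_job WHERE harvest_source_id = ?", (source_id,)
        )
    return {"datasets_purged": len(ckan_ids), "jobs_deleted": cur.rowcount}
```

Purging CKAN first means that if a purge call fails partway through, the harvest DB still holds the records needed to retry. That ordering is a design choice of this sketch, not something the issue specifies.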