GSA / data.gov

Main repository for the data.gov service
https://data.gov

Create Clear for Harvest2.0 Harvest source #4787

Open jbrown-xentity opened 2 weeks ago

jbrown-xentity commented 2 weeks ago

User Story

In order to be able to reset/restart a harvest source, data.gov admins want a clear function/API route.

Acceptance Criteria

[ACs should be clearly demoable/verifiable whenever possible. Try specifying them using BDD.]

Background

This would be very helpful for testing, and occasionally useful for resetting a harvest source that has become corrupted or out of sync. It is similar to the CKAN clear functionality.

Security Considerations (required)

The route should require authentication, but no additional security measures are required.

Sketch

Eventually the CKAN removal piece may become so cumbersome (i.e., take longer than the restart time [15 minutes?]) that we'll want to implement that piece as a subtask. For this instance, just use the API normally: try to run a CKAN dataset purge, then run the DB delete/clear commands. Ideally, if everything is synced correctly, you should be able to remove the harvest jobs and let the deletion flow through (cascade) to the other foreign objects, but this might require config changes or workarounds.
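A rough sketch of the flow described above, to make the discussion concrete. It is not the Harvest 2.0 implementation: the table and column names (`harvest_source`, `harvest_job`, `harvest_record`, `ckan_id`) and the function names are hypothetical, and SQLite stands in for the real database. The only external call is CKAN's `dataset_purge` action, which requires sysadmin credentials.

```python
import json
import sqlite3
import urllib.request
from typing import Callable, List


def purge_ckan_dataset(ckan_url: str, api_token: str, dataset_id: str) -> None:
    """Purge one dataset through the CKAN action API (dataset_purge)."""
    request = urllib.request.Request(
        f"{ckan_url}/api/3/action/dataset_purge",
        data=json.dumps({"id": dataset_id}).encode("utf-8"),
        headers={"Authorization": api_token, "Content-Type": "application/json"},
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError on an error status.
    with urllib.request.urlopen(request) as response:
        response.read()


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a hypothetical, simplified schema with cascading foreign keys."""
    # SQLite only enforces foreign keys when this pragma is on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE harvest_source (id TEXT PRIMARY KEY);
        CREATE TABLE harvest_job (
            id TEXT PRIMARY KEY,
            harvest_source_id TEXT NOT NULL
                REFERENCES harvest_source(id) ON DELETE CASCADE
        );
        CREATE TABLE harvest_record (
            id TEXT PRIMARY KEY,
            harvest_job_id TEXT NOT NULL
                REFERENCES harvest_job(id) ON DELETE CASCADE,
            ckan_id TEXT  -- NULL if the record never reached CKAN
        );
        """
    )


def clear_harvest_source(
    conn: sqlite3.Connection,
    source_id: str,
    purge: Callable[[str], None],
) -> List[str]:
    """Purge the source's CKAN datasets, then delete its jobs and records.

    The harvest source row itself is kept so the source can be re-harvested.
    """
    rows = conn.execute(
        """
        SELECT r.ckan_id
        FROM harvest_record r
        JOIN harvest_job j ON r.harvest_job_id = j.id
        WHERE j.harvest_source_id = ? AND r.ckan_id IS NOT NULL
        """,
        (source_id,),
    ).fetchall()
    ckan_ids = [row[0] for row in rows]

    # Purge CKAN first: if a purge fails, the exception propagates and the
    # DB rows are still there, so the clear can be retried.
    for ckan_id in ckan_ids:
        purge(ckan_id)

    # Deleting the jobs cascades to their records via ON DELETE CASCADE.
    conn.execute("DELETE FROM harvest_job WHERE harvest_source_id = ?", (source_id,))
    conn.commit()
    return ckan_ids
```

Passing the purge step in as a callable (for example `lambda ckan_id: purge_ckan_dataset(url, token, ckan_id)`) keeps the DB logic testable without a CKAN instance, and would make it easier to move the CKAN removal into a separate subtask later if it grows too slow.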