DataRescue Workflow

This guide describes the DataRescue workflow developed by the DataRefuge project and EDGI, used both at in-person events and by people working remotely. It explains the process that a URL/dataset goes through from the time it is identified, either by a Seeder as "uncrawlable" or by other means, until it is made available as a record in the datarefuge.org CKAN data catalog. The process involves several stages and is designed for smooth hand-offs, so that each phase is handled by someone with expertise in that area, while the data is tracked throughout for security.

Note: This workflow is no longer supported as of May 21, 2017.

Are you looking for the actual documentation?

We have moved the documentation to a more user-friendly format. You can now find the guide at datarefuge.github.io/workflow.

Note that the guide is still in progress; screenshots and other additions will follow shortly.

Contributing to this guide

Suggestions and improvements are welcome! All changes to the guide are managed through this GitHub repository. Please check our contribution guidelines for details.

Partners

DataRescue is a broad, grassroots effort with support from numerous local and nationwide networks. DataRefuge and EDGI partner with local organizers in supporting these events. See more of our institutional partners on the DataRefuge home page.
