datatogether / roadmap

Coordinating technical work & roadmapping additional services

Goals of Data Rescue #2

Closed jonganc closed 7 years ago

jonganc commented 7 years ago

I feel like Datarescue suffers a bit because the goals are not clearly stated. Then, if I have an idea about working on something that other people aren't, it's not clear to me if that's simply because of insufficient people or because it's not important.

I have identified what I have inferred are important goals and subgoals. I'm not sure if deciding on goals also involves prioritizing them (or setting timelines?).

dcwalk commented 7 years ago

@jonganc -- It is a fair point. However, there are a couple of overlapping activities going on, which makes it difficult to untangle goals.

"DataRescue" refers to the events that are supported by EDGI and Data Refuge but organized locally. The goals of DataRescues are really established by local organizers. Data Refuge and EDGI obviously influence those based on our interests/goals and those are seen in the event paths.

EDGI's goals are listed here:

We are building online tools, helping events, and creating research networks to proactively preserve, archive and track public environmental data and ensure its continued availability. We are indexing millions of government web pages on a weekly basis, tracking changes to them, and producing regular reports

So, to try to address the ones you mentioned:

The questions about kinds of data, organizing it, and making it available touch on how involvement with partners (again, Data Refuge, but also Internet Archive, etc.) is structured. In addition, they are shaped by the interest of people who want to contribute to technical projects.

I hope that they in turn also reflect and complement the work of the various others involved in ongoing data preservation efforts.

jonganc commented 7 years ago

OK. I admit that I was confused about the organizations involved, but I think I'm getting more clarity now. In retrospect, I think what I was asking about was a spec. In other words, if God came down and said he'd magically build whatever we wanted, what would you tell him to build? Since the project is dynamic, it might be OK to be guided by a slightly open-ended statement like the one you refer to. However (and I think this ties into this issue), in that case I think we should still note the ongoing "tracks", e.g. there could be a website monitoring track with associated goals and a dataset scraping track with associated goals. That way, if someone were interested in working on the technical aspects of the EDGI archiving project, they would have an idea of the project's extent at a given moment and what it might need, and might even be able to pitch new ideas while knowing where they would fit in.

jonganc commented 7 years ago

I think this has overlap with what @b5 wrote in https://github.com/edgi-govdata-archiving/proposed-services/blob/master/workflow_changes.md

dcwalk commented 7 years ago

With the phasing out of DataRescue event support, I'm going to go ahead and close this issue. Please reopen if you feel it needs to be revisited.