Open pwalsh opened 7 years ago
cc @jobarratt
@danfowler as a first step, please can you estimate the time needed for these tasks?
@danfowler has already achieved outputs 1 and 2 for the REFIT data. Do you plan to implement outputs 1 - 4 for all of the datasets we have access to (Enliten, REFIT, Apatsche)?
We've also recently gained access to another dataset from Loughborough, confusingly also under the REFIT umbrella, structured very differently from the REFIT data we already have. Do you want to run the pilot on that data as well?
@cblop yes, I plan on doing this for all the datasets we have access to. (Do you have any insight on how to model Apatsche (https://github.com/frictionlessdata/pilot-dm4t/issues/17)?)
I could optionally redo REFIT very simply as a datapackage-pipeline, but that is probably not necessary if we make it adhere to the same flow as the others.
Re: Loughborough, can you link to the dataset? I imagine that if it is straightforward enough to package we can do it, but our aim is to close this off rather soon.
@pwalsh
.to_aws
dumper (right, @akariv) to package the whole process. @jobarratt estimates in Trello
We will skip Apatsche (see https://github.com/frictionlessdata/pilot-dm4t/issues/17).
@jobarratt
I've updated the task list in the first issue description.
Whoever takes this on needs to do #21 and then come back here to complete the rest of the tasks.
Description
We want to close off a small, achievable, and meaningful pilot for the dm4t data. After internal discussion, we believe we can do so in a relatively short time.
Further, we think this will demonstrate some simple yet important and powerful steps for other pilots, and the community at large, in using Frictionless Data specifications and tooling to "progressively enhance" raw data - especially data like this which is the output of a much bigger research project.
There are four high-level outputs for this pilot:
Tasks
@vitorbaptista can grant access to S3 and the Elasticsearch service.