uchicago-computation-workshop / charlie_catlett

Repository for Charlie Catlett's presentation at the CSS Workshop (11/8/2018)

Research pipelines and data users #13

Open LeosonH opened 6 years ago

LeosonH commented 6 years ago

Thank you for presenting!

Many of my coursemates have raised questions about scalability and flexibility. I am interested in this too, but specifically from the side of the projects and research partners that will make use of the AoT data.

The AoT architecture is designed to be extensible, and over time it will likely incorporate more and more nodes and sensors of varying types and complexity, creating a comprehensive and granular source of data for urban research. How would you go about implementing project pipelines that make it easier for users to sift through this massive collection and isolate the subsets or specific attributes of the data pertinent to their questions? Will a comprehensive ontology/metadata framework also be included?

A relevant example might be triage, a general toolkit for predictive analytics and risk modeling projects being developed by Data Science for Social Good fellows. The toolkit aims to provide an interface to each phase of a project, driven by configuration specific to the project's needs: implementing database schemas, building test matrices, generating features, etc.
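
To make the question a bit more concrete, here is a minimal sketch of what a configuration-driven subset of sensor readings might look like. Everything in it is an assumption for illustration: the `Reading` fields, the `SubsetConfig` options, the `select` function, and the sample values are all hypothetical, and none of it reflects the actual AoT data format or the triage API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class Reading:
    """One sensor observation (hypothetical schema, not the real AoT format)."""
    node_id: str
    parameter: str
    timestamp: datetime
    value: float


@dataclass(frozen=True)
class SubsetConfig:
    """Project-specific selection criteria; None means 'do not filter on this'."""
    node_ids: Optional[frozenset] = None
    parameters: Optional[frozenset] = None
    start: Optional[datetime] = None  # inclusive lower bound
    end: Optional[datetime] = None    # exclusive upper bound


def select(readings: Iterable[Reading], cfg: SubsetConfig) -> Iterator[Reading]:
    """Yield only the readings that satisfy every criterion set in cfg."""
    for r in readings:
        if cfg.node_ids is not None and r.node_id not in cfg.node_ids:
            continue
        if cfg.parameters is not None and r.parameter not in cfg.parameters:
            continue
        if cfg.start is not None and r.timestamp < cfg.start:
            continue
        if cfg.end is not None and r.timestamp >= cfg.end:
            continue
        yield r


# Made-up sample readings, purely for illustration.
readings = [
    Reading("node-a", "temperature", datetime(2018, 11, 1, 12), 9.5),
    Reading("node-a", "humidity", datetime(2018, 11, 1, 12), 61.0),
    Reading("node-b", "temperature", datetime(2018, 11, 2, 12), 8.0),
    Reading("node-b", "temperature", datetime(2018, 12, 1, 12), 1.5),
]

# A project that only needs November temperature readings from any node.
cfg = SubsetConfig(
    parameters=frozenset({"temperature"}),
    start=datetime(2018, 11, 1),
    end=datetime(2018, 12, 1),
)

subset = list(select(readings, cfg))
for r in subset:
    print(r.node_id, r.parameter, r.timestamp.isoformat(), r.value)
```

The point of the sketch is that each project would only write its own `SubsetConfig`, while the selection logic stays shared. A real pipeline would presumably push these filters down into a database or query API rather than iterating in memory.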