alacer / renaissance

Detailed Visualization of Large Complex Data in R

End-to-end story #10

Open edsar opened 9 years ago

edsar commented 9 years ago

Feedback is that to sell the service, we need to show a simple yet complete story illustrating:

1) Here's the messy data
2) Here's how we cloak it (simple approach that can be extended with R) - perhaps outside of Renaissance
3) Here's how we clean it (simple approach that can be extended with R) - perhaps outside of Renaissance
4) Here's how we analyze it (default is Renaissance)
5) Here's how to pull together and export a summary report
6) Voilà! Messy data to clean report!

edsar commented 9 years ago

Will do a PowerPoint and a screencast.

ahoffer commented 9 years ago

What messy data set would you like to use?

World Bank?

Or something from the Data Quality as a Service presentation? Like the "Health Care" example, "Banking & Finance" example, or the "Online/e-commerce" example?

Or maybe the political donors because we have already taken that one from end-to-end in real life?

Once we have a data set, we build a story around it.

edsar commented 9 years ago

World Bank (which includes other related data sources in the same domain)


ahoffer commented 9 years ago

Good news! I don't see anything in this particular E2E scenario that needs Loopback. The World Bank data is already stored as a CSV file.

edsar commented 9 years ago

What about the idea of loading the CSV (or another data source) prior to cleaning? Don't we want an architecture that allows us to apply an R cleaning script to a data source?

Ed


ahoffer commented 9 years ago

Maybe what we need is multiple use cases. I'll create some use cases for accessing data.

edsar commented 9 years ago

Let's start with Scenario 1 for integrating Analyze and Cloaking/Decloaking:

1) [Cloak] Customer provides sensitive data (customer dataset)
2) [Cloak] System cloaks data and provides public/private files to user (customer dataset)
3) [Prepare] Show capabilities with "tidying" data (customer dataset)
4) [Analyze] Analyze cloaked individual data using uncloaked dimension (cloaked customers across uncloaked cities/states)
5) [Uncloak] Return an uncloaked, augmented CSV (or XLSX) to the user
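
To make the five steps of Scenario 1 concrete, here is a rough sketch of the flow. It is written in Python purely for illustration (the thread anticipates R for the real scripts), and it is not Renaissance's actual cloaking code: the thread does not say how cloaking works, so the sketch assumes simple pseudonymization with random tokens, where the "public" file is the cloaked dataset and the "private" file is the token-to-name mapping. The column names, the sample rows, and every function name (`cloak`, `tidy`, `customers_per_state`, `uncloak`, `to_csv`) are hypothetical.

```python
import csv
import io
import secrets

# Hypothetical customer dataset; "name" is sensitive, "state" is the
# uncloaked dimension used for analysis.
RAW = [
    {"name": "Alice Smith", "state": " wa ", "amount": "120"},
    {"name": "Bob Jones", "state": "WA", "amount": "80"},
    {"name": "Alice Smith", "state": "wa", "amount": "40"},
    {"name": "Carol Diaz", "state": "or ", "amount": "65"},
]


def cloak(rows, field):
    """Steps 1-2: swap each distinct sensitive value for a random token.

    Returns the cloaked ("public") rows and the token -> original value
    mapping (the "private" file that stays with the customer).
    """
    token_for = {}    # original value -> token
    private_key = {}  # token -> original value
    public_rows = []
    for row in rows:
        value = row[field]
        if value not in token_for:
            token = "cust_" + secrets.token_hex(4)
            while token in private_key:  # regenerate on the rare collision
                token = "cust_" + secrets.token_hex(4)
            token_for[value] = token
            private_key[token] = value
        public_rows.append({**row, field: token_for[value]})
    return public_rows, private_key


def tidy(rows):
    """Step 3: trim whitespace and normalize state codes to upper case."""
    cleaned = []
    for row in rows:
        row = {key: value.strip() for key, value in row.items()}
        row["state"] = row["state"].upper()
        cleaned.append(row)
    return cleaned


def customers_per_state(rows, id_field):
    """Step 4: count distinct cloaked customers per uncloaked state."""
    seen = {}
    for row in rows:
        seen.setdefault(row["state"], set()).add(row[id_field])
    return {state: len(ids) for state, ids in seen.items()}


def uncloak(rows, private_key, id_field, counts):
    """Step 5: restore original values and attach the analysis result."""
    return [
        {**row,
         id_field: private_key[row[id_field]],
         "customers_in_state": counts[row["state"]]}
        for row in rows
    ]


def to_csv(rows):
    """Serialize rows to CSV text, using the first row's keys as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


public_rows, private_key = cloak(RAW, "name")
tidy_rows = tidy(public_rows)
counts = customers_per_state(tidy_rows, "name")
report = to_csv(uncloak(tidy_rows, private_key, "name", counts))
```

The analysis step only ever sees tokens, yet repeat customers still group correctly because the same name always maps to the same token. Random tokens are one possible choice here; a keyed hash or format-preserving scheme would fit the same shape.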