emgrasmeder / emc-aws-data-project

Using Python, Boto, Pandas, and maybe Spark to analyze a bit of data on AWS

PySpark / Spark #4

Open emgrasmeder opened 9 years ago

emgrasmeder commented 9 years ago

In addition to pandas, this project might want to use Spark, particularly PySpark (or Sparkling Pandas). There are some datasets in the datasets directory. Add more or use those while you play around with Spark, and push anything you find.
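
One possible starting point for poking at the datasets directory with PySpark is sketched below. It rests on assumptions not confirmed here: the files are CSVs with a header row, PySpark 2.0 or later is installed (for `SparkSession`), and Spark runs in local mode. The function names `list_csv_files` and `describe_with_spark` are made up for illustration.

```python
from pathlib import Path


def list_csv_files(directory):
    """Return the CSV files directly inside `directory`, sorted by name."""
    return sorted(Path(directory).glob("*.csv"))


def describe_with_spark(csv_path):
    """Load one CSV into a Spark DataFrame and print its schema and summary stats."""
    # Imported lazily so the file listing above works without PySpark installed.
    from pyspark.sql import SparkSession

    # Assumption: a local Spark session is enough for exploring small files.
    spark = (
        SparkSession.builder
        .master("local[*]")
        .appName("explore-datasets")
        .getOrCreate()
    )
    try:
        # Assumption: each file has a header row; column types are inferred.
        df = spark.read.csv(str(csv_path), header=True, inferSchema=True)
        df.printSchema()
        df.describe().show()
    finally:
        spark.stop()


if __name__ == "__main__":
    # "datasets" is the directory mentioned above; the CSV format is an assumption.
    for path in list_csv_files("datasets"):
        describe_with_spark(path)
```

Running the script from the repository root would print a schema and basic summary statistics for each CSV it finds. If the files turn out to be in another format, `spark.read.json` or `spark.read.parquet` could be swapped in.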