SciSpark / Conference

A collection of resources for running trainings on SciSpark
Apache License 2.0

Draft curriculum for 201 #2

Open wmburke opened 8 years ago

wmburke commented 8 years ago

SciSpark 202: Algorithms for MCC Search and PDF Clustering using SciSpark

Abstract/Agenda:

We introduce a three-part course module on SciSpark, our AIST14-funded project for Highly Interactive and Scalable Climate Model Metrics and Analytics. The module consists of 101, 202, and 303 classes for learning how to use Spark for science.

SciSpark 202 is a 1.5-hour session teaching two algorithms representative of the motivation for SciSpark: iterative data-reuse algorithms that share information between multiple stages. We will build on SciSpark 101 and Scala for science programming as the entry course. The first algorithm is an iterative graph-based algorithm for identifying Mesoscale Convective Complexes (MCCs) in satellite infrared data:

We will demonstrate its implementation in SciSpark and discuss future directions.

The second algorithm is a K-means clustering algorithm for identifying Probability Density Functions (PDFs) for climate extremes in the North American Regional Climate Change Assessment Program (NARCCAP) data:

wmburke commented 8 years ago

@sujen1412 You are also implicated in this task.

The goal is to provide the following info in curriculum202.md

Curriculum outline:

wmburke commented 8 years ago

https://docs.google.com/document/d/1-KX99wH1E6AeFQJTaTwowCMsJknDuJtpPwXNsQswl3Y/edit

valeriearoth commented 8 years ago

Made a new doc so I could refer back to the old doc while working on it: https://docs.google.com/a/utexas.edu/document/d/1z8unGUJmSTV9_518qi5GRGv1e8rnpNjkEhzn-2UsalI/edit?usp=sharing