Open wmburke opened 8 years ago
@sujen1412 You are also implicated in this task.
The goal is to provide the following info in curriculum202.md
Curriculum outline:
Made a new doc so I could refer back to the old doc while working on it: https://docs.google.com/a/utexas.edu/document/d/1z8unGUJmSTV9_518qi5GRGv1e8rnpNjkEhzn-2UsalI/edit?usp=sharing
SciSpark 202: Algorithms for MCC Search and PDF Clustering using SciSpark
Abstract/Agenda:
We introduce a three-part course module on SciSpark, our AIST14-funded project for Highly Interactive and Scalable Climate Model Metrics and Analytics. The module consists of 101, 202, and 303 classes for learning how to use Spark for science.
SciSpark 202 is a 1.5-hour session teaching two algorithms representative of the motivation for SciSpark: iterative data-reuse algorithms that share information between multiple stages. We will build on SciSpark 101 and Scala for science programming as the entry course. The first algorithm will be an iterative graph-based algorithm for identifying Mesoscale Convective Complexes (MCCs) in satellite infrared data:
We will demonstrate its implementation in SciSpark and discuss future directions.
The second algorithm is a K-means clustering algorithm for identification of Probability Density Functions (PDFs) for Climate Extremes in the North American Regional Climate Change Assessment Program (NARCCAP) data: