This repository contains sample Databricks notebooks found within the Databricks Selected Notebooks Jump Start and other miscellaneous locations.
The notebooks were created using Databricks in Python, Scala, SQL, and R; the vast majority of them can be run on Databricks Community Edition (sign up for free access via the link).
On-Time Flight Performance with GraphFrames for Apache Spark: Provides a jump start into Graph using GraphFrames for Apache Spark on flight departure performance data.
ADAM Genomic Analysis using K-Means Clustering: Applying k-means clustering to predict population sample location based on genomic sequences using ADAM.
Streaming Meetup RSVPs: A series of notebooks showcasing streaming on Databricks, including the use of DataFrames and mapWithState.
adam: Genomic Sequencing using Apache Spark and ADAM
blog books: Notebooks to support the Databricks blog ebooks.
content: Various notebooks including
demo: Various notebooks including
dogfood: Various notebooks including
examples: Example notebooks in various stages of completion including Iris dataset k-means vs. bisecting k-means
flights: Various notebooks working with on-time flight performance
reporting: Example reporting notebooks including dashboard views