tweag / sparkle

Haskell on Apache Spark.
BSD 3-Clause "New" or "Revised" License

Add examples of building a delta lake and doing a GWAS analysis #163

Closed: zz1874 closed this 1 year ago

zz1874 commented 2 years ago

These are two example applications showing that sparkle works with Delta Lake and can be used for genomics data analysis. The first, hello-deltalake, simply includes the io.delta package and creates a data frame in Delta format. The second, deltalake-glow, loads some real genomics data, uses the io.projectglow package to run the GWAS analysis, and uses io.delta as the storage layer. The corresponding blog post draft is here.
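For readers unfamiliar with Delta Lake, here is a rough sketch of the Spark-side steps that a hello-deltalake style app performs: pull in the io.delta package, then write and read back a data frame in Delta format. It is written in PySpark rather than sparkle's Haskell API, so it is not the code from this PR. The helper names (`delta_session_options`, `write_hello_delta`), the `delta-core` artifact name (used by Delta releases before 3.0), the Scala version, and the Delta version you pass in are all assumptions to adapt to your Spark setup.

```python
def delta_session_options(delta_version, scala_version="2.12"):
    """Spark settings that fetch the io.delta package and enable Delta SQL support.

    Hypothetical helper. The artifact coordinates assume a pre-3.0 Delta
    release, where the package is published as delta-core.
    """
    return {
        "spark.jars.packages": f"io.delta:delta-core_{scala_version}:{delta_version}",
        "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension",
        "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog",
    }


def write_hello_delta(path, delta_version):
    """Write a tiny data frame to `path` in Delta format and read it back.

    Hypothetical helper. pyspark is imported lazily so that the options
    helper above can be used without a Spark installation.
    """
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName("hello-deltalake")
    for key, value in delta_session_options(delta_version).items():
        builder = builder.config(key, value)
    spark = builder.getOrCreate()

    # A five-row data frame with a single `id` column.
    spark.range(0, 5).write.format("delta").mode("overwrite").save(path)

    # Reading uses the same "delta" format name.
    return spark.read.format("delta").load(path)
```

Calling `write_hello_delta("/tmp/hello-delta", "<delta version matching your Spark>")` should return a data frame backed by the Delta table at that path, provided the chosen Delta release is compatible with the installed Spark version.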

facundominguez commented 2 years ago

I side-stepped the crashes in the buildkite job. If you rebase onto master, that job should become green.

Maybe you would like to update the CI configuration so the buildkite job builds and runs your example.