aws-samples / emr-serverless-samples

Example code for running Spark and Hive jobs on EMR Serverless.
https://aws.amazon.com/emr/serverless/
MIT No Attribution
150 stars 74 forks source link

Add example of genomics analysis using Glow #1

Closed dacort closed 2 years ago

dacort commented 2 years ago

Glow is a popular toolkit for Genomics analysis, but requires packaging both a Python module and jar packages to be able to run PySpark code.

This PR demonstrates how to build the artifacts for EMR Serverless and run a simple Glow Spark job using 1000 Genomes data.