spe-uob / 2020-HealthcareLake

A reasonably secure data lake for healthcare analytics
MIT License

4.4 ETL #73

Closed by joekendal 3 years ago

joekendal commented 3 years ago

The EMR cluster runs the ETL jobs and curates the data marts: structured data is stored in Redshift and unstructured data in S3. For binary data such as an image, Redshift simply stores a reference to the object's S3 location rather than the data itself. Redshift can also serve as a data source for BI tools.
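
As a rough illustration of the reference pattern described above, here is a minimal sketch in plain Python. It is not the project's ETL code: `FakeLake`, `curate`, the bucket name, and the key layout are all hypothetical, and in-memory containers stand in for S3 and a Redshift table so the example runs without AWS access.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class FakeLake:
    """In-memory stand-ins: `objects` plays S3, `rows` plays a Redshift table."""
    bucket: str = "example-curated-bucket"  # hypothetical bucket name
    objects: dict = field(default_factory=dict)
    rows: list = field(default_factory=list)


def curate(lake, patient_id, attributes, binary=None, suffix="bin"):
    """Store any binary payload under an S3-style key; keep only its URI in the row."""
    row = {"patient_id": patient_id, **attributes, "s3_uri": None}
    if binary is not None:
        # Content hash gives a stable key for the same payload (assumed layout).
        digest = hashlib.sha256(binary).hexdigest()[:16]
        key = f"unstructured/{patient_id}/{digest}.{suffix}"
        lake.objects[key] = binary
        row["s3_uri"] = f"s3://{lake.bucket}/{key}"
    lake.rows.append(row)
    return row


lake = FakeLake()
row = curate(lake, "p-001", {"modality": "xray"}, binary=b"\x89PNG...", suffix="png")
plain = curate(lake, "p-002", {"modality": "none"})
print(row["s3_uri"])
```

The structured row stays small and queryable, while the image bytes live only in the object store; a BI tool reading the table sees the `s3_uri` column and never the binary payload.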

joekendal commented 3 years ago

https://d0.awsstatic.com/whitepapers/enterprise-data-warehousing-on-aws.pdf