luminousmen / luminousmen.com

https://luminousmen.com/post/emr-serverless-a-400level-guide?utterances=b2c0071f4bf6b45e47541654ISwBNnZvaWzZ5OFMKp9VSgZcXNsXIEvZH%2BYoNtUhCJNQsAg2PyKVPqwdytPt8m23EI0RTUT7DsNgrPup8X5OMOFhl%2BLQ71Lb9f72OggItkMcS40oSKAR7WZ4pis%3D #53

Open utterances-bot opened 1 year ago

utterances-bot commented 1 year ago

EMR Serverless a 400-level guide - Blog | luminousmen

EMR Serverless offers a simpler way to deploy big data applications, sparing engineers the time spent managing cluster configurations.

mikenac commented 1 year ago

Good article, and I agree with your critiques. That said, you can now see jobs in the Spark History Server, and you can now create EMR Serverless applications with Terraform.

After doing the math, we found that Serverless would cost about 50% more for our workloads. So I agree that short-duration jobs are probably the only place where it can compete on price for now. If Amazon could get the difference down to about 15%, I would consider switching, given how much easier it is to manage.
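As a rough sketch of that comparison: the dollar figures below are hypothetical placeholders, chosen only to reproduce a 50% gap, and the function names are made up for illustration. Only the 50% and 15% figures come from the comment above.

```python
def premium_pct(serverless_cost: float, cluster_cost: float) -> float:
    """Percent by which the serverless bill exceeds the cluster bill."""
    if cluster_cost <= 0:
        raise ValueError("cluster_cost must be positive")
    return (serverless_cost - cluster_cost) * 100.0 / cluster_cost


def worth_switching(serverless_cost: float, cluster_cost: float,
                    max_premium_pct: float = 15.0) -> bool:
    """True if the premium is within the tolerated threshold (15% here)."""
    return premium_pct(serverless_cost, cluster_cost) <= max_premium_pct


# Hypothetical monthly bills, not real measurements.
cluster_monthly = 1000.0
serverless_monthly = 1500.0

print(premium_pct(serverless_monthly, cluster_monthly))      # 50.0
print(worth_switching(serverless_monthly, cluster_monthly))  # False
```

The threshold is a parameter because how much of a premium the management benefits are worth is a team-by-team call.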

For now, we are looking at using the "transient EMR cluster" features in Airflow to stand up and use transient cluster resources until, or unless, this price gap narrows.
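For anyone curious what a transient cluster request might look like, here is a minimal sketch, not our actual setup. It builds a dict in the shape of the EMR `RunJobFlow` request, which is also the shape Airflow's `EmrCreateJobFlowOperator` accepts as `job_flow_overrides`. The cluster name, script path, instance types and counts, and release label are all assumptions; the role names are the EMR defaults. The key setting is `KeepJobFlowAliveWhenNoSteps: False`, which makes the cluster terminate once its steps finish.

```python
def build_transient_job_flow(script_uri: str) -> dict:
    """Return a RunJobFlow-style config for a cluster that shuts down after its steps."""
    return {
        "Name": "transient-spark-job",   # hypothetical name
        "ReleaseLabel": "emr-6.15.0",    # assumed release
        "Applications": [{"Name": "Spark"}],
        "Instances": {
            "InstanceGroups": [
                {"Name": "Primary", "InstanceRole": "MASTER",
                 "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"Name": "Core", "InstanceRole": "CORE",
                 "InstanceType": "m5.xlarge", "InstanceCount": 2},
            ],
            # Terminate the cluster when no steps remain (the "transient" part).
            "KeepJobFlowAliveWhenNoSteps": False,
            "TerminationProtected": False,
        },
        "Steps": [
            {
                "Name": "run-spark-script",
                "ActionOnFailure": "TERMINATE_CLUSTER",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": ["spark-submit", "--deploy-mode", "cluster", script_uri],
                },
            }
        ],
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


# Hypothetical script location.
config = build_transient_job_flow("s3://my-bucket/jobs/etl.py")
print(config["Instances"]["KeepJobFlowAliveWhenNoSteps"])  # False
```

In a DAG, this dict would be passed to the create-cluster task, typically followed by a sensor that waits for the step to complete.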

luminousmen commented 1 year ago

@mikenac Cool, that makes sense! Thanks for your comment!