GoogleCloudPlatform / datalake-modernization-workshops

Apache License 2.0

add Spark RAPIDS Dataproc lab #3

Closed mengdong closed 1 year ago

mengdong commented 1 year ago

add Spark RAPIDS Dataproc lab

velascoluis commented 1 year ago

Hi @mengdong, thank you so much for this PR. Would it be possible to bootstrap the infra deployment with Terraform?

Some examples here:

https://github.com/GoogleCloudPlatform/datalake-modernization-workshops/tree/main/s8s-spark-mlops/00-env-setup/terraform

https://github.com/GoogleCloudPlatform/datalake-modernization-workshops/tree/main/hive-to-bq-biglake/src/terraform

https://github.com/GoogleCloudPlatform/datalake-modernization-workshops/blob/main/admin-usecase/02-execution-instructions/terraform-execution.md

And for Dataproc on GCE, specifically: https://github.com/anagha-google/dataproc-labs (lab #2)

Thanks again

mengdong commented 1 year ago

Thanks @velascoluis for your review! The infra is only cluster creation; do we still need to use Terraform? I can do it if it is a must.

mengdong commented 1 year ago

@velascoluis pinging again on the question above. Looking to have this merged ASAP, thanks!

velascoluis commented 1 year ago

Hi @mengdong, not a must, just a nice-to-have. I will merge and open an issue for TF automation; feel free to take it :) Thanks again for your contribution.