GoogleCloudDataproc / initialization-actions

Run in all nodes of your cluster before the cluster starts - lets you customize your cluster
https://cloud.google.com/dataproc/init-actions
Apache License 2.0

YARN Applications using docker container #339

Closed lomluca-zz closed 6 years ago

lomluca-zz commented 6 years ago

Hi,

I would like to know whether it is possible to configure YARN to run containers inside Docker images (for example, by preparing an environment with this type of configuration: https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/DockerContainers.html).

Thanks!

karth295 commented 6 years ago

Yup, you can set configuration values when creating a cluster with --properties (docs).

You can also check out Google Container Registry as a convenient place to store Docker images so that they are accessible to the nodes in your cluster.
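For illustration, here is a rough sketch of what passing YARN settings through --properties could look like. The `yarn:` prefix targets yarn-site.xml; the cluster name, the helper function `dataproc_create_cmd`, and the two property values chosen are assumptions for the example, not a verified Docker-on-YARN setup. The function only prints the command so it can be reviewed before running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) a cluster-creation command that
# passes yarn-site.xml settings via --properties.
# The property values below are illustrative assumptions.
dataproc_create_cmd() {
  local cluster="$1"
  # Each entry has the form <file-prefix>:<property>=<value>; entries are comma-separated.
  local props="yarn:yarn.nodemanager.container-executor.class=org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor"
  props+=",yarn:yarn.nodemanager.runtime.linux.allowed-runtimes=docker"
  printf 'gcloud dataproc clusters create %s --properties %s\n' "${cluster}" "${props}"
}

# Example: show the command for a cluster named "my-cluster" (assumed name).
dataproc_create_cmd my-cluster
```

Note that property values containing commas would clash with the comma used to separate entries, which is why this sketch sticks to single-valued settings.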

lomluca-zz commented 6 years ago

Maybe my question was not clear: I actually tried with --properties, but that is not enough, because I need to configure container-executor.cfg, a file that is not in the set exposed through --properties. Doing it with an init script doesn't work either, because the script is executed at the end and the NodeManagers fail during startup.
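For context, this is roughly the kind of init-script step I mean, as a minimal sketch. The helper name `write_container_executor_cfg`, the destination path, the group name, and the values are all assumptions for illustration; the keys follow the `[docker]` section described in the Hadoop documentation linked above. It shows only the file being written and does not address the ordering problem with NodeManager startup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a minimal container-executor.cfg with the
# Docker module enabled. All values are illustrative assumptions.
write_container_executor_cfg() {
  local dest="$1"
  cat > "${dest}" <<'EOF'
yarn.nodemanager.linux-container-executor.group=hadoop
min.user.id=1000
[docker]
  module.enabled=true
  docker.binary=/usr/bin/docker
  docker.allowed.networks=bridge,host,none
EOF
}

# On a cluster node this might be called as follows (path is an assumption):
# write_container_executor_cfg /etc/hadoop/conf/container-executor.cfg
```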