big-data-europe / docker-spark

Apache Spark docker image

entrypoint.sh not executed when starting the container #22

Closed Brandonage closed 6 years ago

Brandonage commented 6 years ago

I'm using this Docker image to launch a set of masters/slaves and run workloads with HDFS as the file system. I set the environment variables it needs, such as CORE_CONF_fs_defaultFS or CLUSTER_NAME, but they don't seem to have any effect on the Hadoop configuration files. I've realised that the entrypoint.sh script that consumes these environment variables is never executed. That script comes from the docker-hadoop image, which this docker-spark image is based on.
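To illustrate what I expected to happen: the base image's entrypoint is supposed to turn each CORE_CONF_* / HDFS_CONF_* variable into a property in the corresponding Hadoop XML file, roughly along these lines (a simplified sketch, not the actual script; the config path and the exact name-mangling rules are assumptions):

```bash
# Simplified sketch of the env-var -> Hadoop XML translation the docker-hadoop
# entrypoint is expected to perform (not the real script; the path and the
# naming convention are assumptions for illustration).
CORE_SITE=/etc/hadoop/core-site.xml   # assumed location of core-site.xml

for var in $(env | grep '^CORE_CONF_' | sed 's/=.*//'); do
  name=${var#CORE_CONF_}   # e.g. CORE_CONF_fs_defaultFS -> fs_defaultFS
  name=${name//_/.}        # underscores become dots:      fs.defaultFS
  value=${!var}
  # append a <property> block just before </configuration>
  sed -i "s|</configuration>|<property><name>${name}</name><value>${value}</value></property>\n</configuration>|" "$CORE_SITE"
done
```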

Now, I'm not sure whether starting a container is supposed to run all of the inherited entrypoints of the parent images, but either way the HDFS configuration step is not happening.
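One way to see which entrypoint the final image actually ends up with is to inspect it; only this one is executed when the container starts, whatever the parent images declared:

```bash
# Show the effective ENTRYPOINT and CMD of the built image.
docker inspect --format '{{json .Config.Entrypoint}}' bde2020/spark-master:2.1.0-hadoop2.8-hive-java8
docker inspect --format '{{json .Config.Cmd}}' bde2020/spark-master:2.1.0-hadoop2.8-hive-java8
```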

earthquakesan commented 6 years ago

Hi @Brandonage! Which version/branch are you using? Please provide a minimal docker-compose file that reproduces your issue.

Brandonage commented 6 years ago

It's basically like this. It's not a docker-compose file but a Marathon JSON definition; the image versions in question are under the image entries anyway.

{ "id": "/hdfssparkstandalone", "groups": [ { "id": "/hdfssparkstandalone/namenode", "apps":[ { "id": "/hdfssparkstandalone/namenode/namenode", "cpus": 1, "mem": 2048, "container": { "type": "DOCKER", "docker": { "image": "uhopper/hadoop-namenode:2.8.1", "forcePullImage": true }, "volumes": [ { "containerPath": "/hadoop/dfs/name", "hostPath": "/home/vagrant/name", "mode": "RW" } ], "portMappings": [ { "hostPort": 0, "containerPort": 8020}, { "hostPort": 0, "containerPort": 50070 } ] }, "networks": [ { "mode": "container", "name": "dcos" } ], "env" : { "MULTIHOMED_NETWORK" : "0", "CLUSTER_NAME" : "hdfs-cluster",

"HDFS_CONF_dfs_namenode_datanode_registration_iphostnamecheck" : "false", "HDFS_CONF_dfs_client_use_datanode_hostname" : "false", "HDFS_CONF_dfs_datanode_use_datanode_ip_hostname" : "false", "HDFS_CONF_dfs_datanode_use_datanode_hostname" : "false" }, "instances": 1 } ] }, { "id": "/hdfssparkstandalone/datanode", "dependencies": ["/hdfssparkstandalone/namenode"], "apps":[ { "id": "/hdfssparkstandalone/datanode/datanode", "cpus": 1, "mem": 2048, "container": { "type": "DOCKER", "docker": { "image": "uhopper/hadoop-datanode:2.8.1", "forcePullImage": true }, "portMappings": [ { "hostPort": 0, "containerPort": 50075} ], "volumes": [ { "containerPath": "/hadoop/dfs/data", "hostPath": "/home/vagrant/data", "mode": "RW" } ] }, "networks": [ { "mode": "container", "name": "dcos" } ], "instances": @ndatanodes@, "constraints": [["hostname", "GROUP_BY"]], "env" : { "MULTIHOMED_NETWORK" : "0", "CLUSTER_NAME" : "hdfs-cluster", "CORE_CONF_fs_defaultFS" : "hdfs://namenode-namenode-hdfssparkstandalone.marathon-user.containerip.dcos.thisdcos.directory:8020",

"HDFS_CONF_dfs_namenode_datanode_registration_iphostnamecheck" : "false", "HDFS_CONF_dfs_client_use_datanode_hostname" : "false", "HDFS_CONF_dfs_datanode_use_datanode_ip_hostname" : "false", "HDFS_CONF_dfs_datanode_use_datanode_hostname" : "false" } } ] }, { "id": "/hdfssparkstandalone/sparkmaster", "dependencies": ["/hdfssparkstandalone/namenode"], "apps":[ { "id": "/hdfssparkstandalone/sparkmaster/sparkmaster", "cpus": 1, "mem": 4096, "container": { "type": "DOCKER", "docker": { "image": "bde2020/spark-master:2.1.0-hadoop2.8-hive-java8", "forcePullImage": true }, "portMappings": [ { "hostPort": 0, "containerPort": 8080}, { "hostPort": 0, "containerPort": 7077 } ] }, "networks": [ { "mode": "container", "name": "dcos" } ], "instances": 1, "env" : { "MULTIHOMED_NETWORK" : "0", "CLUSTER_NAME" : "hdfs-cluster", "CORE_CONF_fs_defaultFS" : "hdfs://namenode-namenode-hdfssparkstandalone.marathon-user.containerip.dcos.thisdcos.directory:8020",

"HDFS_CONF_dfs_namenode_datanode_registration_iphostnamecheck" : "false", "HDFS_CONF_dfs_client_use_datanode_hostname" : "false", "HDFS_CONF_dfs_datanode_use_datanode_ip_hostname" : "false", "HDFS_CONF_dfs_datanode_use_datanode_hostname" : "false", "YARN_CONF_yarn_logaggregationenable" : "true" } } ] }, { "id": "/hdfssparkstandalone/sparkslave", "dependencies": ["/hdfssparkstandalone/sparkmaster"], "apps":[ { "id": "/hdfssparkstandalone/sparkslave/sparkslave", "cpus": 1, "mem": 6144, "container": { "type": "DOCKER", "docker": { "image": "bde2020/spark-worker:2.1.0-hadoop2.8-hive-java8", "forcePullImage": true }, "portMappings": [ { "hostPort": 0, "containerPort": 8081 } ] }, "networks": [ { "mode": "container", "name": "dcos" } ], "instances": @nslaves@, "constraints": [["hostname", "GROUP_BY"]], "env" : { "MULTIHOMED_NETWORK" : "0", "CLUSTER_NAME" : "hdfs-cluster", "CORE_CONF_fs_defaultFS" : "hdfs://namenode-namenode-hdfssparkstandalone.marathon-user.containerip.dcos.thisdcos.directory:8020",

"HDFS_CONF_dfs_namenode_datanode_registration_iphostnamecheck" : "false", "HDFS_CONF_dfs_client_use_datanode_hostname" : "false", "HDFS_CONF_dfs_datanode_use_datanode_ip_hostname" : "false", "HDFS_CONF_dfs_datanode_use_datanode_hostname" : "false", "SPARK_MASTER" : "spark://sparkmaster-sparkmaster-hdfssparkstandalone.marathon-user.containerip.dcos.thisdcos.directory:7077", "YARN_CONF_yarn_nodemanager_resource_memory_mb" : "4096", "YARN_CONF_yarn_nodemanager_resourcecpuvcores" : "1" } } ] } ] }


earthquakesan commented 6 years ago

Fixing the problem now.

earthquakesan commented 6 years ago

@Brandonage I fixed it on the branch you are using. You can remove the local images and pull again; it should work for you now.
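For anyone landing here later: the general pattern for this kind of fix is to have the Spark image's entrypoint run the Hadoop base image's entrypoint before starting the Spark process. A sketch of that pattern (not the actual commit; /entrypoint.sh and /master.sh are assumed names):

```bash
#!/bin/bash
# Chained entrypoint for the Spark image (pattern only, names are assumptions).

# 1. Run the Hadoop base image's entrypoint so the CORE_CONF_*/HDFS_CONF_*
#    variables are written into the Hadoop XML config files; pass a harmless
#    command because that script execs its arguments when it is done.
bash /entrypoint.sh true

# 2. Hand control over to the Spark start script (master or worker).
exec /master.sh
```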