Closed sunileman closed 6 years ago
I found that Nomad does not support 0 for the memory limit (i.e. Docker without the memory flag). Running the image with Docker without a memory flag works fine, but with Nomad this does not seem to be possible: Nomad imposes a hard memory limit.
I say this because if I run the image like this: docker run -it sunileman/nifi1.1.0 all is good and NiFi comes up.
If I run the Docker image like this: docker run -it --rm --memory="15m" --memory-swappiness=-1 --cpu-shares="20" -p 8080:8080 sunileman/nifi1.1.0
the instance fails to start. How can I run a Docker image on Nomad without a memory limit? I assume someone may respond that this is against the purpose of Nomad as a resource negotiator. I understand that perspective as well.
Hi @sunileman, I assume the "Nomad 1.0.0" is a typo, as the current version is only at 0.7.0.
Also, Nomad doesn't allow a "no limit" configuration. There are many open issues discussing whether to allow "over-provisioning" of resources.
HTH, Shantanu
@shantanugadgil my bad, I meant 0.7.0. I was looking at the Consul version. For some reason many containers run well with "over-provisioning". Setting a limit seems to send them into a spin. This might have something to do with JVM memory allocation.
Hi, thanks for reporting the issue.
Have you taken a look at the default restart stanza per job type for Nomad? See here: https://www.nomadproject.io/docs/job-specification/restart.html#restart-parameter-defaults. I believe the issue you are experiencing is that the job exits successfully, but Nomad continues to restart it?
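For readers unfamiliar with the stanza mentioned above, here is a minimal sketch of a group-level `restart` block. The group name `nifi` and all four values are illustrative assumptions, not the settings used in this issue; the linked page lists the actual defaults per job type.

```hcl
# Hypothetical group; only the restart stanza is the point of this sketch.
group "nifi" {
  restart {
    attempts = 2       # restarts allowed within `interval` (illustrative value)
    interval = "1m"    # window over which attempts are counted
    delay    = "15s"   # wait before restarting the task
    mode     = "delay" # once attempts are exhausted, wait for the next interval
                       # rather than marking the task as failed
  }
}
```

The client log in this issue shows the task being restarted after it "completed successfully", which is the behaviour this stanza governs.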
The issue was related to over-provisioning, which Nomad does not support. Thank you @shantanugadgil.
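Since Nomad always applies a memory limit to Docker tasks, the practical workaround is to reserve enough memory for the JVM in the task's `resources` stanza. The following is a sketch only: the job name `test1`, task name `try1`, and image come from this issue, while the datacenter `dc1`, the group name `nifi`, and the `cpu` and `memory` values are assumptions to be sized for the actual workload.

```hcl
job "test1" {
  datacenters = ["dc1"] # assumed datacenter name

  group "nifi" { # hypothetical group name
    task "try1" {
      driver = "docker"

      config {
        image = "sunileman/nifi1.1.0"
      }

      resources {
        cpu    = 500  # MHz (illustrative value)
        memory = 2048 # MB (illustrative value); the docker driver enforces
                      # this as a hard limit on the container
      }
    }
  }
}
```

For comparison, the client log in this issue reports `using 12582912 bytes memory for try1`, which is 12 MB, so the container was started with even less memory than the 15 MB `docker run` test that failed.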
Nomad version
1.0.0
Operating system and Environment details
Amazon Linux
Issue
I am able to run my Docker container on the same host with no issue using: docker run sunileman/nifi1.1.0
When launched with Nomad (Consul client agent co-located), the task keeps restarting with no clear information about the cause.
Reproduction steps
On an AWS p2.xlarge instance running the Nomad and Consul client agents, run the job from the client node with the supplied job config.
Nomad Server logs (if appropriate)
Nomad Client logs (if appropriate)
2017/11/26 23:17:00.778310 [DEBUG] client: driver event for alloc "68b73ef5-069f-6130-22df-b44c6654a57f": Downloading image sunileman/nifi1.1.0:latest
2017/11/26 23:17:00.949954 [DEBUG] driver.docker: docker pull sunileman/nifi1.1.0:latest succeeded
2017-11-26T23:17:00.950Z [DEBUG] plugin: starting plugin: path=/opt/nomad/bin/nomad args="[/opt/nomad/bin/nomad executor {"LogFile":"/opt/nomad/alloc/68b73ef5-069f-6130-22df-b44c6654a57f/try1/executor.out","LogLevel":"DEBUG"}]"
2017-11-26T23:17:00.951Z [DEBUG] plugin: waiting for RPC address: path=/opt/nomad/bin/nomad
2017-11-26T23:17:00.963Z [DEBUG] plugin.nomad: plugin address: timestamp=2017-11-26T23:17:00.963Z address=/tmp/plugin661915386 network=unix
2017/11/26 23:17:00.966408 [DEBUG] driver.docker: Setting default logging options to syslog and unix:///tmp/plugin854779409
2017/11/26 23:17:00.966429 [DEBUG] driver.docker: Using config for logging: {Type:syslog ConfigRaw:[] Config:map[syslog-address:unix:///tmp/plugin854779409]}
2017/11/26 23:17:00.966438 [DEBUG] driver.docker: using 12582912 bytes memory for try1
2017/11/26 23:17:00.966446 [DEBUG] driver.docker: using 20 cpu shares for try1
2017/11/26 23:17:00.966466 [DEBUG] driver.docker: binding directories []string{"/opt/nomad/alloc/68b73ef5-069f-6130-22df-b44c6654a57f/alloc:/alloc", "/opt/nomad/alloc/68b73ef5-069f-6130-22df-b44c6654a57f/try1/local:/local", "/opt/nomad/alloc/68b73ef5-069f-6130-22df-b44c6654a57f/try1/secrets:/secrets"} for try1
2017/11/26 23:17:00.966476 [DEBUG] driver.docker: networking mode not specified; defaulting to bridge
2017/11/26 23:17:00.966491 [DEBUG] driver.docker: allocated port 172.30.2.229:24039 -> 8080 (mapped)
2017/11/26 23:17:00.966501 [DEBUG] driver.docker: exposed port 8080
2017/11/26 23:17:00.966523 [DEBUG] driver.docker: setting container name to: try1-68b73ef5-069f-6130-22df-b44c6654a57f
2017/11/26 23:17:00.992357 [DEBUG] client: updated allocations at index 449 (total 2) (pulled 0) (filtered 2)
2017/11/26 23:17:00.992456 [DEBUG] client: allocs: (added 0) (removed 0) (updated 0) (ignore 2)
2017/11/26 23:17:01.878949 [INFO] driver.docker: created container 976cc87adcac880b31a93e8c676ccf36f17edb62ae9296028a90da711ea3f748
2017/11/26 23:17:02.738910 [INFO] driver.docker: started container 976cc87adcac880b31a93e8c676ccf36f17edb62ae9296028a90da711ea3f748
2017/11/26 23:17:02.750654 [WARN] client: error fetching stats of task try1: stats collection hasn't started yet
2017/11/26 23:17:02.762916 [DEBUG] consul.sync: registered 1 services, 1 checks; deregistered 0 services, 0 checks
2017/11/26 23:17:03.680706 [DEBUG] client: updated allocations at index 450 (total 2) (pulled 0) (filtered 2)
2017/11/26 23:17:03.680808 [DEBUG] client: allocs: (added 0) (removed 0) (updated 0) (ignore 2)
2017/11/26 23:17:08.011752 [DEBUG] driver.docker: error collecting stats from container 976cc87adcac880b31a93e8c676ccf36f17edb62ae9296028a90da711ea3f748: io: read/write on closed pipe
2017-11-26T23:17:08.012Z [DEBUG] plugin: plugin process exited: path=/opt/nomad/bin/nomad
2017/11/26 23:17:08.026709 [INFO] client: task "try1" for alloc "68b73ef5-069f-6130-22df-b44c6654a57f" completed successfully
2017/11/26 23:17:08.026730 [INFO] client: Restarting task "try1" for alloc "68b73ef5-069f-6130-22df-b44c6654a57f" in 17.714321425s
2017/11/26 23:17:08.047107 [DEBUG] consul.sync: registered 0 services, 0 checks; deregistered 1 services, 1 checks
2017/11/26 23:17:08.192378 [DEBUG] client: updated allocations at index 451 (total 2) (pulled 0) (filtered 2)
2017/11/26 23:17:08.192459 [DEBUG] client: allocs: (added 0) (removed 0) (updated 0) (ignore 2)
2017/11/26 23:17:10.469287 [DEBUG] http: Request /v1/agent/health?type=client (144.219µs)
2017/11/26 23:17:20.470497 [DEBUG] http: Request /v1/agent/health?type=client (238.65µs)
2017/11/26 23:17:25.741634 [DEBUG] client: driver event for alloc "68b73ef5-069f-6130-22df-b44c6654a57f": Downloading image sunileman/nifi1.1.0:latest
2017/11/26 23:17:25.825717 [DEBUG] driver.docker: docker pull sunileman/nifi1.1.0:latest succeeded
2017-11-26T23:17:25.826Z [DEBUG] plugin: starting plugin: path=/opt/nomad/bin/nomad args="[/opt/nomad/bin/nomad executor {"LogFile":"/opt/nomad/alloc/68b73ef5-069f-6130-22df-b44c6654a57f/try1/executor.out","LogLevel":"DEBUG"}]"
2017-11-26T23:17:25.826Z [DEBUG] plugin: waiting for RPC address: path=/opt/nomad/bin/nomad
2017-11-26T23:17:25.839Z [DEBUG] plugin.nomad: plugin address: timestamp=2017-11-26T23:17:25.839Z address=/tmp/plugin283545805 network=unix
2017/11/26 23:17:25.841912 [DEBUG] driver.docker: Setting default logging options to syslog and unix:///tmp/plugin476816840
2017/11/26 23:17:25.841949 [DEBUG] driver.docker: Using config for logging: {Type:syslog ConfigRaw:[] Config:map[syslog-address:unix:///tmp/plugin476816840]}
2017/11/26 23:17:25.841962 [DEBUG] driver.docker: using 12582912 bytes memory for try1
2017/11/26 23:17:25.841967 [DEBUG] driver.docker: using 20 cpu shares for try1
2017/11/26 23:17:25.841985 [DEBUG] driver.docker: binding directories []string{"/opt/nomad/alloc/68b73ef5-069f-6130-22df-b44c6654a57f/alloc:/alloc", "/opt/nomad/alloc/68b73ef5-069f-6130-22df-b44c6654a57f/try1/local:/local", "/opt/nomad/alloc/68b73ef5-069f-6130-22df-b44c6654a57f/try1/secrets:/secrets"} for try1
2017/11/26 23:17:25.841995 [DEBUG] driver.docker: networking mode not specified; defaulting to bridge
2017/11/26 23:17:25.842012 [DEBUG] driver.docker: allocated port 172.30.2.229:24039 -> 8080 (mapped)
2017/11/26 23:17:25.842021 [DEBUG] driver.docker: exposed port 8080
2017/11/26 23:17:25.842045 [DEBUG] driver.docker: setting container name to: try1-68b73ef5-069f-6130-22df-b44c6654a57f
2017/11/26 23:17:25.899194 [INFO] driver.docker: created container f7314be8f2560b756fcd29baaa27a54669e75ee72cc9d23da659900ed99ccd0d
2017/11/26 23:17:25.996369 [DEBUG] client: updated allocations at index 453 (total 2) (pulled 0) (filtered 2)
2017/11/26 23:17:25.996463 [DEBUG] client: allocs: (added 0) (removed 0) (updated 0) (ignore 2)
2017/11/26 23:17:26.153726 [INFO] driver.docker: started container f7314be8f2560b756fcd29baaa27a54669e75ee72cc9d23da659900ed99ccd0d
2017/11/26 23:17:26.165413 [WARN] client: error fetching stats of task try1: stats collection hasn't started yet
2017/11/26 23:17:26.171058 [DEBUG] consul.sync: registered 1 services, 1 checks; deregistered 0 services, 0 checks
2017/11/26 23:17:26.395392 [DEBUG] client: updated allocations at index 454 (total 2) (pulled 0) (filtered 2)
2017/11/26 23:17:26.395476 [DEBUG] client: allocs: (added 0) (removed 0) (updated 0) (ignore 2)
2017/11/26 23:17:29.663142 [DEBUG] driver.docker: error collecting stats from container f7314be8f2560b756fcd29baaa27a54669e75ee72cc9d23da659900ed99ccd0d: io: read/write on closed pipe
2017-11-26T23:17:29.664Z [DEBUG] plugin: plugin process exited: path=/opt/nomad/bin/nomad
2017/11/26 23:17:29.676336 [INFO] client: task "try1" for alloc "68b73ef5-069f-6130-22df-b44c6654a57f" completed successfully
2017/11/26 23:17:29.676357 [INFO] client: Restarting task "try1" for alloc "68b73ef5-069f-6130-22df-b44c6654a57f" in 15.483850185s
2017/11/26 23:17:29.698980 [DEBUG] consul.sync: registered 0 services, 0 checks; deregistered 1 services, 1 checks
2017/11/26 23:17:29.792402 [DEBUG] client: updated allocations at index 455 (total 2) (pulled 0) (filtered 2)
Job file (if appropriate)
`# There can only be a single job definition per file.
# Create a job with ID and Name 'example'
job "test1" {
  # Run the job in the global region, which is the default.
}`