hpcng / nomad-driver-singularity

HashiCorp Nomad driver plugin - Singularity
Mozilla Public License 2.0

[Error] Could not get exit code for failed program #15

Open ArangoGutierrez opened 5 years ago

ArangoGutierrez commented 5 years ago

@bilke reported the following error on Slack:

Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.196+0200 [INFO ] client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=f9bd314e-57d3-56bb-bbb8-8eb9ff6f87e5 task="moo moo" @module=logmon path="/var/nomad/alloc/f9bd314e-57d3-56bb-bbb8-8eb9ff6f87e5/alloc/logs/.moo moo.stdout.fifo" timestamp=2019-04-24T15:40:15.195+0200
Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.196+0200 [INFO ] client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=f9bd314e-57d3-56bb-bbb8-8eb9ff6f87e5 task="moo moo" @module=logmon path="/var/nomad/alloc/f9bd314e-57d3-56bb-bbb8-8eb9ff6f87e5/alloc/logs/.moo moo.stderr.fifo" timestamp=2019-04-24T15:40:15.196+0200
Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.210+0200 [INFO ] client.driver_mgr.nomad-driver-singularity: starting singularity task: driver=singularity driver_cfg="{Image:library://sylabsed/examples/lolcow:latest Args:[] Command:run Debug:true Verbose:true Binds:[] Security:[] KeepPrivs:false DropCaps: Contain:false NoHome:false Home: Workdir: Pwd: App: Overlay:[]}" @module=singularity timestamp=2019-04-24T15:40:15.210+0200
Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.211+0200 [ERROR] client.driver_mgr.nomad-driver-singularity: Could not get exit code for failed program: : driver=singularity @module=singularity singularity="[-d -v run library://sylabsed/examples/lolcow:latest]" timestamp=2019-04-24T15:40:15.210+0200
Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.215+0200 [INFO ] client.alloc_runner.task_runner: failed to start task because plugin shutdown unexpectedly; attempting to recover: alloc_id=f9bd314e-57d3-56bb-bbb8-8eb9ff6f87e5 task="moo moo"
Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.215+0200 [WARN ] client.driver_mgr: received fingerprint error from driver: driver=singularity error="plugin is shut down"
Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.230+0200 [INFO ] client.driver_mgr.nomad-driver-singularity: starting singularity task: driver=singularity @module=singularity driver_cfg="{Image:library://sylabsed/examples/lolcow:latest Args:[] Command:run Debug:true Verbose:true Binds:[] Security:[] KeepPrivs:false DropCaps: Contain:false NoHome:false Home: Workdir: Pwd: App: Overlay:[]}" timestamp=2019-04-24T15:40:15.229+0200
Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.230+0200 [ERROR] client.driver_mgr.nomad-driver-singularity: Could not get exit code for failed program: : driver=singularity @module=singularity singularity="[-d -v run library://sylabsed/examples/lolcow:latest]" timestamp=2019-04-24T15:40:15.230+0200
Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.234+0200 [ERROR] client.alloc_runner.task_runner: running driver failed: alloc_id=f9bd314e-57d3-56bb-bbb8-8eb9ff6f87e5 task="moo moo" error="failed to start task after driver exited unexpectedly: plugin is shut down"
Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.234+0200 [WARN ] client.driver_mgr: received fingerprint error from driver: driver=singularity error="plugin is shut down"
Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.237+0200 [INFO ] client.alloc_runner.task_runner: not restarting task: alloc_id=f9bd314e-57d3-56bb-bbb8-8eb9ff6f87e5 task="moo moo" reason="Error was unrecoverable"
Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.243+0200 [INFO ] client.gc: marking allocation for GC: alloc_id=f9bd314e-57d3-56bb-bbb8-8eb9ff6f87e5
Apr 24 15:40:23 singularity1 nomad[789]:     2019-04-24T15:40:23.017+0200 [WARN ] client.host_stats: error fetching host disk usage stats: error="no such file or directory" partition=/var/nomad/alloc/9bd114e0-e5e3-d571-1dd9-f075e77d7b7e/moo\040moo/alloc
Apr 24 15:40:24 singularity1 nomad[789]:     2019-04-24T15:40:24.700+0200 [INFO ] client: node registration complete
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.091+0200 [INFO ] client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=9bd114e0-e5e3-d571-1dd9-f075e77d7b7e task="moo moo" @module=logmon path="/var/nomad/alloc/9bd114e0-e5e3-d571-1dd9-f075e77d7b7e/alloc/logs/.moo moo.stdout.fifo" timestamp=2019-04-24T15:40:27.091+0200
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.091+0200 [INFO ] client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=9bd114e0-e5e3-d571-1dd9-f075e77d7b7e task="moo moo" path="/var/nomad/alloc/9bd114e0-e5e3-d571-1dd9-f075e77d7b7e/alloc/logs/.moo moo.stderr.fifo" @module=logmon timestamp=2019-04-24T15:40:27.091+0200
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.112+0200 [INFO ] client.driver_mgr.nomad-driver-singularity: starting singularity task: driver=singularity driver_cfg="{Image:library://sylabsed/examples/lolcow:latest Args:[] Command:run Debug:true Verbose:true Binds:[] Security:[] KeepPrivs:false DropCaps: Contain:false NoHome:false Home: Workdir: Pwd: App: Overlay:[]}" @module=singularity timestamp=2019-04-24T15:40:27.112+0200
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.113+0200 [ERROR] client.driver_mgr.nomad-driver-singularity: Could not get exit code for failed program: : driver=singularity @module=singularity singularity="[-d -v run library://sylabsed/examples/lolcow:latest]" timestamp=2019-04-24T15:40:27.112+0200
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.116+0200 [INFO ] client.alloc_runner.task_runner: failed to start task because plugin shutdown unexpectedly; attempting to recover: alloc_id=9bd114e0-e5e3-d571-1dd9-f075e77d7b7e task="moo moo"
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.117+0200 [WARN ] client.driver_mgr: received fingerprint error from driver: driver=singularity error="plugin is shut down"
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.133+0200 [INFO ] client.driver_mgr.nomad-driver-singularity: starting singularity task: driver=singularity @module=singularity driver_cfg="{Image:library://sylabsed/examples/lolcow:latest Args:[] Command:run Debug:true Verbose:true Binds:[] Security:[] KeepPrivs:false DropCaps: Contain:false NoHome:false Home: Workdir: Pwd: App: Overlay:[]}" timestamp=2019-04-24T15:40:27.133+0200
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.133+0200 [ERROR] client.driver_mgr.nomad-driver-singularity: Could not get exit code for failed program: : driver=singularity @module=singularity singularity="[-d -v run library://sylabsed/examples/lolcow:latest]" timestamp=2019-04-24T15:40:27.133+0200
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.137+0200 [WARN ] client.driver_mgr: received fingerprint error from driver: driver=singularity error="plugin is shut down"
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.137+0200 [ERROR] client.alloc_runner.task_runner: running driver failed: alloc_id=9bd114e0-e5e3-d571-1dd9-f075e77d7b7e task="moo moo" error="failed to start task after driver exited unexpectedly: plugin is shut down"
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.140+0200 [INFO ] client.alloc_runner.task_runner: not restarting task: alloc_id=9bd114e0-e5e3-d571-1dd9-f075e77d7b7e task="moo moo" reason="Error was unrecoverable"
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.146+0200 [INFO ] client.gc: marking allocation for GC: alloc_id=9bd114e0-e5e3-d571-1dd9-f075e77d7b7e
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.150+0200 [ERROR] client.alloc_runner.task_runner.task_hook.logmon.nomad: reading plugin stderr: alloc_id=9bd114e0-e5e3-d571-1dd9-f075e77d7b7e task="moo moo" error="read |0: file already closed"
Apr 24 15:40:35 singularity1 nomad[789]:     2019-04-24T15:40:35.705+0200 [INFO ] client: node registration complete

onlyjob commented 4 years ago

With Nomad 0.9.6, Singularity 3.4.1, and nomad-driver-singularity built from HEAD of master, I'm getting the following:

client.driver_mgr: received fingerprint error from driver: driver=singularity error="plugin is shut down"

How does driver fingerprinting work?

preachermanx commented 4 years ago

I am also having a similar issue.

onlyjob commented 4 years ago

Could be caused by #31.