NVIDIA / cloud-native-stack

Run cloud native workloads on NVIDIA GPUs
Apache License 2.0

sudo systemctl restart docker failed #1

Closed beyondli closed 3 years ago

beyondli commented 3 years ago

Hi. Environment: Xavier, JP4.4

The error is: (image)

When I run sudo systemctl restart docker, it fails if I set "default-runtime": "nvidia" in daemon.json. If I remove "default-runtime": "nvidia" from daemon.json, no error occurs. Thanks for any help.

moconnor725 commented 3 years ago

+container-dev

From: beyondli notifications@github.com
Reply-To: NVIDIA/egx-platform reply@reply.github.com
Date: Thursday, July 30, 2020 at 8:47 AM
To: NVIDIA/egx-platform egx-platform@noreply.github.com
Cc: Subscribed subscribed@noreply.github.com
Subject: [NVIDIA/egx-platform] sudo systemctl restart docker failed (#1)

Hi. Environment: Xavier, JP4.4

When I run sudo systemctl restart docker, it fails if I set "default-runtime": "nvidia" in daemon.json. The error is:

-- The start-up result is RESULT.
Jul 30 22:48:41 xavier systemd[1]: docker.service: Start request repeated too quickly.
Jul 30 22:48:41 xavier systemd[1]: docker.service: Failed with result 'exit-code'.
Jul 30 22:48:41 xavier systemd[1]: Failed to start Docker Application Container Engine.
-- Subject: Unit docker.service has failed
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- Unit docker.service has failed.
-- The result is RESULT.
Jul 30 22:48:41 xavier systemd[1]: docker.socket: Failed with result 'service-start-limit-hit'.
Jul 30 22:48:42 xavier sudo[9832]: xavier : TTY=pts/0 ; PWD=/home/xavier ; USER=root ; COMMAND=/bin/nano /etc/docker/daemon.json
Jul 30 22:48:42 xavier sudo[9832]: pam_unix(sudo:session): session opened for user root by xavier(uid=0)
Jul 30 22:48:45 xavier dhcpd[6767]: DHCPREQUEST for 192.168.55.100 from 8a:60:37:ac:f2:44 (menli-lt) via l4tbr0
Jul 30 22:48:45 xavier dhcpd[6767]: DHCPACK on 192.168.55.100 to 8a:60:37:ac:f2:44 (menli-lt) via l4tbr0
Jul 30 22:48:52 xavier dhcpd[6767]: DHCPREQUEST for 192.168.55.100 from 8a:60:37:ac:f2:44 (menli-lt) via l4tbr0

But if I remove "default-runtime": "nvidia" from daemon.json, no error occurs. Thanks for any help.

beyondli commented 3 years ago

container-dev

Hi moconnor725, thanks for your reply!

I tried sudo apt install container-dev, but the package can't be found. I also searched the web and found nothing that helps. Thanks for any help!

beyondli commented 3 years ago

Hi moconnor725, I found the error: I had missed the ',' before "default-runtime": "nvidia". Now it runs OK, thanks!
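
For anyone who hits the same symptom: a missing comma makes daemon.json invalid JSON, and the systemd messages quoted in this thread do not point at the file. The following is a minimal sketch of how such a file could be checked before restarting Docker. The daemon.json contents are an assumption (the actual file was never posted here); they follow the commonly documented NVIDIA runtime layout. The helper name check_daemon_json is hypothetical.

```python
import json

# Hypothetical daemon.json contents (the real file was not posted in this thread).
# The comma after the "runtimes" object is missing, as described above.
BROKEN = '''{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
    "default-runtime": "nvidia"
}'''

# Same contents with the missing comma restored.
FIXED = BROKEN.replace('    }\n    "default-runtime"', '    },\n    "default-runtime"')


def check_daemon_json(text):
    """Return None if text parses as JSON, else a short description of the first syntax error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


if __name__ == "__main__":
    print("broken:", check_daemon_json(BROKEN))
    print("fixed: ", check_daemon_json(FIXED))
```

To check a real file, the same function can be called on the contents of /etc/docker/daemon.json. A quicker check from the shell is python3 -m json.tool /etc/docker/daemon.json, which reports the position of the first syntax error.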

moconnor725 commented 3 years ago

I just looked at the issue on GitHub, and it appears to have been resolved. Can someone with admin access to the repo please close it?


From: Michael O'Connor (Deep Learning) michaelo@nvidia.com
Sent: Thursday, July 30, 2020 17:51
To: NVIDIA/egx-platform reply@reply.github.com; NVIDIA/egx-platform egx-platform@noreply.github.com; container-dev container-dev@exchange.nvidia.com
Cc: Subscribed subscribed@noreply.github.com
Subject: Re: [NVIDIA/egx-platform] sudo systemctl restart docker failed (#1)

+container-dev

angudadevops commented 3 years ago

@beyondli Thanks for the confirmation. Closing the issue.