viam-soleng / viam-docker-manager

Apache License 2.0

Long-running image pulls lead to multiple containers #8

Closed lukeschmitt-tr closed 6 months ago

lukeschmitt-tr commented 7 months ago

Downloading a large image can be slow, sometimes taking longer than a minute. This seems to make the manager time out (the log reports the resource timing out after 1m0s during reconfigure), and the run/compose command is then restarted several times. Once the image has finally been pulled, the retries stop, but each attempt then launches its own container.

I've tried this a few times and it seems reproducible. For reference, the image was about 3 GB.

Log (newest entries first):

3/05/24, 7:27:14.574 PM   info robot_server.process.docker-module_/home/robot/viam-docker-manager.StdOut   pexec/managed_process.go:244   \_ 2024-03-05T19:27:14.574-0600 INFO viam-docker docker_deploy/docker.go:137 Image <IMAGE> does not exist. Pulling...  
3/05/24, 7:27:14.574 PM   info robot_server.process.docker-module_/home/robot/viam-docker-manager.StdOut   pexec/managed_process.go:244   \_ 2024-03-05T19:27:14.574-0600 INFO viam-docker docker_deploy/docker.go:41 Starting Docker Manager Module v0.0.3  
3/05/24, 7:27:14.573 PM   error robot_server.rdk:component:sensor/docker-module   resource/graph_node.go:230   resource build error: rpc error: code = DeadlineExceeded desc = context deadline exceeded   resource rdk:component:sensor/docker-module   model viam-soleng:manage:docker  
3/05/24, 7:27:14.573 PM   warn robot_server   impl/resource_manager.go:659   resource rdk:component:sensor/docker-module timed out after 1m0s during reconfigure  
3/05/24, 7:26:14.573 PM   info robot_server.process.docker-module_/home/robot/viam-docker-manager.StdOut   pexec/managed_process.go:244   \_ 2024-03-05T19:26:14.573-0600 INFO viam-docker docker_deploy/docker.go:137 Image <IMAGE> does not exist. Pulling...  
3/05/24, 7:26:14.573 PM   info robot_server.process.docker-module_/home/robot/viam-docker-manager.StdOut   pexec/managed_process.go:244   \_ 2024-03-05T19:26:14.573-0600 INFO viam-docker docker_deploy/docker.go:41 Starting Docker Manager Module v0.0.3  
3/05/24, 7:26:14.568 PM   error robot_server.rdk:component:sensor/docker-module   resource/graph_node.go:230   resource build error: rpc error: code = DeadlineExceeded desc = context deadline exceeded   resource rdk:component:sensor/docker-module   model viam-soleng:manage:docker  
3/05/24, 7:26:14.568 PM   warn robot_server   impl/resource_manager.go:659   resource rdk:component:sensor/docker-module timed out after 1m0s during reconfigure  
3/05/24, 7:25:14.568 PM   info robot_server.process.docker-module_/home/robot/viam-docker-manager.StdOut   pexec/managed_process.go:244   \_ 2024-03-05T19:25:14.568-0600 INFO viam-docker docker_deploy/docker.go:137 Image <IMAGE> does not exist. Pulling...  
3/05/24, 7:25:14.568 PM   info robot_server.process.docker-module_/home/robot/viam-docker-manager.StdOut   pexec/managed_process.go:244   \_ 2024-03-05T19:25:14.568-0600 INFO viam-docker docker_deploy/docker.go:41 Starting Docker Manager Module v0.0.3  

Created containers:

$ docker ps
CONTAINER ID   IMAGE     COMMAND     CREATED          STATUS          PORTS     NAMES
03523c797ef7   <IMAGE>   <COMMAND>   15 minutes ago   Up 15 minutes             loving_curran
d0acb64136e6   <IMAGE>   <COMMAND>   15 minutes ago   Up 15 minutes             admiring_zhukovsky
d9651f355aed   <IMAGE>   <COMMAND>   15 minutes ago   Up 15 minutes             gifted_rubin
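
One possible way to avoid the duplicate containers is to make the start step idempotent, so that repeated attempts for the same container share a single pull-and-run. The following is only a minimal sketch of that idea, not the module's actual code: `startGuard`, `ensureStarted`, and the `"my-container"` key are hypothetical names, and the closure stands in for the real pull and `docker run`. It also assumes all attempts happen inside the same process; if the module process itself is restarted between attempts, an in-memory guard like this would not be enough on its own.

```go
package main

import (
	"fmt"
	"sync"
)

// entry records the outcome of one start attempt for a given key.
type entry struct {
	once sync.Once
	err  error
}

// startGuard is a hypothetical helper that lets the start function run at
// most once per key, even if several attempts arrive while it is in flight.
type startGuard struct {
	mu      sync.Mutex
	entries map[string]*entry
}

func newStartGuard() *startGuard {
	return &startGuard{entries: make(map[string]*entry)}
}

// ensureStarted runs start for key on the first call only. Calls made while
// that first call is still running block until it finishes and share its
// result. A failed start is forgotten so that a later attempt can retry.
func (g *startGuard) ensureStarted(key string, start func() error) error {
	g.mu.Lock()
	e, ok := g.entries[key]
	if !ok {
		e = &entry{}
		g.entries[key] = e
	}
	g.mu.Unlock()

	e.once.Do(func() { e.err = start() })

	if e.err != nil {
		g.mu.Lock()
		if g.entries[key] == e {
			delete(g.entries, key)
		}
		g.mu.Unlock()
	}
	return e.err
}

func main() {
	g := newStartGuard()
	launched := 0

	// Simulate three reconfigure attempts for the same container.
	for attempt := 1; attempt <= 3; attempt++ {
		_ = g.ensureStarted("my-container", func() error {
			launched++ // stands in for "pull the image, then run the container"
			return nil
		})
	}
	fmt.Println("containers launched:", launched) // prints "containers launched: 1"
}
```

The guard is keyed by container rather than by attempt, so a timed-out caller that comes back later finds the earlier start already recorded instead of launching a second copy.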