aind-containers / aind

AinD: Android in Docker. Ain't an emulator.
Apache License 2.0

Kubernetes manifest should not require hostNetwork #21

Closed: AkihiroSuda closed this issue 4 years ago

AkihiroSuda commented 4 years ago

The Kubernetes manifest added in https://github.com/aind-containers/aind/pull/8 requires hostNetwork: true for an unknown reason. As a result, multiple aind pods cannot be launched on a single node.

Without hostNetwork: true, anbox session-manager crashes with random cryptic errors.

root@aind-6456bcc7cb-f748g:/# unsudo bash
+ car=bash
+ shift
+ cdr=
++ which bash
+ exec machinectl shell user@ /usr/bin/bash
Connected to the local host. Press ^] three times within 1s to exit session.
user@aind-6456bcc7cb-f748g:~$ export DISPLAY=:0
user@aind-6456bcc7cb-f748g:~$ anbox session-manager
[ 2020-04-13 12:53:12] [client.cpp:49@start] Failed to start container: Failed to start container: Failed to start container
[ 2020-04-13 12:53:12] [session_manager.cpp:148@operator()] Lost connection to container manager, terminating.
[ 2020-04-13 12:53:12] [daemon.cpp:61@Run] Container is not running
[ 2020-04-13 12:53:12] [session_manager.cpp:148@operator()] Lost connection to container manager, terminating.
Stack trace (most recent call last) in thread 454:
#8    Object "[0xffffffffffffffff]", at 0xffffffffffffffff, in 
#7    Object "/lib/x86_64-linux-gnu/libc.so.6", at 0x7f577153a152, in clone
#6    Object "/lib/x86_64-linux-gnu/libpthread.so.0", at 0x7f57718aa608, in 
#5    Object "/lib/x86_64-linux-gnu/libstdc++.so.6", at 0x7f57716fbc83, in 
#4    Object "anbox", at 0x5614afcd9e69, in 
#3    Object "anbox", at 0x5614afc8c9ac, in boost::asio::detail::scheduler::run(boost::system::error_code&)
#2    Object "anbox", at 0x5614afd008ad, in boost::asio::detail::reactive_socket_recv_op<boost::asio::mutable_buffers_1, std::function<void (boost::system::error_code const&, unsigned long)>, boost::asio::detail::io_object_executor<boost::asio::executor> >::do_complete(void*, boost::asio::detail::scheduler_operation*, boost::system::error_code const&, unsigned long)
#1    Object "anbox", at 0x5614afc9b40b, in anbox::container::Client::on_read_size(boost::system::error_code const&, unsigned long)
#0    Object "anbox", at 0x5614afc848a6, in 
Segmentation fault (Address not mapped to object [0x18])
Segmentation fault (core dumped)

This issue is specific to Kubernetes. docker run ... does not require --net=host.

AkihiroSuda commented 4 years ago

This is probably because /sys is accidentally mounted read-only (RO) in the pod: https://github.com/containerd/containerd/issues/3221
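
One way to check this hypothesis from inside the pod is to look at the mount options of /sys. The following is a minimal sketch, not something taken from aind or anbox: `mount_mode` is a hypothetical helper name, and it assumes the input follows the /proc/mounts format (device, mount point, filesystem type, comma-separated options).

```shell
#!/bin/sh
# mount_mode is a hypothetical helper (not part of aind or anbox).
# It reads mount data in /proc/mounts format on stdin and prints "ro" or "rw"
# for the mount point given as $1. It exits with status 1 if the path is not
# listed as a mount point.
mount_mode() {
    awk -v mp="$1" '
        $2 == mp {
            # If a path is mounted more than once, the last entry wins.
            mode = "rw"
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) {
                if (opts[i] == "ro") mode = "ro"
            }
        }
        END {
            if (mode == "") exit 1
            print mode
        }
    '
}

# Check /sys in the current mount namespace.
mount_mode /sys < /proc/mounts || echo "/sys is not a mount point"
```

If this prints `ro` in the Kubernetes pod but `rw` under docker run, that would be consistent with the read-only /sys explanation, though it does not prove that this is what makes anbox session-manager crash.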