marvel-nccr / ansible-role-aiida

An ansible role that installs and configures AiiDA on Ubuntu.

🧪 TEST: Start molecule container with systemd as PID1 #62

Closed chrisjsewell closed 3 years ago

chrisjsewell commented 3 years ago

I didn't realise before, but the geerlingguy Docker images are set up to start with systemd by default. This lets the tests run in an environment that more closely represents the Quantum Mobile VM, rather than skipping all the systemd tasks.

root@instance:/# ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 03:27 ?        00:00:01 /lib/systemd/systemd
root        21     1  0 03:27 ?        00:00:00 /lib/systemd/systemd-journald
62583       31     1  0 03:27 ?        00:00:00 /lib/systemd/systemd-timesyncd
systemd+    32     1  0 03:27 ?        00:00:00 /lib/systemd/systemd-resolved
syslog      35     1  0 03:27 ?        00:00:00 /usr/sbin/rsyslogd -n
root        42     1  0 03:27 tty2     00:00:00 /sbin/agetty -o -p -- \u --noclear tty2 linux
root        43     1  0 03:27 tty1     00:00:00 /sbin/agetty -o -p -- \u --noclear tty1 linux
root        44     1  0 03:27 tty3     00:00:00 /sbin/agetty -o -p -- \u --noclear tty3 linux
root        45     1  0 03:27 tty4     00:00:00 /sbin/agetty -o -p -- \u --noclear tty4 linux
root        46     1  0 03:27 tty5     00:00:00 /sbin/agetty -o -p -- \u --noclear tty5 linux
root        47     1  0 03:27 tty6     00:00:00 /sbin/agetty -o -p -- \u --noclear tty6 linux
postgres  4296     1  0 03:29 ?        00:00:00 /usr/lib/postgresql/10/bin/postgres -D /var/lib/postgresql/10/main -c config_file=/e
postgres  4298  4296  0 03:29 ?        00:00:00 postgres: 10/main: checkpointer process   
postgres  4299  4296  0 03:29 ?        00:00:00 postgres: 10/main: writer process   
postgres  4300  4296  0 03:29 ?        00:00:00 postgres: 10/main: wal writer process   
postgres  4301  4296  0 03:29 ?        00:00:00 postgres: 10/main: autovacuum launcher process   
postgres  4302  4296  0 03:29 ?        00:00:00 postgres: 10/main: stats collector process   
postgres  4303  4296  0 03:29 ?        00:00:00 postgres: 10/main: bgworker: logical replication launcher   
rabbitmq  4430     1  0 03:30 ?        00:00:00 /bin/sh /usr/sbin/rabbitmq-server
rabbitmq  4434  4430  1 03:30 ?        00:00:13 /usr/lib/erlang/erts-9.2/bin/beam.smp -W w -A 64 -P 1048576 -t 5000000 -stbt db -zdb
rabbitmq  4510     1  0 03:30 ?        00:00:00 /usr/lib/erlang/erts-9.2/bin/epmd -daemon
rabbitmq  4658  4434  0 03:30 ?        00:00:00 erl_child_setup 65536
rabbitmq  4684  4658  0 03:30 ?        00:00:00 inet_gethost 4
rabbitmq  4685  4684  0 03:30 ?        00:00:00 inet_gethost 4
root      7494     0  0 03:35 ?        00:00:00 /bin/sh
root      7586     0  0 03:35 ?        00:00:00 /root/.vscode-server/bin/d2e414d9e4239a252d1ab117bd7067f125afd80a/node /tmp/vscode-r
root      7709  7586  0 03:35 ?        00:00:00 sh /root/.vscode-server/bin/d2e414d9e4239a252d1ab117bd7067f125afd80a/server.sh --dis
root      7711  7709  1 03:35 ?        00:00:06 /root/.vscode-server/bin/d2e414d9e4239a252d1ab117bd7067f125afd80a/node /root/.vscode
root      7744     0  0 03:35 ?        00:00:02 /root/.vscode-server/bin/d2e414d9e4239a252d1ab117bd7067f125afd80a/node -e  ????const
root      7767     0  0 03:35 ?        00:00:01 /root/.vscode-server/bin/d2e414d9e4239a252d1ab117bd7067f125afd80a/node -e  ????const
root      7783  7711  2 03:35 ?        00:00:08 /root/.vscode-server/bin/d2e414d9e4239a252d1ab117bd7067f125afd80a/node /root/.vscode
root      7802  7711 56 03:35 ?        00:03:58 /root/.vscode-server/bin/d2e414d9e4239a252d1ab117bd7067f125afd80a/node /root/.vscode
root      7857  7783  0 03:35 pts/0    00:00:00 /bin/bash
root     12591     1  0 03:39 ?        00:00:00 /root/.virtualenvs/aiida/bin/python3.7 /root/.virtualenvs/aiida/bin/verdi -p name-wi
root     12608 12591  1 03:39 ?        00:00:02 /root/.virtualenvs/aiida/bin/python3.7 /root/.virtualenvs/aiida/bin/verdi -p name-wi
root     12611 12591  4 03:39 ?        00:00:06 /root/.virtualenvs/aiida/bin/python3.7 -c from circus import stats; stats.main() --e
postgres 12634  4296  0 03:39 ?        00:00:00 postgres: 10/main: aiida aiidadb 127.0.0.1(54878) idle
root     14193     0  0 03:41 ?        00:00:00 /bin/sh -c # Watch installed extensions ???trap "exit 0" 16 ???old=`ls -A --full-tim
root     14201     0  0 03:41 ?        00:00:00 /bin/sh -c # Watch machine settings ???trap "exit 0" 16 ???old=`ls -A --full-time se
root     14586 14201  0 03:42 ?        00:00:00 sleep 1
root     14590 14193  0 03:42 ?        00:00:00 sleep 1
root     14600  7857  0 03:42 pts/0    00:00:00 ps -ef
root@instance:/# systemctl
UNIT                                  LOAD   ACTIVE     SUB       DESCRIPTION                                                      
dev-vda1.device                       loaded activating tentative /dev/vda1                                                        
-.mount                               loaded active     mounted   Root Mount                                                       
dev-hugepages.mount                   loaded active     mounted   Huge Pages File System                                           
dev-mqueue.mount                      loaded active     mounted   POSIX Message Queue File System                                  
etc-hostname.mount                    loaded active     mounted   /etc/hostname                                                    
etc-hosts.mount                       loaded active     mounted   /etc/hosts                                                       
etc-resolv.conf.mount                 loaded active     mounted   /etc/resolv.conf                                                 
sys-fs-fuse-connections.mount         loaded active     mounted   FUSE Control File System                                         
sys-kernel-debug.mount                loaded active     mounted   Kernel Debug File System                                         
tmp.mount                             loaded active     mounted   /tmp                                                             
cron-update.path                      loaded active     waiting   systemd-cron path monitor                                        
systemd-ask-password-console.path     loaded active     waiting   Dispatch Password Requests to Console Directory Watch            
systemd-ask-password-wall.path        loaded active     waiting   Forward Password Requests to Wall Directory Watch                
init.scope                            loaded active     running   System and Service Manager                                       
aiida-daemon@name-with-dashes.service loaded active     running   AiiDA daemon service for profile name-with-dashes                
getty-static.service                  loaded active     exited    getty on tty2-tty6 if dbus and logind are not available          
getty@tty1.service                    loaded active     running   Getty on tty1                                                    
getty@tty2.service                    loaded active     running   Getty on tty2                                                    
getty@tty3.service                    loaded active     running   Getty on tty3                                                    
getty@tty4.service                    loaded active     running   Getty on tty4                                                    
getty@tty5.service                    loaded active     running   Getty on tty5                                                    
getty@tty6.service                    loaded active     running   Getty on tty6                                                    
postgresql.service                    loaded active     exited    PostgreSQL RDBMS                                                 
postgresql@10-main.service            loaded active     running   PostgreSQL Cluster 10-main                                       
rabbitmq-server.service               loaded active     running   RabbitMQ Messaging Server                                        
rsyslog.service                       loaded active     running   System Logging Service                                           
systemd-journal-flush.service         loaded active     exited    Flush Journal to Persistent Storage                              
systemd-journald.service              loaded active     running   Journal Service                                                  
systemd-modules-load.service          loaded active     exited    Load Kernel Modules                                              
systemd-random-seed.service           loaded active     exited    Load/Save Random Seed                                            
systemd-remount-fs.service            loaded active     exited    Remount Root and Kernel File Systems                             
systemd-resolved.service              loaded active     running   Network Name Resolution                                          
systemd-sysctl.service                loaded active     exited    Apply Kernel Variables                                           
systemd-timesyncd.service             loaded active     running   Network Time Synchronization                                     
systemd-tmpfiles-setup-dev.service    loaded active     exited    Create Static Device Nodes in /dev                               
systemd-tmpfiles-setup.service        loaded active     exited    Create Volatile Files and Directories                            
systemd-update-utmp.service           loaded active     exited    Update UTMP about System Boot/Shutdown                           
systemd-user-sessions.service         loaded active     exited    Permit User Sessions                                             
-.slice                               loaded active     active    Root Slice                                                       
system-aiida\x2ddaemon.slice          loaded active     active    system-aiida\x2ddaemon.slice                                     
system-getty.slice                    loaded active     active    system-getty.slice                                               
system-postgresql.slice               loaded active     active    system-postgresql.slice                                          
system.slice                          loaded active     active    System Slice                                                     
syslog.socket                         loaded active     running   Syslog Socket                                                    
systemd-initctl.socket                loaded active     listening /dev/initctl Compatibility Named Pipe                            
systemd-journald-audit.socket         loaded active     running   Journal Audit Socket                                             
systemd-journald-dev-log.socket       loaded active     running   Journal Socket (/dev/log)                                        
systemd-journald.socket               loaded active     running   Journal Socket                                                   
swap.swap                             loaded active     active    /swap                                                            
basic.target                          loaded active     active    Basic System                                                     
cron.target                           loaded active     active    systemd-cron                                                     
cryptsetup.target                     loaded active     active    Local Encrypted Volumes                                          
getty.target                          loaded active     active    Login Prompts                                                    
graphical.target                      loaded active     active    Graphical Interface                                              
local-fs-pre.target                   loaded active     active    Local File Systems (Pre)                                         
local-fs.target                       loaded active     active    Local File Systems                                               
multi-user.target                     loaded active     active    Multi-User System                                                
network.target                        loaded active     active    Network                                                          
nss-lookup.target                     loaded active     active    Host and Network Name Lookups                                    
paths.target                          loaded active     active    Paths                                                            
remote-fs.target                      loaded active     active    Remote File Systems                                              
slices.target                         loaded active     active    Slices                                                           
sockets.target                        loaded active     active    Sockets                                                          
swap.target                           loaded active     active    Swap                                                             
sysinit.target                        loaded active     active    System Initialization                                            
time-sync.target                      loaded active     active    System Time Synchronized                                         
timers.target                         loaded active     active    Timers                                                           
apt-daily-upgrade.timer               loaded active     waiting   Daily apt upgrade and clean activities                           
apt-daily.timer                       loaded active     waiting   Daily apt download activities                                    
cron-daily.timer                      loaded active     waiting   systemd-cron daily timer                                         
cron-hourly.timer                     loaded active     waiting   systemd-cron hourly timer                                        
cron-monthly.timer                    loaded active     waiting   systemd-cron monthly timer                                       
cron-sysstat-root-0.timer             loaded active     waiting   [Timer] "5-55/10 * * * * root command -v debian-sa1 > /dev/null &&
cron-sysstat-root-1.timer             loaded active     waiting   [Timer] "59 23 * * * root command -v debian-sa1 > /dev/null && deb
cron-weekly.timer                     loaded active     waiting   systemd-cron weekly timer                                        
fstrim.timer                          loaded active     waiting   Discard unused blocks once a week                                
motd-news.timer                       loaded active     waiting   Message of the Day                                               
systemd-tmpfiles-clean.timer          loaded active     waiting   Daily Cleanup of Temporary Directories                           

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.

78 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.
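
The `ps -ef` listing above shows `/lib/systemd/systemd` as PID 1. As a minimal sketch (not part of this PR), the same check could be scripted by reading the process name the kernel records for PID 1. The function name `pid1_is_systemd` and its `comm_path` parameter are hypothetical, introduced here only for illustration.

```python
from pathlib import Path


def pid1_is_systemd(comm_path: str = "/proc/1/comm") -> bool:
    """Return True if the process name recorded for PID 1 is 'systemd'.

    `comm_path` is a hypothetical parameter, exposed so the check can be
    exercised against a temporary file instead of the real /proc entry.
    """
    try:
        name = Path(comm_path).read_text().strip()
    except OSError:
        # No /proc (non-Linux host) or unreadable file: report "not systemd".
        return False
    return name == "systemd"


if __name__ == "__main__":
    print("systemd is PID 1" if pid1_is_systemd() else "systemd is not PID 1")
```

Run inside the molecule container, this should report systemd as PID 1 whenever the image was started with its default command; in a container started with a plain shell it would report the opposite.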
chrisjsewell commented 3 years ago

It appears that on Red Hat-based systems the database cluster always needs to be created explicitly before the service can start. On CentOS 8:

[root@instance /]# systemctl status postgresql.service
● postgresql.service - PostgreSQL database server
   Loaded: loaded (/usr/lib/systemd/system/postgresql.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Sat 2020-12-05 02:54:43 UTC; 48s ago
  Process: 1291 ExecStartPre=/usr/libexec/postgresql-check-db-dir postgresql (code=exited, status=1/FAILURE)

Dec 05 02:54:43 instance systemd[1]: Starting PostgreSQL database server...
Dec 05 02:54:43 instance systemd[1]: postgresql.service: Control process exited, code=exited status=1
Dec 05 02:54:43 instance systemd[1]: postgresql.service: Failed with result 'exit-code'.
Dec 05 02:54:43 instance systemd[1]: Failed to start PostgreSQL database server.
[root@instance /]# journalctl -xe
-- Unit session-c14.scope has finished starting up.
-- 
-- The start-up result is RESULT.
Dec 05 02:54:42 instance sudo[1269]: pam_unix(sudo:session): session opened for user root by (uid=0)
Dec 05 02:54:42 instance ansible-ansible.legacy.systemd[1271]: Invoked with name=postgresql state=started enabled=True daem>
Dec 05 02:54:42 instance systemd[1]: Reloading.
Dec 05 02:54:43 instance systemd[1]: Starting PostgreSQL database server...
-- Subject: Unit postgresql.service has begun start-up
-- Defined-By: systemd
-- Support: https://access.redhat.com/support
-- 
-- Unit postgresql.service has begun starting up.
Dec 05 02:54:43 instance postgresql-check-db-dir[1291]: Directory "/var/lib/pgsql/data" is missing or empty.
Dec 05 02:54:43 instance postgresql-check-db-dir[1291]: Use "/usr/bin/postgresql-setup --initdb"
Dec 05 02:54:43 instance postgresql-check-db-dir[1291]: to initialize the database cluster.
Dec 05 02:54:43 instance postgresql-check-db-dir[1291]: See /usr/share/doc/postgresql/README.rpm-dist for more information.
Dec 05 02:54:43 instance systemd[1]: postgresql.service: Control process exited, code=exited status=1
Dec 05 02:54:43 instance systemd[1]: postgresql.service: Failed with result 'exit-code'.
Dec 05 02:54:43 instance systemd[1]: Failed to start PostgreSQL database server.
-- Subject: Unit postgresql.service has failed
-- Defined-By: systemd
-- Support: https://access.redhat.com/support
-- 
-- Unit postgresql.service has failed.
-- 
-- The result is RESULT.
Dec 05 02:54:43 instance sudo[1269]: pam_unix(sudo:session): session closed for user root
Dec 05 02:55:10 instance systemd[1]: systemd-logind.service: Got notification message from PID 59, but reception is disable>
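
The log above suggests the fix is to initialise the cluster once, before starting the service. A minimal sketch of that decision follows, assuming that the presence of a `PG_VERSION` file marks an initialised data directory. The helper names `cluster_initialised` and `initdb_command` are hypothetical; the data directory and the `postgresql-setup --initdb` command are taken from the log messages.

```python
from pathlib import Path
from typing import List, Optional

# Command and default directory as reported by postgresql-check-db-dir in the log.
INITDB_CMD = ["/usr/bin/postgresql-setup", "--initdb"]
DEFAULT_DATA_DIR = "/var/lib/pgsql/data"


def cluster_initialised(data_dir: str = DEFAULT_DATA_DIR) -> bool:
    """Assumption: an initialised cluster contains a PG_VERSION file."""
    return (Path(data_dir) / "PG_VERSION").is_file()


def initdb_command(data_dir: str = DEFAULT_DATA_DIR) -> Optional[List[str]]:
    """Return the command to run, or None if the cluster already exists."""
    if cluster_initialised(data_dir):
        return None
    return list(INITDB_CMD)


if __name__ == "__main__":
    cmd = initdb_command()
    # Only print the command; actually running it requires root on the target.
    print("nothing to do" if cmd is None else " ".join(cmd))
```

Returning `None` when the cluster already exists keeps the step idempotent, which is the property an Ansible task in the role would also need.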
espenfl commented 3 years ago

I didn't realise before, but the geerlingguy Docker images are set up to start with systemd by default. This lets the tests run in an environment that more closely represents the Quantum Mobile VM, rather than skipping all the systemd tasks.

Yes. This is a bit outside the PR, but I will ask anyway: do we have an official AiiDA stance on systemd or not? I am working on the scheduler containers and made an effort to stay away from systemd in the AiiDA context. Obviously that is not always straightforward, but it keeps things completely isolated.

chrisjsewell commented 3 years ago

Do we have an official AiiDA stance on systemd or not?

Well, in terms of Quantum Mobile, my stance is that if it's good enough for geerlingguy (the godfather of Ansible lol) then it's good enough for me lol. Plus, to my knowledge, it's the most widely used service manager on Linux. Are there any drawbacks of systemd that you know of?

I would tangentially note though that, in the context of (Docker) containers, really the idea is to only have one service per container, so then you should not need a service manager because essentially the docker daemon itself is the service manager, i.e. managing the container lifespan. (In this PR the containers are principally being used as a representation of a VM for testing, rather than for actual production use.)

On a semi-related note, Podman is actually a lot better at integrating with systemd, both inside the container (https://developers.redhat.com/blog/2019/04/24/how-to-run-systemd-in-a-container/) and for systemd managing the actual containers as services (https://www.redhat.com/sysadmin/podman-shareable-systemd-services).

I am working on the scheduler containers

I'm not sure that I've heard of this work. Out of interest, what is the goal of these containers, and are they located in a repository somewhere?

espenfl commented 3 years ago

Well, in terms of Quantum Mobile, my stance is that if it's good enough for geerlingguy (the godfather of Ansible lol) then it's good enough for me lol. Plus, to my knowledge, it's the most widely used service manager on Linux. Are there any drawbacks of systemd that you know of?

There are pros and cons with everything, but I was referring mainly to the idea behind containers and the way they are done in Docker, which you also refer to. One can discuss forever whether this is a good principle to pin, but I think we should be pragmatic about it, hence my question. I understand from you that we do not have a particular opinion for or against systemd in our service containers and that we indeed want to follow a pragmatic approach. For most service containers, allowing systemd surely simplifies implementation.

Let us take the scheduler discussions on Slack.