bsc-wdc / compss

COMP Superscalar (COMPSs) is a framework which aims to ease the development and execution of applications for distributed infrastructures, such as Clusters, Grids and Clouds.
https://compss.bsc.es
Apache License 2.0

No start_daemon in the Docker cli? #9

Open kinow opened 7 months ago

kinow commented 7 months ago

Level

MINOR

Component

PYTHON BINDING

https://github.com/bsc-wdc/compss/blob/172d245da563a63ee45de4b805100774866d92b0/builders/specs/cli/PyCOMPSsCLIResources/pycompss_cli/core/docker/cmd.py#L230

Environment

$ cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.6 LTS"

$ pip list
Package            Version
------------------ ----------
certifi            2023.11.17
charset-normalizer 3.3.2
docker             7.0.0
idna               3.6
packaging          23.2
pip                23.3.2
pycompss-cli       3.3.2
requests           2.31.0
setuptools         69.0.3
urllib3            2.1.0
wheel              0.42.0

Description

For some strange reason the pycompss init command fails for me. I already have the compss-tutorial 3.3 image downloaded, but when I launch the command it still keeps running for a very long time.

So I decided to try a few things to see if I could get it to work. At one point, I think that made the code reach a function that calls start_daemon for Docker. The call fails because that function does not exist on the DockerCmd object in docker/cmd.py. The parent of DockerCmd is object, and I couldn't find any metaclass or anything else that would define that function.
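One way to double-check that claim is to walk the class's method resolution order and look for the attribute. The sketch below uses a stand-in DockerCmd class (not the real one from pycompss-cli) whose only parent is object; the helper name find_definer is hypothetical.

```python
class DockerCmd(object):
    """Stand-in for the real class; it only mimics the shape described above."""

    def docker_start_monitoring(self):
        # Mirrors the failing call: start_daemon is not defined on this class.
        self.start_daemon()


def find_definer(cls, name):
    """Return the first class in cls's MRO that defines `name`, or None."""
    for klass in cls.__mro__:
        if name in vars(klass):
            return klass
    return None
```

Calling find_definer(DockerCmd, "start_daemon") on the stand-in returns None, and type(DockerCmd) is plain type, so no metaclass is involved. Note that this only inspects class-level attributes; anything assigned on the instance in __init__ would need a hasattr check on an instance instead.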

Minimal example to reproduce

  1. In an environment with pycompss-cli installed
  2. Run this:
$ pycompss init -n test docker -i compss/compss-tutorial:3.3
Environment created ID: test
Starting pycompss-master-test container in dir /home/bdepaula/Development/python/workspace/tutorial_apps
If this is your first time running PyCOMPSs it may take a while because it needs to download the docker image. Please be patient.

For me, this keeps running indefinitely. I had already downloaded compss/compss-tutorial:3.3, so I'm not sure what it is doing (htop and ps didn't help; I might strace it later if I don't find any other solution). It does create the environment, though, as seen from another terminal:

$ pycompss env list
ID      | Type   | Active
======= | ====== | ======
default | local  | *     
test    | docker |    

Then if I change to test,

$ pycompss env change test
Environment `test` is now active

And try to start the monitor:

$ pycompss monitor start
Starting Monitor
Traceback (most recent call last):
  File "/home/bdepaula/mambaforge/envs/pycompss/bin/pycompss", line 8, in <module>
    sys.exit(main())
  File "/home/bdepaula/mambaforge/envs/pycompss/lib/python3.10/site-packages/pycompss_cli/cli/pycompss.py", line 45, in main
    ActionsDispatcher().run_action(arguments)
  File "/home/bdepaula/mambaforge/envs/pycompss/lib/python3.10/site-packages/pycompss_cli/core/actions_dispatcher.py", line 53, in run_action
    action_func()
  File "/home/bdepaula/mambaforge/envs/pycompss/lib/python3.10/site-packages/pycompss_cli/core/docker/actions.py", line 141, in monitor
    self.docker_cmd.docker_start_monitoring()
  File "/home/bdepaula/mambaforge/envs/pycompss/lib/python3.10/site-packages/pycompss_cli/core/docker/cmd.py", line 230, in docker_start_monitoring
    self.start_daemon()
AttributeError: 'DockerCmd' object has no attribute 'start_daemon'

Exception

AttributeError: 'DockerCmd' object has no attribute 'start_daemon'

Expected behaviour

The daemon is launched, I think.

kinow commented 7 months ago

For some strange reason the pycompss init command fails for me. I already have the compss-tutorial 3.3 image downloaded, but when I launch the command it still keeps running for a very long time.

It worked! It just took a really long time to finish (more than 1 hour), even though I had already downloaded the 3.3 image. I saw that when I did not specify the image, it used 3.2 by default, so I wonder whether some step could be trying to use that image instead. Feel free to ignore this part; the missing start_daemon could still be an issue in other situations, I think.

jorgee commented 7 months ago

When did you download the image? Was it today or last week?

kinow commented 7 months ago

First thing in the morning, before joining the Zoom meeting.

kinow commented 7 months ago

Today I am using my personal laptop, which is ~10 years older than BSC's and normally slower. Even so, pycompss init -n test docker -i compss/compss-tutorial:3.3 finished much more quickly.

I can see the container has been up and running for some time already.

(pycompss) kinow@ranma:~$ docker image ls
REPOSITORY               TAG       IMAGE ID       CREATED        SIZE
compss/compss-tutorial   3.3       c7299b31198c   3 days ago     3.1GB
...

kinow@ranma:~$ docker ps -a
CONTAINER ID   IMAGE                        COMMAND               CREATED             STATUS             PORTS                                                                                                           NAMES
8318db7a39ef   compss/compss-tutorial:3.3   "/usr/sbin/sshd -D"   About an hour ago   Up About an hour   0.0.0.0:8080->8080/tcp, :::8080->8080/tcp, 22/tcp, 43000-44000/tcp, 0.0.0.0:8888->8888/tcp, :::8888->8888/tcp   pycompss-master-docker-tutorial

I ran pycompss monitor start and it opened a web page, but the browser failed to connect to http://localhost:8080/compss-monitor.

And I can see that the local ports/sockets were bound too:

kinow@ranma:~$ netstat -tlnp | grep -E "8888|8080"
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 0.0.0.0:8888            0.0.0.0:*               LISTEN      -                   
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN      -                   
tcp6       0      0 :::8888                 :::*                    LISTEN      -                   
tcp6       0      0 :::8080                 :::*                    LISTEN

Still, opening either port 8080 or 8888 in the browser doesn't work.
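A listening socket on the host does not by itself show that a service is answering behind it: with published Docker ports, the host side can be bound even when nothing inside the container is listening. A small probe that sends a real HTTP request can tell the two cases apart. This is only a sketch using the Python standard library; the function name http_status is hypothetical, and it assumes the services speak plain HTTP.

```python
import http.client


def http_status(host, port, path="/", timeout=3.0):
    """Return the HTTP status code from host:port, or None if no valid reply."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        # Covers refused or reset connections, timeouts, and empty or
        # malformed replies.
        return None
    finally:
        conn.close()
```

For the setup in this thread, one would call http_status("localhost", 8080, "/compss-monitor") and http_status("localhost", 8888). A result of None while netstat still shows LISTEN would suggest the port mapping exists but no server is running in the container.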

The container doesn't seem to have written anything to its logs:

kinow@ranma:~$ docker logs 8318db7a39ef
kinow@ranma:~$ 

Opening a terminal in the container created by pycompss-cli, it looks like neither Jupyter nor the monitor is running. There are also some weird files in the root of the container.

kinow@ranma:~$ docker exec -ti 8318db7a39ef /bin/bash
root@8318db7a39ef:/# netstat -tlnp
bash: netstat: command not found
root@8318db7a39ef:/# ps 
    PID TTY          TIME CMD
     65 pts/0    00:00:00 bash
     75 pts/0    00:00:00 ps
root@8318db7a39ef:/# ls
'=0.24.2'  '=1.3.0'   dev    lib32    mnt           resources.xml   srv   var
'=0.8.0'   '=2.2.3'   etc    lib64    opt           root            sys
'=1.0.2'    bin       home   libx32   proc          run             tmp
'=1.1.5'    boot      lib    media    project.xml   sbin            usr
root@8318db7a39ef:/# 

And the Jupyter command fails:

(pycompss) kinow@ranma:~$ pycompss jupyter
Traceback (most recent call last):
  File "/home/kinow/mambaforge/envs/pycompss/bin/pycompss", line 8, in <module>
    sys.exit(main())
  File "/home/kinow/mambaforge/envs/pycompss/lib/python3.10/site-packages/pycompss_cli/cli/pycompss.py", line 45, in main
    ActionsDispatcher().run_action(arguments)
  File "/home/kinow/mambaforge/envs/pycompss/lib/python3.10/site-packages/pycompss_cli/core/actions_dispatcher.py", line 53, in run_action
    action_func()
  File "/home/kinow/mambaforge/envs/pycompss/lib/python3.10/site-packages/pycompss_cli/core/local/actions.py", line 111, in jupyter
    if jupyter_args[0] == 'lab':
IndexError: list index out of range
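The IndexError comes from indexing an empty list: jupyter_args[0] is read when jupyter_args has no elements, presumably because pycompss jupyter was run without extra arguments. The traceback also goes through core/local/actions.py rather than the docker actions. A minimal sketch of a guard follows; pick_jupyter_mode and the "notebook" fallback are hypothetical names chosen for illustration, not taken from pycompss-cli.

```python
def pick_jupyter_mode(jupyter_args, default="notebook"):
    """Return 'lab' if the first argument asks for it, else the default.

    An empty list is handled explicitly instead of raising IndexError.
    """
    if jupyter_args and jupyter_args[0] == "lab":
        return "lab"
    return default
```

The truthiness check on the list runs first, so the index access is only reached when at least one argument is present.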