galaxyproject / galaxy

Data intensive science for everyone.
https://galaxyproject.org

On almalinux 9, running dev - gunicorn.sock is not created #16540

Closed by maikenp 1 year ago

maikenp commented 1 year ago

Describe the bug

When I use the ansible-galaxy playbook and follow https://training.galaxyproject.org/training-material/topics/admin/tutorials/ansible-galaxy/tutorial.html but set

galaxy_commit_id: dev

instead of release_23, gunicorn.sock is not created.

Aug 10 23:28:54 galaxy-arc-test.itf.uiocloud.no nginx[1010]: galaxy-arc-test.itf.uiocloud.no nginx: 2023/08/10 23:28:54 [crit] 1010#1010: *386 connect() to unix:/storage/srv/galaxy/var/config/gunicorn.sock failed (2: No such file or directory) while connecting to upstream, client: 86.62.155.78, server: galaxy-arc-test.itf.uiocloud.no, request: "GET /api/entry_points?running=true HTTP/1.1", upstream: "http://unix:/storage/srv/galaxy/var/config/gunicorn.sock:/api/entry_points?running=true", host: "galaxy-arc-test.itf.uiocloud.no", referrer: "https://galaxy-arc-test.itf.uiocloud.no/admin"

If I install release_23 instead, everything works as expected: gunicorn.sock is created, and I can use the Galaxy server and submit jobs.
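
The nginx error above reports errno 2 (No such file or directory) for the socket path, so a first check is whether that path exists and is a Unix domain socket. A minimal sketch, assuming it runs on the Galaxy host as a user allowed to stat the path; the helper name is_unix_socket is made up for illustration:

```python
import os
import stat


def is_unix_socket(path: str) -> bool:
    """Return True if `path` exists and is a Unix domain socket."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path (errno 2, as in the nginx log) or no permission to stat it.
        return False
    return stat.S_ISSOCK(mode)


if __name__ == "__main__":
    # Path taken from the nginx error above; adjust for your own deployment.
    sock = "/storage/srv/galaxy/var/config/gunicorn.sock"
    state = "socket present" if is_unix_socket(sock) else "no socket"
    print(f"{sock}: {state}")
```

If this reports no socket while the galaxy-gunicorn service shows as running, gunicorn has most likely not reached the point where it binds the socket.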

Galaxy Version and/or server at which you observed the bug

dev branch of https://github.com/galaxyproject/galaxy

Installed on a test Galaxy server running AlmaLinux 9. SELinux is turned off for testing purposes.

To Reproduce

Steps to reproduce the behavior:

  1. Follow the tutorial to set up a galaxy server: https://training.galaxyproject.org/training-material/topics/admin/tutorials/ansible-galaxy/tutorial.html
  2. In group_vars/galaxyservers.yml, set galaxy_commit_id: dev
  3. Run the playbook
  4. Check the syslog messages to see the error
  5. The Galaxy server does not start properly, and the website will not load

Expected behavior

Galaxy server should start as normal.

mvdbeek commented 1 year ago

Can you look at the logs? I don't know what the tutorial uses as the service name, but systemctl should list the service. Also, for development this is way too complicated; ./run.sh will work fine and be much easier to manage.

maikenp commented 1 year ago

Thanks, so the problem shows up in nginx, while galaxy-gunicorn seems to be ok:

[root@galaxy-arc-test ~]# systemctl status galaxy-gunicorn
● galaxy-gunicorn.service - Galaxy gunicorn
     Loaded: loaded (/etc/systemd/system/galaxy-gunicorn.service; disabled; preset: disabled)
     Active: active (running) since Fri 2023-08-11 09:56:35 CEST; 25s ago
   Main PID: 26656 (gunicorn)
      Tasks: 1 (limit: 50188)
     Memory: 216.4M
        CPU: 4.780s
     CGroup: /system.slice/galaxy-gunicorn.service
             └─26656 /storage/srv/galaxy/venv/bin/python3 /storage/srv/galaxy/venv/bin/gunicorn "galaxy.webapps.galaxy.fast_factory:factory()" --timeout 300 --pythonpath lib -k galaxy.webapps.galaxy.workers.Worke>

Aug 11 09:56:41 galaxy-arc-test.itf.uiocloud.no galaxyctl[26656]: galaxy.tool_util.toolbox.base DEBUG 2023-08-11 09:56:41,277 [pN:main,p:26656,tN:MainThread] Loaded tool id: hello_arc, version: 1.0.0 into tool pa>
Aug 11 09:56:41 galaxy-arc-test.itf.uiocloud.no galaxyctl[26656]: galaxy.tool_util.toolbox.base DEBUG 2023-08-11 09:56:41,277 [pN:main,p:26656,tN:MainThread] Loading tool panel finished (0.206 ms)
Aug 11 09:56:41 galaxy-arc-test.itf.uiocloud.no galaxyctl[26656]: galaxy.tool_util.toolbox.base DEBUG 2023-08-11 09:56:41,279 [pN:main,p:26656,tN:MainThread] Loaded tool id: upload1, version: 1.1.7 into tool pane>
Aug 11 09:56:41 galaxy-arc-test.itf.uiocloud.no galaxyctl[26656]: galaxy.tool_util.toolbox.base DEBUG 2023-08-11 09:56:41,279 [pN:main,p:26656,tN:MainThread] Loaded tool id: hello_arc, version: 1.0.0 into tool pa>
Aug 11 09:56:41 galaxy-arc-test.itf.uiocloud.no galaxyctl[26656]: galaxy.tool_util.toolbox.views.edam DEBUG 2023-08-11 09:56:41,279 [pN:main,p:26656,tN:MainThread] Loading EDAM tool panel finished (1.032 ms)
Aug 11 09:56:41 galaxy-arc-test.itf.uiocloud.no galaxyctl[26656]: galaxy.tool_util.toolbox.base DEBUG 2023-08-11 09:56:41,279 [pN:main,p:26656,tN:MainThread] Loaded tool id: upload1, version: 1.1.7 into tool pane>
Aug 11 09:56:41 galaxy-arc-test.itf.uiocloud.no galaxyctl[26656]: galaxy.tool_util.toolbox.base DEBUG 2023-08-11 09:56:41,279 [pN:main,p:26656,tN:MainThread] Loaded tool id: hello_arc, version: 1.0.0 into tool pa>
Aug 11 09:56:41 galaxy-arc-test.itf.uiocloud.no galaxyctl[26656]: galaxy.tool_util.toolbox.views.edam DEBUG 2023-08-11 09:56:41,279 [pN:main,p:26656,tN:MainThread] Loading EDAM tool panel finished (0.836 ms)
Aug 11 09:56:41 galaxy-arc-test.itf.uiocloud.no galaxyctl[26656]: galaxy.tool_util.toolbox.integrated_panel DEBUG 2023-08-11 09:56:41,280 [pN:main,p:26656,tN:MainThread] Writing integrated tool panel config file >
Aug 11 09:56:41 galaxy-arc-test.itf.uiocloud.no galaxyctl[26656]: galaxy.tool_util.deps DEBUG 2023-08-11 09:56:41,282 [pN:main,p:26656,tN:MainThread] Unable to find config file '/storage/srv/galaxy/config/depende>
[root@galaxy-arc-test ~]# systemctl status nginx
● nginx.service - The nginx HTTP and reverse proxy server
     Loaded: loaded (/usr/lib/systemd/system/nginx.service; enabled; preset: disabled)
     Active: active (running) since Thu 2023-08-10 22:36:23 CEST; 11h ago
   Main PID: 1008 (nginx)
      Tasks: 3 (limit: 50188)
     Memory: 6.8M
        CPU: 638ms
     CGroup: /system.slice/nginx.service
             ├─1008 "nginx: master process /usr/sbin/nginx"
             ├─1009 "nginx: worker process"
             └─1010 "nginx: worker process"

Aug 11 09:56:49 galaxy-arc-test.itf.uiocloud.no nginx[1010]: galaxy-arc-test.itf.uiocloud.no nginx: 2023/08/11 09:56:49 [crit] 1010#1010: *3166 connect() to unix:/storage/srv/galaxy/var/config/gunicorn.sock faile>
Aug 11 09:56:49 galaxy-arc-test.itf.uiocloud.no nginx[1010]: galaxy-arc-test.itf.uiocloud.no nginx: 193.157.183.31 - - [11/Aug/2023:09:56:49 +0200] "GET /api/entry_points?running=true HTTP/1.1" 502 157 "https://g>
Aug 11 09:56:57 galaxy-arc-test.itf.uiocloud.no nginx[1010]: galaxy-arc-test.itf.uiocloud.no nginx: 2023/08/11 09:56:57 [crit] 1010#1010: *3166 connect() to unix:/storage/srv/galaxy/var/config/gunicorn.sock faile>
Aug 11 09:56:57 galaxy-arc-test.itf.uiocloud.no nginx[1010]: galaxy-arc-test.itf.uiocloud.no nginx: 193.157.183.31 - - [11/Aug/2023:09:56:57 +0200] "GET /history/current_history_json?since=2023-08-10T21:01:30.079>
Aug 11 09:56:58 galaxy-arc-test.itf.uiocloud.no nginx[1010]: galaxy-arc-test.itf.uiocloud.no nginx: 2023/08/11 09:56:58 [crit] 1010#1010: *3166 connect() to unix:/storage/srv/galaxy/var/config/gunicorn.sock faile>
Aug 11 09:56:58 galaxy-arc-test.itf.uiocloud.no nginx[1010]: galaxy-arc-test.itf.uiocloud.no nginx: 193.157.183.31 - - [11/Aug/2023:09:56:58 +0200] "GET /api/entry_points?running=true HTTP/1.1" 502 157 "https://g>
Aug 11 09:56:59 galaxy-arc-test.itf.uiocloud.no nginx[1010]: galaxy-arc-test.itf.uiocloud.no nginx: 2023/08/11 09:56:59 [crit] 1010#1010: *3166 connect() to unix:/storage/srv/galaxy/var/config/gunicorn.sock faile>
Aug 11 09:56:59 galaxy-arc-test.itf.uiocloud.no nginx[1010]: galaxy-arc-test.itf.uiocloud.no nginx: 193.157.183.31 - - [11/Aug/2023:09:56:59 +0200] "GET /api/entry_points?running=true HTTP/1.1" 502 157 "https://g>
Aug 11 09:56:59 galaxy-arc-test.itf.uiocloud.no nginx[1010]: galaxy-arc-test.itf.uiocloud.no nginx: 2023/08/11 09:56:59 [crit] 1010#1010: *3166 connect() to unix:/storage/srv/galaxy/var/config/gunicorn.sock faile>
Aug 11 09:56:59 galaxy-arc-test.itf.uiocloud.no nginx[1010]: galaxy-arc-test.itf.uiocloud.no nginx: 193.157.183.31 - - [11/Aug/2023:09:56:59 +0200] "GET /api/entry_points?running=true HTTP/1.1" 502 157 "https://g>
[root@galaxy-arc-test ~]# /usr/local/bin/galaxyctl status
Dynamic handlers are configured in Gravity but Galaxy is not configured to assign jobs to handlers dynamically, so these handlers will not handle jobs. Set the job handler assignment method in the Galaxy job configuration to `db-skip-locked` or `db-transaction-isolation` to fix this.
  UNIT                       LOAD   ACTIVE SUB     DESCRIPTION               
  galaxy-celery-beat.service loaded active running Galaxy celery-beat
  galaxy-celery.service      loaded active running Galaxy celery
  galaxy-gunicorn.service    loaded active running Galaxy gunicorn
  galaxy-handler@0.service   loaded active running Galaxy handler (process 0)
  galaxy-handler@1.service   loaded active running Galaxy handler (process 1)
  galaxy.target              loaded active active  Galaxy

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.
6 loaded units listed.
To show all installed unit files use 'systemctl list-unit-files'.

The list of all services running:

[root@galaxy-arc-test ~]# systemctl list-units --type=service
  UNIT                                                  LOAD   ACTIVE SUB     DESCRIPTION                                            
  auditd.service                                        loaded active running Security Auditing Service
  chronyd.service                                       loaded active running NTP client/server
  cloud-config.service                                  loaded active exited  Apply the settings specified in cloud-config
  cloud-final.service                                   loaded active exited  Execute cloud user/final scripts
  cloud-init-local.service                              loaded active exited  Initial cloud-init job (pre-networking)
  cloud-init.service                                    loaded active exited  Initial cloud-init job (metadata service crawler)
  crond.service                                         loaded active running Command Scheduler
  dbus-broker.service                                   loaded active running D-Bus System Message Bus
  dracut-shutdown.service                               loaded active exited  Restore /run/initramfs on shutdown
  galaxy-celery-beat.service                            loaded active running Galaxy celery-beat
  galaxy-celery.service                                 loaded active running Galaxy celery
  galaxy-gunicorn.service                               loaded active running Galaxy gunicorn
  galaxy-handler@0.service                              loaded active running Galaxy handler (process 0)
  galaxy-handler@1.service                              loaded active running Galaxy handler (process 1)
  getty@tty1.service                                    loaded active running Getty on tty1
  gssproxy.service                                      loaded active running GSSAPI Proxy Daemon
  irqbalance.service                                    loaded active running irqbalance daemon
  kmod-static-nodes.service                             loaded active exited  Create List of Static Device Nodes
  mariadb.service                                       loaded active running MariaDB 10.5 database server
  munge.service                                         loaded active running MUNGE authentication service
  NetworkManager-wait-online.service                    loaded active exited  Network Manager Wait Online
  NetworkManager.service                                loaded active running Network Manager
  nginx.service                                         loaded active running The nginx HTTP and reverse proxy server
  nis-domainname.service                                loaded active exited  Read and set NIS domainname from /etc/sysconfig/network
  polkit.service                                        loaded active running Authorization Manager
  postgresql-15.service                                 loaded active running PostgreSQL 15 database server
  qemu-guest-agent.service                              loaded active running QEMU Guest Agent
  rpc-statd-notify.service                              loaded active exited  Notify NFS peers of a restart
  rpcbind.service                                       loaded active running RPC Bind
  rsyslog.service                                       loaded active running System Logging Service
  serial-getty@ttyS0.service                            loaded active running Serial Getty on ttyS0
  slurmctld.service                                     loaded active running Slurm controller daemon
  slurmdbd.service                                      loaded active running Slurm DBD accounting daemon
  sshd.service                                          loaded active running OpenSSH server daemon
  systemd-boot-update.service                           loaded active exited  Automatic Boot Loader Update
  systemd-fsck@dev-disk-by\x2duuid-D33D\x2dDD3C.service loaded active exited  File System Check on /dev/disk/by-uuid/D33D-DD3C
  systemd-journal-flush.service                         loaded active exited  Flush Journal to Persistent Storage
  systemd-journald.service                              loaded active running Journal Service
  systemd-logind.service                                loaded active running User Login Management
  systemd-modules-load.service                          loaded active exited  Load Kernel Modules
  systemd-network-generator.service                     loaded active exited  Generate network units from Kernel command line
  systemd-random-seed.service                           loaded active exited  Load/Save Random Seed
  systemd-remount-fs.service                            loaded active exited  Remount Root and Kernel File Systems
  systemd-sysctl.service                                loaded active exited  Apply Kernel Variables
  systemd-tmpfiles-setup-dev.service                    loaded active exited  Create Static Device Nodes in /dev
  systemd-tmpfiles-setup.service                        loaded active exited  Create Volatile Files and Directories
  systemd-udev-trigger.service                          loaded active exited  Coldplug All udev Devices
  systemd-udevd.service                                 loaded active running Rule-based Manager for Device Events and Files
  systemd-update-utmp.service                           loaded active exited  Record System Boot/Shutdown in UTMP
  systemd-user-sessions.service                         loaded active exited  Permit User Sessions
  tuned.service                                         loaded active running Dynamic System Tuning Daemon
  user-runtime-dir@1000.service                         loaded active exited  User Runtime Directory /run/user/1000
  user@1000.service                                     loaded active running User Manager for UID 1000

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.
53 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.

About ./run.sh: I actually used that initially, but I was not sure whether it was the required way of doing things, so I turned to the Ansible playbook instead. I also had some issues with run.sh on this system (I forget what they were right now).

mvdbeek commented 1 year ago

Can you include the whole startup sequence for Galaxy?

mvdbeek commented 1 year ago

(The fact that the log ends in the middle of the startup sequence probably means Galaxy wasn't able to start up fully and is being restarted.)

maikenp commented 1 year ago

The log output from the startup?

mvdbeek commented 1 year ago

The Galaxy logs are truncated: they don't include the start, and the messages themselves are also cut off. There must be a journalctl option to get the full logs, but I can never remember it and I don't use systemd.

mvdbeek commented 1 year ago

Alternatively, you can use journalctl -fu galaxy-gunicorn.service and wait until it restarts; just before that, a reason will be given for why Galaxy can't start.
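
On the truncation: lines ending in ">" are typically cut by the pager rather than by the journal itself, and journalctl's --no-pager flag prints them in full, while --since limits the output to a time window. A small sketch that only assembles such a command line; the helper name journalctl_args is made up for illustration, and the unit name is the one from this thread:

```python
import shlex
from typing import List, Optional


def journalctl_args(unit: str, since: Optional[str] = None, follow: bool = False) -> List[str]:
    """Build a journalctl argument list that prints full, unpaged lines for one unit."""
    args = ["journalctl", "--no-pager", "-u", unit]
    if since:
        # Any timestamp format journalctl accepts, e.g. "2023-08-11 09:56:00" or "today".
        args += ["--since", since]
    if follow:
        # Keep streaming new entries, as with `journalctl -fu <unit>`.
        args.append("-f")
    return args


if __name__ == "__main__":
    cmd = journalctl_args("galaxy-gunicorn.service", since="2023-08-11 09:56:00")
    # Print a copy-pasteable command; pass `cmd` to subprocess.run() to execute it instead.
    print(shlex.join(cmd))
```

Redirecting the output of the printed command to a file gives a complete startup log that can be attached to the issue.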

maikenp commented 1 year ago

galaxy_startup_dev.txt

mvdbeek commented 1 year ago

It's not done booting at this point. Is the process still alive, and if so, what is it doing? (You can check with py-spy, https://pypi.org/project/py-spy/: py-spy dump --pid <pid>)

maikenp commented 1 year ago

I stopped Galaxy with galaxyctl stop and checked that all Galaxy-related processes were gone:

[root@galaxy-arc-test ~]# ps aux | grep galaxy
root       27342  0.0  0.0   3876  2040 pts/1    S+   10:31   0:00 grep --color=auto galaxy

Then I restarted with "restart", and this is what I have:

[root@galaxy-arc-test ~]# ps aux | grep galaxy
root       27347  0.2  0.5 232696 47348 pts/0    S+   10:32   0:00 journalctl -fu galaxy-*
galaxy     27351 19.5  2.5 429508 207080 ?       Ss   10:32   0:02 /storage/srv/galaxy/venv/bin/python3 /storage/srv/galaxy/venv/bin/celery --app galaxy.celery beat --loglevel DEBUG --schedule /storage/srv/galaxy/var/gravity/celery-beat-schedule
galaxy     27352 20.1  2.6 455404 212436 ?       Ss   10:32   0:02 /storage/srv/galaxy/venv/bin/python3 /storage/srv/galaxy/venv/bin/celery --app galaxy.celery worker --concurrency 2 --loglevel DEBUG --pool threads --queues celery,galaxy.internal,galaxy.external
galaxy     27353 39.0  3.2 510904 267148 ?       Ss   10:32   0:05 /storage/srv/galaxy/venv/bin/python3 /storage/srv/galaxy/venv/bin/gunicorn galaxy.webapps.galaxy.fast_factory:factory() --timeout 300 --pythonpath lib -k galaxy.webapps.galaxy.workers.Worker -b unix:/storage/srv/galaxy/var/config/gunicorn.sock --workers=2 --config python:galaxy.web_stack.gunicorn_config --preload --forwarded-allow-ips=*
galaxy     27354 37.2  3.2 504808 262076 ?       Ss   10:32   0:04 /storage/srv/galaxy/venv/bin/python ./lib/galaxy/main.py -c /storage/srv/galaxy/config/galaxy.yml --server-name=handler_0 --attach-to-pool=job-handlers --attach-to-pool=workflow-schedulers
galaxy     27355 37.0  3.2 504808 262124 ?       Ss   10:32   0:04 /storage/srv/galaxy/venv/bin/python ./lib/galaxy/main.py -c /storage/srv/galaxy/config/galaxy.yml --server-name=handler_1 --attach-to-pool=job-handlers --attach-to-pool=workflow-schedulers
galaxy     27374  0.2  0.1  12900 10956 ?        S    10:32   0:00 /storage/srv/galaxy/venv/bin/python3 -c from multiprocessing.resource_tracker import main;main(7)
postgres   27376  0.1  0.2 211900 18900 ?        Ss   10:32   0:00 postgres: galaxy galaxy [local] idle
postgres   27378 13.6  0.2 213848 23684 ?        Ss   10:32   0:00 postgres: galaxy galaxy [local] idle
postgres   27380 13.8  0.2 213848 23688 ?        Ss   10:32   0:00 postgres: galaxy galaxy [local] idle
postgres   27382 13.6  0.2 213848 23684 ?        Ss   10:32   0:00 postgres: galaxy galaxy [local] idle
root       27387  0.0  0.0   3876  2024 pts/1    S+   10:32   0:00 grep --color=auto galaxy
[root@galaxy-arc-test ~]# systemctl status galaxy-gunicorn^C
[root@galaxy-arc-test ~]# /usr/local/bin/py-spy dump --pid 27351
Process 27351: /storage/srv/galaxy/venv/bin/python3 /storage/srv/galaxy/venv/bin/celery --app galaxy.celery beat --loglevel DEBUG --schedule /storage/srv/galaxy/var/gravity/celery-beat-schedule
Python v3.9.16 (/usr/bin/python3.9)

Thread 27351 (idle): "MainThread"
    start (celery/beat.py:649)
    start_scheduler (celery/apps/beat.py:105)
    run (celery/apps/beat.py:77)
    beat (celery/bin/beat.py:72)
    caller (celery/bin/base.py:134)
    new_func (click/decorators.py:33)
    invoke (click/core.py:783)
    invoke (click/core.py:1434)
    invoke (click/core.py:1688)
    main (click/core.py:1078)
    __call__ (click/core.py:1157)
    main (celery/bin/celery.py:217)
    main (celery/__main__.py:15)
    <module> (celery:8)
[root@galaxy-arc-test ~]# /usr/local/bin/py-spy dump --pid 27352
Process 27352: /storage/srv/galaxy/venv/bin/python3 /storage/srv/galaxy/venv/bin/celery --app galaxy.celery worker --concurrency 2 --loglevel DEBUG --pool threads --queues celery,galaxy.internal,galaxy.external
Python v3.9.16 (/usr/bin/python3.9)

Thread 27352 (idle): "MainThread"
    drain_events (kombu/transport/virtual/base.py:976)
    drain_events (kombu/connection.py:316)
    synloop (celery/worker/loops.py:130)
    start (celery/worker/consumer/consumer.py:628)
    start (celery/bootsteps.py:116)
    start (celery/worker/consumer/consumer.py:332)
    start (celery/bootsteps.py:365)
    start (celery/bootsteps.py:116)
    start (celery/worker/worker.py:203)
    worker (celery/bin/worker.py:351)
    caller (celery/bin/base.py:134)
    new_func (click/decorators.py:33)
    invoke (click/core.py:783)
    invoke (click/core.py:1434)
    invoke (click/core.py:1688)
    main (click/core.py:1078)
    __call__ (click/core.py:1157)
    main (celery/bin/celery.py:217)
    main (celery/__main__.py:15)
    <module> (celery:8)
[root@galaxy-arc-test ~]# /usr/local/bin/py-spy dump --pid 27353
Process 27353: /storage/srv/galaxy/venv/bin/python3 /storage/srv/galaxy/venv/bin/gunicorn galaxy.webapps.galaxy.fast_factory:factory() --timeout 300 --pythonpath lib -k galaxy.webapps.galaxy.workers.Worker -b unix:/storage/srv/galaxy/var/config/gunicorn.sock --workers=2 --config python:galaxy.web_stack.gunicorn_config --preload --forwarded-allow-ips=*
Python v3.9.16 (/usr/bin/python3.9)

Thread 27353 (idle): "MainThread"
    acquire (galaxy/util/filelock.py:50)
    __enter__ (galaxy/util/filelock.py:68)
    ensure_installed (galaxy/tool_util/deps/installable.py:67)
    __init__ (galaxy/tool_util/deps/resolvers/conda.py:152)
    __load_plugins_from_dicts (galaxy/util/plugin_config.py:129)
    load_plugins (galaxy/util/plugin_config.py:59)
    __parse_resolver_conf_plugins (galaxy/tool_util/deps/__init__.py:362)
    __init__ (galaxy/tool_util/deps/__init__.py:146)
    build_dependency_manager (galaxy/tool_util/deps/__init__.py:105)
    _init_dependency_manager (galaxy/tools/__init__.py:609)
    __init__ (galaxy/tools/__init__.py:418)
    _configure_toolbox (galaxy/app.py:307)
    __init__ (galaxy/app.py:683)
    app_pair (galaxy/webapps/galaxy/buildapp.py:59)
    factory (galaxy/webapps/galaxy/fast_factory.py:63)
    import_app (gunicorn/util.py:424)
    load_wsgiapp (gunicorn/app/wsgiapp.py:48)
    load (gunicorn/app/wsgiapp.py:58)
    wsgi (gunicorn/app/base.py:67)
    setup (gunicorn/arbiter.py:118)
    __init__ (gunicorn/arbiter.py:58)
    run (gunicorn/app/base.py:72)
    run (gunicorn/app/base.py:236)
    run (gunicorn/app/wsgiapp.py:67)
    <module> (gunicorn:8)
[root@galaxy-arc-test ~]# /usr/local/bin/py-spy dump --pid 27354
Process 27354: /storage/srv/galaxy/venv/bin/python ./lib/galaxy/main.py -c /storage/srv/galaxy/config/galaxy.yml --server-name=handler_0 --attach-to-pool=job-handlers --attach-to-pool=workflow-schedulers
Python v3.9.16 (/usr/bin/python3.9)

Thread 27354 (idle): "MainThread"
    acquire (galaxy/util/filelock.py:50)
    __enter__ (galaxy/util/filelock.py:68)
    ensure_installed (galaxy/tool_util/deps/installable.py:67)
    __init__ (galaxy/tool_util/deps/resolvers/conda.py:152)
    __load_plugins_from_dicts (galaxy/util/plugin_config.py:129)
    load_plugins (galaxy/util/plugin_config.py:59)
    __parse_resolver_conf_plugins (galaxy/tool_util/deps/__init__.py:362)
    __init__ (galaxy/tool_util/deps/__init__.py:146)
    build_dependency_manager (galaxy/tool_util/deps/__init__.py:105)
    _init_dependency_manager (galaxy/tools/__init__.py:609)
    __init__ (galaxy/tools/__init__.py:418)
    _configure_toolbox (galaxy/app.py:307)
    __init__ (galaxy/app.py:683)
    load_galaxy_app (galaxy/main.py:91)
    app_loop (galaxy/main.py:112)
    main (galaxy/main.py:255)
    <module> (galaxy/main.py:259)
[root@galaxy-arc-test ~]# /usr/local/bin/py-spy dump --pid 27355
Process 27355: /storage/srv/galaxy/venv/bin/python ./lib/galaxy/main.py -c /storage/srv/galaxy/config/galaxy.yml --server-name=handler_1 --attach-to-pool=job-handlers --attach-to-pool=workflow-schedulers
Python v3.9.16 (/usr/bin/python3.9)

Thread 27355 (idle): "MainThread"
    acquire (galaxy/util/filelock.py:50)
    __enter__ (galaxy/util/filelock.py:68)
    ensure_installed (galaxy/tool_util/deps/installable.py:67)
    __init__ (galaxy/tool_util/deps/resolvers/conda.py:152)
    __load_plugins_from_dicts (galaxy/util/plugin_config.py:129)
    load_plugins (galaxy/util/plugin_config.py:59)
    __parse_resolver_conf_plugins (galaxy/tool_util/deps/__init__.py:362)
    __init__ (galaxy/tool_util/deps/__init__.py:146)
    build_dependency_manager (galaxy/tool_util/deps/__init__.py:105)
    _init_dependency_manager (galaxy/tools/__init__.py:609)
    __init__ (galaxy/tools/__init__.py:418)
    _configure_toolbox (galaxy/app.py:307)
    __init__ (galaxy/app.py:683)
    load_galaxy_app (galaxy/main.py:91)
    app_loop (galaxy/main.py:112)
    main (galaxy/main.py:255)
    <module> (galaxy/main.py:259)
[root@galaxy-arc-test ~]# /usr/local/bin/py-spy dump --pid 27374
Process 27374: /storage/srv/galaxy/venv/bin/python3 -c from multiprocessing.resource_tracker import main;main(7)
Python v3.9.16 (/usr/bin/python3.9)

Thread 27374 (idle): "MainThread"
    main (multiprocessing/resource_tracker.py:189)
    <module> (<string>:1)

mvdbeek commented 1 year ago

It's waiting for a filelock to perform the conda setup. I think the training playbook has a separate task to set up conda. You can sidestep this and install conda manually at the right location, or set conda_auto_init: false in the galaxy config if you don't need conda right away.
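
To illustrate why a process can sit in acquire indefinitely, here is a generic lock-file pattern. This is not Galaxy's galaxy/util/filelock.py, only a sketch of the general technique, and the class and exception names are invented for this example: a process creates the lock file atomically, and anyone else polls until the file disappears or a timeout expires.

```python
import os
import time


class LockTimeout(Exception):
    """Raised when the lock file could not be created before the deadline."""


class SimpleFileLock:
    def __init__(self, path, timeout=10.0, delay=0.05):
        self.path = path
        self.timeout = timeout
        self.delay = delay
        self.fd = None

    def acquire(self):
        deadline = time.monotonic() + self.timeout
        while True:
            try:
                # O_EXCL makes creation atomic: it fails if the file already exists.
                self.fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_RDWR)
                return
            except FileExistsError:
                # Someone else holds the lock, or a stale lock file was left behind.
                if time.monotonic() >= deadline:
                    raise LockTimeout(f"could not acquire {self.path}")
                time.sleep(self.delay)

    def release(self):
        if self.fd is not None:
            os.close(self.fd)
            os.unlink(self.path)
            self.fd = None

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, *exc):
        self.release()
```

With a long or unlimited timeout, every process that needs the lock keeps polling while another process holds it or while a leftover lock file exists, which from the outside looks like a service that is running but never finishes starting.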

maikenp commented 1 year ago

Ah right, thanks. I had in fact omitted the conda installation because I had issues with it. I will try the conda_auto_init setting, thanks!

That did it!

maikenp commented 1 year ago

Thanks a lot!