ARTbio / GalaxyKickStart

Ansible playbooks for Galaxy Server deployment
GNU General Public License v3.0

upgrade change set to release_16.07 #196

Closed drosofff closed 8 years ago

drosofff commented 8 years ago
mvdbeek commented 8 years ago

Nice, 👍 if the tests pass.

drosofff commented 8 years ago

@mvdbeek with the new conda and conda_ dependencies folders, do we need to add anything to the playbook for the cases where we move the data (galaxy_persistent_directory variable)? I don't know whether it's related, but I am currently testing release_16.07 with galaxy_persistent_directory: /root/mydisk on an IFB instance, and it fails at the end of the process:

TASK [galaxy.movedata : Start supervisor tasks after moving has finished.] *****
failed: [localhost] (item=galaxy:) => {"failed": true, "item": "galaxy:", "msg": "galaxy:galaxy_web: ERROR (abnormal termination)\n"}
changed: [localhost] => (item=proftpd)
ok: [localhost] => (item=postgresql)

NO MORE HOSTS LEFT *************************************************************
 [WARNING]: Could not create retry file 'galaxy.retry'.         [Errno 2] No such file or directory: ''

PLAY RECAP *********************************************************************
localhost                  : ok=99   changed=24   unreachable=0    failed=1

If I leave the variable set to /export, it works.

I have not tested changing /export for a while, so I don't know whether the bug appeared recently or not.

mvdbeek commented 8 years ago

I think there is a database migration involved when upgrading from 16.04 to 16.07. Can you check what's in uwsgi.log? There should be no need to take any action regarding conda, since the default conda path is in <tool_dependencies>/_conda, which we are already exporting.

drosofff commented 8 years ago

Hmm, tail -f uwsgi.log loops and complains:

migrate.versioning.script.base DEBUG 2016-09-16 10:07:18,131 Loading script lib/tool_shed/galaxy_install/migrate/versions/0012_tools.py...
migrate.versioning.script.base DEBUG 2016-09-16 10:07:18,131 Script lib/tool_shed/galaxy_install/migrate/versions/0012_tools.py loaded successfully
migrate.versioning.repository DEBUG 2016-09-16 10:07:18,132 Repository lib/tool_shed/galaxy_install/migrate loaded successfully
migrate.versioning.repository DEBUG 2016-09-16 10:07:18,132 Config: OrderedDict([('db_settings', OrderedDict([('__name__', 'db_settings'), ('repository_id', 'GalaxyTools'), ('version_table', 'migrate_tools'), ('required_dbs', '[]')]))])
Traceback (most recent call last):
  File "lib/galaxy/webapps/galaxy/buildapp.py", line 55, in paste_app_factory
    app = galaxy.app.UniverseApplication( global_conf=global_conf, **kwargs )
  File "lib/galaxy/app.py", line 64, in __init__
    self._configure_models( check_migrate_databases=True, check_migrate_tools=check_migrate_tools, config_file=config_file )
  File "lib/galaxy/config.py", line 921, in _configure_models
    verify_tools( self, install_db_url, config_file, install_database_options )
  File "lib/tool_shed/galaxy_install/migrate/check.py", line 43, in verify_tools
    latest_tool_migration_script_number )
  File "lib/tool_shed/util/common_util.py", line 34, in check_for_missing_tools
    tool_shed_url = get_tool_shed_url_from_tool_shed_registry( app, tool_shed )
  File "lib/tool_shed/util/common_util.py", line 236, in get_tool_shed_url_from_tool_shed_registry
    for shed_url in app.tool_shed_registry.tool_sheds.values():
AttributeError: 'NoneType' object has no attribute 'tool_sheds'
Fri Sep 16 10:07:24 2016 - *** Starting uWSGI 2.0.13.1 (64bit) on [Fri Sep 16 10:07:24 2016] ***
Fri Sep 16 10:07:24 2016 - compiled with version: 4.8.4 on 16 September 2016 10:05:45
Fri Sep 16 10:07:24 2016 - os: Linux-3.13.0-65-generic #106-Ubuntu SMP Fri Oct 2 22:08:27 UTC 2015
Fri Sep 16 10:07:24 2016 - nodename: vm0062
Fri Sep 16 10:07:24 2016 - machine: x86_64
Fri Sep 16 10:07:24 2016 - clock source: unix
Fri Sep 16 10:07:24 2016 - detected number of CPU cores: 4
Fri Sep 16 10:07:24 2016 - current working directory: /home/galaxy/galaxy
Fri Sep 16 10:07:24 2016 - detected binary path: /home/galaxy/galaxy/.venv/bin/uwsgi
Fri Sep 16 10:07:24 2016 - !!! no internal routing support, rebuild with pcre support !!!
Fri Sep 16 10:07:24 2016 - your processes number limit is 63647
Fri Sep 16 10:07:24 2016 - your memory page size is 4096 bytes
Fri Sep 16 10:07:24 2016 - detected max file descriptor number: 1024
Fri Sep 16 10:07:24 2016 - lock engine: pthread robust mutexes
Fri Sep 16 10:07:24 2016 - thunder lock: disabled (you can enable it with --thunder-lock)
Fri Sep 16 10:07:24 2016 - uwsgi socket 0 bound to TCP address 127.0.0.1:4001 fd 3
Fri Sep 16 10:07:24 2016 - Python version: 2.7.6 (default, Jun 22 2015, 18:01:27)  [GCC 4.8.2]
Fri Sep 16 10:07:24 2016 - Set PythonHome to /home/galaxy/galaxy/.venv
Fri Sep 16 10:07:24 2016 - Python main interpreter initialized at 0x22e5390
Fri Sep 16 10:07:24 2016 - python threads support enabled
Fri Sep 16 10:07:24 2016 - your server socket listen backlog is limited to 100 connections
Fri Sep 16 10:07:24 2016 - your mercy for graceful operations on workers is 60 seconds
Fri Sep 16 10:07:24 2016 - mapped 166144 bytes (162 KB) for 2 cores
Fri Sep 16 10:07:24 2016 - *** Operational MODE: threaded ***
Fri Sep 16 10:07:24 2016 - added lib/ to pythonpath.
Fri Sep 16 10:07:24 2016 - Loading paste environment: config:/home/galaxy/galaxy/config/galaxy.ini
DEBUG:galaxy.app:python path is: lib/, ., , /home/galaxy/galaxy/.venv/lib/python2.7, /home/galaxy/galaxy/.venv/lib/python2.7/plat-x86_64-linux-gnu, /home/galaxy/galaxy/.venv/lib/python2.7/lib-tk, /home/galaxy/galaxy/.venv/lib/python2.7/lib-old, /home/galaxy/galaxy/.venv/lib/python2.7/lib-dynload, /usr/lib/python2.7, /usr/lib/python2.7/plat-x86_64-linux-gnu, /usr/lib/python2.7/lib-tk, /home/galaxy/galaxy/.venv/local/lib/python2.7/site-packages
INFO:galaxy.config:Logging at '10' level to 'stdout'
galaxy.queue_worker INFO 2016-09-16 10:07:26,254 Initializing main Galaxy Queue Worker on sqlalchemy+postgresql://galaxy:********@localhost:5432/galaxy?client_encoding=utf8
galaxy.app DEBUG 2016-09-16 10:07:26,276 Using "galaxy.ini" config file: /home/galaxy/galaxy/config/galaxy.ini
migrate.versioning.repository DEBUG 2016-09-16 10:07:26,310 Loading repository lib/galaxy/model/migrate...

mvdbeek commented 8 years ago

Ah, this may be because the old instance doesn't reference the new location of tool_sheds_conf.xml. In any case, AttributeError: 'NoneType' object has no attribute 'tool_sheds' means something is wrong with the tool_sheds_conf.xml.
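A quick way to test that diagnosis is to check whether the file exists, parses, and actually lists tool sheds. The sketch below is only an illustration, not Galaxy code: the function name `list_tool_sheds` is hypothetical, and it assumes the usual layout of a `<tool_sheds>` root element containing `<tool_shed name="..." url="..."/>` children. It is written for Python 3, whereas the instance in the log runs Python 2.7.

```python
import os
import xml.etree.ElementTree as ET


def list_tool_sheds(conf_path):
    """Return {name: url} for each <tool_shed> entry in conf_path.

    Hypothetical helper: raises a descriptive error instead of letting a
    missing or empty file surface later as an AttributeError on None.
    """
    if not os.path.isfile(conf_path):
        raise FileNotFoundError("tool sheds config not found: %s" % conf_path)
    # ET.parse raises xml.etree.ElementTree.ParseError on malformed XML.
    root = ET.parse(conf_path).getroot()
    sheds = {}
    for elem in root.findall("tool_shed"):
        sheds[elem.get("name")] = elem.get("url")
    if not sheds:
        raise ValueError("no <tool_shed> entries in %s" % conf_path)
    return sheds
```

Calling `list_tool_sheds("/home/galaxy/galaxy/config/tool_sheds_conf.xml")` on the affected machine would tell a missing file apart from an empty or malformed one.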

drosofff commented 8 years ago

> Ah, this may be because the old instance doesn't reference the new location of tool_sheds_conf.xml.

Hmm... I am using brand-new IFB instances.

> In any case AttributeError: 'NoneType' object has no attribute 'tool_sheds' means something is wrong with the tool_sheds_conf.xml.

Ugh! It used to work yesterday in my tests without moving the data. I am sad.

mvdbeek commented 8 years ago

OK, I think for now we can handle this with an additional task during the import. That task will check if tool_sheds_config_file exists in the files to be imported, and if it does not exist will copy over the file to the exported path before importing. Can I push to this branch?
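The logic of that task can be sketched in Python as follows. This is an illustration of the check-then-copy idea, not the Ansible task that was pushed: the function name `ensure_exported_config` is hypothetical, and the sketch assumes that the exported tree mirrors the absolute source path underneath the persistent directory.

```python
import os
import shutil


def ensure_exported_config(source_path, export_root):
    """Copy source_path into export_root if it is not already exported.

    Assumption: the exported copy lives at export_root + absolute source path.
    Returns True when a copy was made, False when the file was already there.
    """
    target = os.path.join(export_root, source_path.lstrip(os.sep))
    if os.path.exists(target):
        # Never overwrite a config file that was exported earlier.
        return False
    # Create the parent directories first so the copy cannot fail on them.
    os.makedirs(os.path.dirname(target), exist_ok=True)
    shutil.copy2(source_path, target)
    return True
```

Skipping the copy when the target exists keeps the step idempotent, so re-running the playbook leaves a previously exported (and possibly edited) file untouched.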

drosofff commented 8 years ago

> Can I push to this branch?

Sure!

mvdbeek commented 8 years ago

@drosofff I have tested this, but not with an old instance. Seems to work. If you just re-run the updated playbook it should pass where it failed previously.

drosofff commented 8 years ago

Does not seem to work yet:

TASK [galaxy.movedata : Copy updated config files to export if they don't exist] ***
fatal: [localhost]: FAILED! => {"changed": false, "cmd": "'cp /home/galaxy/galaxy/config/tool_sheds_conf.xml\n/root/mydisk//home/galaxy/galaxy/config/tool_sheds_conf.xml'", "failed": true, "msg": "[Errno 2] No such file or directory", "rc": 2}
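
One plausible reading of this error (an assumption, not confirmed in the thread) is that the whole command line, including the embedded newline, was handed over as a single quoted word, so the system looked for an executable literally named `cp /home/...` and found none. A missing target directory under /root/mydisk would be another candidate. The Python sketch below reproduces the first effect with `subprocess`; both function names are hypothetical.

```python
import subprocess


def run_copy_as_single_string(src, dst):
    """Pass the whole command line as ONE argv element, without a shell.

    The program name becomes 'cp <src>\\n<dst>', which does not exist, so
    this raises FileNotFoundError ([Errno 2] No such file or directory).
    """
    return subprocess.run(["cp %s\n%s" % (src, dst)])


def run_copy_as_argv(src, dst):
    """Pass the program and its arguments as separate argv elements."""
    return subprocess.run(["cp", src, dst], check=True)
```

Under that reading, splitting the command into separate arguments, or removing the stray newline and quoting, would be the first thing to try.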

drosofff commented 8 years ago

After manually fixing that in galaxy, there is still:

galaxy.tools.deps DEBUG 2016-09-16 14:51:23,343 Unable to find config file './dependency_resolvers_conf.xml'

mvdbeek commented 8 years ago

> galaxy.tools.deps DEBUG 2016-09-16 14:51:23,343 Unable to find config file './dependency_resolvers_conf.xml'

This is normal, and it is supposed to be an info message: https://github.com/galaxyproject/galaxy/pull/2801#issuecomment-240203913

drosofff commented 8 years ago

Yes, I saw the link this morning... but as far as I can see it's not only an info message; it blocks Galaxy from starting.

mvdbeek commented 8 years ago

> Yes, I saw the link this morning... but as far as I can see it's not only an info message; it blocks Galaxy from starting.

No, this is when conda is getting installed, which takes some time. But if you abort the installation, you will have a lock file lying around that needs to be deleted first. (conda.lock, in the tool_dependencies dir)
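Clearing that leftover lock can be sketched as below. The file name conda.lock and its location in the tool_dependencies directory come from the comment above; the function name `remove_stale_conda_lock` is hypothetical. The sketch assumes that no conda installation is still running, because removing the lock during a live install would defeat its purpose.

```python
import os


def remove_stale_conda_lock(tool_dependency_dir, lock_name="conda.lock"):
    """Delete a leftover lock file from an aborted conda installation.

    Returns True if a lock file was removed, False if none was present.
    Only call this when no conda installation is in progress.
    """
    lock_path = os.path.join(tool_dependency_dir, lock_name)
    try:
        os.remove(lock_path)
    except FileNotFoundError:
        return False
    return True
```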

drosofff commented 8 years ago

I still have this:

==> /var/log/supervisor/handler0-stderr---supervisor-ba9how.log <==
    self._configure_models( check_migrate_databases=True, check_migrate_tools=check_migrate_tools, config_file=config_file )
  File "/home/galaxy/galaxy/lib/galaxy/config.py", line 921, in _configure_models
    verify_tools( self, install_db_url, config_file, install_database_options )
  File "/home/galaxy/galaxy/lib/tool_shed/galaxy_install/migrate/check.py", line 43, in verify_tools
    latest_tool_migration_script_number )
  File "/home/galaxy/galaxy/lib/tool_shed/util/common_util.py", line 34, in check_for_missing_tools
    tool_shed_url = get_tool_shed_url_from_tool_shed_registry( app, tool_shed )
  File "/home/galaxy/galaxy/lib/tool_shed/util/common_util.py", line 236, in get_tool_shed_url_from_tool_shed_registry
    for shed_url in app.tool_shed_registry.tool_sheds.values():
AttributeError: 'NoneType' object has no attribute 'tool_sheds'

The tool_sheds_conf.xml file does not install properly (even without changing /export).

mvdbeek commented 8 years ago

What changeset are you on? Can you post the Ansible log?

mvdbeek commented 8 years ago

The build is still going to fail with the permissions problem of the conda role, I'll continue tomorrow.