Closed by natefoo 1 year ago
In the "Galaxy installation with Ansible" tutorial, the following part is duplicated:
Galaxy is now configured with an admin user, a database, and a place to store data. Additionally we’ve immediately configured the mules for production Galaxy serving. So we’re ready to set up supervisord which will manage the Galaxy processes!
Hands-on: (Optional) Launching uWSGI by hand
SSH into your server
Switch user to Galaxy account (sudo -iu galaxy)
Change directory into /srv/galaxy/server
Activate virtualenv (. ../venv/bin/activate)
uwsgi --yaml ../config/galaxy.yml
Access at port <ip address>:8080 once the server has started
@lldelisle thanks!
@lldelisle That was fixed already in #1810
validate job xml etc against the definition
In https://training.galaxyproject.org/training-material/topics/admin/tutorials/connect-to-compute-cluster/tutorial.html#a-dynamic-destination: use a different name for the group ID.
@lldelisle thanks, we added this as "Stop re-using IDs between sections (aka don't use the same values for runner IDs, destination IDs, job resource IDs, etc.)"
Writing in my own comment, lest any updates conflict or be overwritten
typo in https://galaxyproject.github.io//training-material/topics/admin/tutorials/pulsar/tutorial.html#testing-pulsar
journalctcl -fu galaxy
instead of journalctl -fu galaxy
Thanks, will be fixed by https://github.com/galaxyproject/training-material/pull/1822
Connect to compute: citing from the hands-on tutorial:
if the folder does not exist, create files/galaxy/config next to your playbook.yml (mkdir -p files/galaxy/config/)
The playbook name should probably change to galaxy.yml, since other tutorials reference it.
Thanks @ondrejme!
Change the short help of the local gxadmin functions:
https://training.galaxyproject.org/training-material/topics/admin/tutorials/gxadmin/tutorial.html
local_hello() { ## hello: Says hi
-> local_hello() { ## : Says hi
local_query-latest() { ## query-latest [jobs|10]: Queries latest N jobs (default to 10)
-> local_query-latest() { ## [jobs|10]: Queries latest N jobs (default to 10)
"Invalid username or password" when grafana starts, maybe due to: grafana_url: "https:///grafana/"
in https://training.galaxyproject.org/training-material/topics/admin/tutorials/monitoring/tutorial.html
@ondrejme Thanks, it will be addressed by https://github.com/galaxyproject/training-material/pull/1829 .
In https://training.galaxyproject.org/training-material/topics/admin/tutorials/ansible-galaxy/tutorial.html#postgresql
At the beginning of the tutorial (when setting up PostgreSQL) we had in group_vars/galaxyservers.yml:
# Python 3 support
pip_virtualenv_command: /usr/bin/python3 -m virtualenv # usegalaxy_eu.certbot, usegalaxy_eu.tiaas2, galaxyproject.galaxy
certbot_virtualenv_package_name: python3-virtualenv # usegalaxy_eu.certbot
pip_package: python3-pip # geerlingguy.pip
Then, when we set galaxy_config and uwsgi, the solution shows something which begins with:
# python3 support
pip_virtualenv_command: virtualenv
I guess this is not expected...
In the same solution, it is written:
galaxy_user: {name: galaxy, shell: /bin/bash, home: "{{ galaxy_root }}"}
Whereas in the table above it is written:
{name: galaxy, shell: /bin/bash}
home: "{{ galaxy_root }}"}
Wow, @lldelisle you found it. It looks like I added it, a long time ago. I really don't know how that happened. Ok, amazing, thank you. We will make sure those snippets are in sync in the future.
I found a journalctf -u galaxy -f
instead of journalctl -u galaxy -f
in https://training.galaxyproject.org/training-material/topics/admin/tutorials/tiaas/tutorial.html#setting-up-tiaas
Fixed already in https://github.com/galaxyproject/training-material/pull/1836 , thanks for reporting anyway!
gxit - leading spaces in paste
https://github.com/galaxyproject/training-material/pull/1842
Hands-on: Enabling Interactive Tools in Galaxy
Step 3:
I would suggest changing the order of "id" and "destination" in
Step 4:
interactivetools_enable: "True"
Remove the quotation marks and make the capital letter lowercase.
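A minimal sketch of the suggested Step 4 value. The nesting under galaxy_config and the file location are assumptions based on the group vars layout used elsewhere in these tutorials, not a copy of the tutorial's snippet:

```yaml
# group_vars/galaxyservers.yml (assumed location; excerpt only)
galaxy_config:
  galaxy:
    # Unquoted, lowercase YAML boolean instead of the string "True"
    interactivetools_enable: true
```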
In https://training.galaxyproject.org/training-material/topics/admin/tutorials/ansible-galaxy/tutorial.html
If you do not want to use SSL, I guess you also need to change templates/nginx/galaxy.j2
because:
# Listen on port 443
listen *:443 ssl default_server;
Will not work, right?
@lldelisle If you changed this to listen *:80 default_server; you should also move this template from nginx_ssl_servers to nginx_servers, remove redirect-ssl from nginx_servers, and comment out nginx_ssl_role. You would also need to remove /etc/nginx/sites-enabled/redirect-ssl. You could do this with a pre_task like:
- name: Remove redirect-ssl config
  file:
    path: /etc/nginx/sites-enabled/redirect-ssl
    state: absent
Many thanks... So the only thing missing in the training material is: change
# Listen on port 443
listen *:443 ssl default_server;
to
# Listen on port 80
listen *:80 default_server;
If you ran the playbook once with redirect-ssl before deciding not to use SSL, remove the file /etc/nginx/sites-enabled/redirect-ssl.
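The variable changes natefoo describes for a non-SSL setup might look roughly like this. This is a sketch only: the file location and the usegalaxy_eu.certbot value for nginx_ssl_role are assumptions, and the rest of the nginx configuration is omitted:

```yaml
# group_vars/galaxyservers.yml (assumed location; excerpt only)
nginx_servers:
  - galaxy              # moved here from nginx_ssl_servers; redirect-ssl removed
# nginx_ssl_servers:    # no SSL servers in this setup
#   - galaxy
# nginx_ssl_role: usegalaxy_eu.certbot   # commented out; role name is an assumption
```

The galaxy template itself would still need the listen *:80 default_server; change discussed above.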
In https://training.galaxyproject.org/training-material/topics/admin/tutorials/connect-to-compute-cluster/tutorial.html you wrote: "Add a post_task to your playbook to install slurm-drmaa1 (Debian/Ubuntu) or slurm-drmaa (RedHat/CentOS), and additionally include the galaxyproject.repos role". Then maybe you could use:
post_tasks:
  - name: Install slurm-drmaa1 if Debian
    package:
      name: slurm-drmaa1
    when: ansible_os_family == "Debian"
  - name: Install slurm-drmaa if RedHat
    package:
      name: slurm-drmaa
    when: ansible_os_family == "RedHat"
(If I understood well...)
To myself: ansible_python.version.major
Combination of statements and opinions from @natefoo @Slugger70 @mvdbeek @nsoranzo @hexylena and @shiltemann, synthesized into one summary/todo list.
This training was fantastic! And incredibly strange, things worked! Like flawlessly nearly. We got through 5 days of content in 3. We had to come up with an extra 2 days.
A notable difference this time was how many students tried to run the playbooks immediately on their own infrastructure, either from the start on their own VMs, or after class on their own infra. Despite asking everyone to run it on the VM, we also had a couple of people brave enough to run from their own laptop, mostly without issue.
All around great set of participants! But it led us to focus on areas where we need to improve the materials.
From @natefoo:
an idea I had: two column design on the tutorials where one column is the things you do in ansible and the other column is the effects it has on the system
this latest training went well but at times it felt very black-boxish, "just run these things and voila!"
For something like the ansible tutorial we could show a
$ cat /tmp/test.txt
some contents
In something like the galaxy tutorial we'd show all the changes to the system that each step makes. I'd say something like the latest commit on the release_XX.YY branch has been cloned to /srv/galaxy/server
In order to reduce how much it needs to be updated, we will just use this in the first two trainings where we need to show this effect (ansible, ansible-galaxy).
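As a rough illustration of the two-column idea for the ansible tutorial, the "what you do" side could be a task like the one below, with the "effect on the system" side shown as the cat output quoted above. The task name is hypothetical; only the file path and contents come from the example in this comment:

```yaml
# Left column: what you do in Ansible (hypothetical task)
- name: Write a test file
  copy:
    content: "some contents\n"
    dest: /tmp/test.txt

# Right column: the effect on the system
#   $ cat /tmp/test.txt
#   some contents
```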
The students can then see the differences Ansible is making and gain the understanding to help them troubleshoot, as things never always "just work", especially when running on varied or outsourced hardware, with the large variety of quality of tools etc.
We noted that a few students had issues with how ansible really works: variables being set in different places, which changes have which effects. So we're considering adding "real" exercises or hiding the answers a bit more for some of the ones we already have.
It's a tough balance to strike. For most of the questions & answers in ansible-galaxy, they're awful, they ask "how does your final config look" and everyone just copies that. Maybe we should rewrite them as "Here is the config." and ask better questions??? "what does this do?" "what effect will that have?"
We should show the students Ansible Best Practices at some point? Before the training? Or after the 1st day? https://docs.ansible.com/ansible/latest/user_guide/playbooks_best_practices.html
And we should consider developing "Ansible - advanced" or an ansible "exam" (CTF?) for the students, saying "ok, now that you know ansible, accomplish these tasks"
I also think that sometimes "just re-run the playbook" isn't enough. Figuring out why something has changed can sometimes be more important for the big picture than how to do it. (If that makes sense.)
I think there's a continuum: at one end is "galaxy of a few years ago where people needed to be programmers/tool devs/admins together, and we needed to teach everything in detail so they could debug" and at the other end is "galaxy (of the current/future) where things mostly just work, and they can just deploy it and not care too much since the documentation / tutorials cover all of the main points, and they don't resort to low level debugging".
If we're really moving to the "just works" end, maybe we remove that detail from the curriculum because it doesn't benefit students vs a higher level picture.
I think if they're gonna go back and not use ansible it's good to show "here's what this production deployment looks like" so they can adapt it for their own purposes
We sympathise with "ought to get an in-depth understanding", but:
It's two sides of a coin... people coming to a week long training probably ought to come away with a pretty low level understanding - but we've also found that it's really difficult to teach that low level understanding, especially to folks who mostly aren't sysadmins.
Which leads us to the next question:
What should students come away from GAT knowing how to do?
everything else is less important?
We should include more on the splitting of roles amongst machines, and write them in a way they can be used as-is. E.g. transitioning from ident auth to network auth is complex (see next aside). A number of participants tried deploying the playbook on their own systems toward the end of the week and some struggled with getting the proper DB configuration.
So: db on a separate server as an example, and how to set up Ansible to do things like that. And talk about production setups for a large user base in detail. The benefits of automation for larger setups and some examples of tool maintenance etc.
There are now, I think, two different places in the tutorials where we say "if we were really doing best practices we'd create a new group and put vars in a different group vars file"; maybe we should just do that.
I'd see the following splitting for the whole week:
In ansible-galaxy, only one split, db + galaxy; that sounds manageable. And it is a good place to introduce this concept of "here is where you can divide your infrastructure".
let's bind to 127, and use md5, and make everyone use passwords. I think that would be a positive change over ident magic. (I mean, I love ident, but, it's difficult to switch / not obvious for students)
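A sketch of what "bind to 127, use md5, use passwords" could look like in group vars. Everything here is an assumption for illustration: the variable names postgresql_conf and postgresql_pg_hba_conf are taken to be those of the galaxyproject.postgresql role (check the role README, since the expected shape may be a dictionary rather than a list depending on the version), the dbservers group file is hypothetical, and vault_db_password is a hypothetical vaulted variable:

```yaml
# group_vars/dbservers.yml (hypothetical group for the database host)
postgresql_conf:
  - listen_addresses: "'127.0.0.1'"   # listen on loopback only
postgresql_pg_hba_conf:
  - host all all 127.0.0.1/32 md5     # password (md5) auth over TCP instead of ident

# group_vars/galaxyservers.yml (excerpt only)
galaxy_config:
  galaxy:
    # vault_db_password is a hypothetical vaulted variable
    database_connection: "postgresql://galaxy:{{ vault_db_password }}@127.0.0.1:5432/galaxy"
```

With the database on a separate server, the loopback address would be swapped for the database host's address on both sides, which is the kind of split discussed above.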
WIP implementation of the side-by-side discussed during admin debriefing
@annefou this might be interesting for you, too! Do you have any feedback on this? Authors have the choice of
- CVMFS/ref data
- Make proper tutorial of this
https://github.com/galaxyproject/training-material/pull/3778
In general I think enough of this is done to finally close it out.
An issue for collecting things we notice during the 2020 Galaxy Admin Training in Barcelona that need to be fixed
- files/ on dir tree on "roles" slide
- .j2 suffix is used to indicate that this is a template file in Jinja 2 format. After filling the template with the variable values, we copy the file to its remote destination without the .j2 suffix
- package module's name option rather than with_items
- loop instead of with_items
- geerlingguy Ansible Galaxy roles, we should also recommend galaxyproject and usegalaxy_eu.
- when: service_conf.changed should be when: service_conf is changed
- group_vars/galaxy.yml for storing all of the Galaxy configuration." - should be group_vars/galaxyservers.yml.
- galaxy_config_style should default to yaml in the role
- galaxy.yml should not be world readable (but to change this, the config needs to be readable by group galaxy)
- admin_users a list, which doesn't work (I thought we ran the value of this through util.listify() but maybe not?)
- /data
- cleanup_job: never and looking in /srv/galaxy/jobs (or maybe set it to onsuccess and run a job that would surely fail, e.g. due to missing dependencies)
- integrated_tool_panel.xml after or into "Toolpanel management" slide
- library_import_dir is here)
- galaxy_config booleans as strings (i.e. "True", "False")??
- --ntasks=1 --cpus-per-task=4 rather than --ntasks=4
- map_resources.py: https://gist.github.com/natefoo/bbcfc162fad83cbc31bc98d82dbfd1c8
- Use standalone vars for DTD config and job resource param file paths (as is done with job config file path) and rearrange these copy boxes so they're in the same order as the job config file one (actually, a fenced diff block here is probably preferable so you can see that you're adding to existing vars; should do this across tutorials modifying group vars, including interactive tools, CVMFS)
- see "Decide how to handle..."
- galaxy_shed_tools_dir, galaxy_tool_dependency_dir, galaxy_file_path, galaxy_job_working_directory, galaxy_server_dir, galaxy_venv_dir. We should probably update the Installing tutorial to put these all on some distinct path (e.g. /data, but rename to /clusterFS or something). And maybe there should be a layout in galaxyproject.galaxy that does this.
- files directory. Given that Binder seems to clone the entire GitHub repository, it seems better to keep the notebooks in a separate small repo.
- max_percent_full
- job_conf.xml.sample_advanced
- split_logging config var that automatically sets up filename_template logging as described in advanced logging configuration
- job_runner_name with (or add column) handler in gxadmin query job-info
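Two of the Ansible items in the list above (passing a list to the package module's name option instead of looping with with_items, and testing a registered result with "is changed") can be sketched as follows. The package names, file paths and service name are hypothetical placeholders, not taken from the slides or tutorials:

```yaml
# Hypothetical tasks; names, paths and the service are placeholders
- name: Install several packages in one call
  package:
    name:
      - git
      - make
    state: present

- name: Copy a service config
  copy:
    src: service.conf           # hypothetical file
    dest: /etc/service.conf     # hypothetical path
  register: service_conf

- name: Restart the service only if the config changed
  service:
    name: myservice             # hypothetical service name
    state: restarted
  when: service_conf is changed # test syntax, rather than service_conf.changed
```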
Not admin-related: