Closed hexylena closed 2 years ago
More of an ansible-galaxy thing, but datasets should be stored by uuid, not id, by default.
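If this refers to Galaxy's `object_store_store_by` option (available in recent Galaxy releases), a sketch of what the group vars change might look like, using the training's `galaxy_config` layout:

```yaml
# group_vars/galaxyservers.yml (sketch)
galaxy_config:
  galaxy:
    # store dataset files on disk under their UUID instead of their numeric id
    object_store_store_by: uuid
```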
We should use vault and set a secret-id for the rest of the training, not just day1
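Assuming "secret-id" here means Galaxy's `id_secret` config option, a vaulted setup might look like this sketch (file and variable names are illustrative, not what the training currently uses):

```yaml
# group_vars/secret.yml — encrypted with: ansible-vault create group_vars/secret.yml
vault_id_secret: some-long-random-string

# group_vars/galaxyservers.yml — reference the vaulted value
galaxy_config:
  galaxy:
    id_secret: "{{ vault_id_secret }}"
```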
From https://github.com/galaxyproject/training-material/issues/989 :
- `intro` directory, I had to rsync my new `~/roles` folder to `intro`
- `templates/galaxy/config/job_metrics_conf.xml.j2`
The gantsign.golang role uses the deprecated `sha256sum` parameter, which will be removed in ansible-base 2.14 (use `checksum` instead).
I thought the spaces before `+` were indentation.
yeah we def need a copy+paste view of the diff. I've got some JS that does an OK job, just need to make the presentation better. Then we'll have a button to switch between "the real diff" vs "here are the lines you need to add."
add validation for xml files pre-restart.
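One way to sketch this, assuming `xmllint` is available on the target (paths follow the training's conventions but are illustrative):

```yaml
# Validate the rendered XML before it replaces the live file; the task
# fails instead of restarting Galaxy with a broken config.
- name: Deploy job_conf.xml
  template:
    src: templates/galaxy/config/job_conf.xml.j2
    dest: "{{ galaxy_config_dir }}/job_conf.xml"
    validate: xmllint --noout %s
  notify: restart galaxy
```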
specifying that all commands (including ansible-galaxy) should be run in the intro directory, I had to rsync my new ~/roles folder to intro
fixed by using diffs everywhere I guess.
I was confused at first by the "service" service. More real, less abstract examples would be clearer, IMO
fair, but, it's also just to learn about ansible. not sure.
templates/nginx/galaxy.j2 -> "uwsgi_pass 127.0.0.1:8080" should not be configured statically; it should be a variable from the group_vars, so that it follows along if the port is changed in the uwsgi variable settings
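The suggestion could look like this sketch in `templates/nginx/galaxy.j2` (`uwsgi_port` is an illustrative variable name, not one the training currently defines):

```
location / {
    # port comes from group_vars instead of being hard-coded
    uwsgi_pass 127.0.0.1:{{ uwsgi_port }};
    include uwsgi_params;
}
```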
I think we want to integrate the systemd role into the galaxy role (which we should've done a while ago.) then this step will be skipped completely and simply not possible to do, which will resolve the many, many issues people have running uwsgi by hand (ports, permissions, etc)
note about using a non-Let's Encrypt certificate
We can link to https://github.com/galaxyproject/ansible-nginx#ssl-configuration in a box, but I don't think it's our responsibility to say much more?
specifying that all commands (including ansible-galaxy) should be run in the intro directory, I had to rsync my new ~/roles folder to intro
fixed by using diffs everywhere I guess.
I think this is more a matter of adding some `cd ~/intro/` or something like that?
I'm coming at this as a non-Galaxy user, so jumping straight into the interface was initially a bit confusing; a quick video tour of the Galaxy interface (~5 minutes) beforehand would have made this easier for me
We could add https://training.galaxyproject.org/training-material/topics/introduction/tutorials/galaxy-intro-short/tutorial.html to the program (when running the 5-days course).
:+1:, they should learn to be users themselves. I think for physical courses we've mostly filtered for people who are already running a small galaxy, but, online we have many more new people.
Once you cd into the directory, autofs will automatically mount the repository and files will be listed.
That was confusing: it's in the solution box but refers to the step after it; we should reword.
https://github.com/galaxyproject/training-material/pull/2241 changed the ephemeris tutorial to install pilon instead of bwa, but bwa is used in the following CVMFS tutorial. Revert? In the meantime, I'm going to add instructions to install bwa to the CVMFS tutorial.
@nsoranzo it looks like the pulsar tutorial also assumes bwa is installed [Updated] The pulsar tutorial already has instructions for installing bwa
@cat-bro Is it fine to switch back to bwa in the ephemeris tutorial?
https://training.galaxyproject.org/training-material/topics/admin/tutorials/data-library/tutorial.html#from-history has the wrong screenshot; needs a tip about "if you use a different email"
@nsoranzo I don't know. Given that half of the students will have already done this tutorial, it might be more confusing to revert it at this point? The half that haven't would be left with a video tut that is different from the document.
Either way, a tip for installing BWA is probably needed in both cvmfs and pulsar tutorials
We can do a tip now (maybe make it a snippet that we can generically include in both) and then decide later on one or the other maybe?
remove galaxy_zergpool_listen_addr from training
@hexylena @nsoranzo: re bwa (1) it seems to be ok following the cvmfs tutorial text, since the direction is to look at BWA or bowtie2, whichever is installed, and they do have bowtie2 (installed from the workflow tool list in the ephemeris tutorial). (2) the pulsar tutorial already has instructions for installing bwa.
@hexylena @nsoranzo: re bwa (1) it seems to be ok following the cvmfs tutorial text, since the direction is to look at BWA or bowtie2, whichever is installed, and they do have bowtie2 (installed from the workflow tool list in the ephemeris tutorial).
True, but the CVMFS hands-on at point 4. says "Login to Galaxy as the admin user, and go to Admin → Data Tables → bwa_mem indexes" which doesn't make sense if you run bowtie2.
(2) the pulsar tutorial already has instructions for installing bwa.
Yes, I'm making a snippet out of that.
True, but the CVMFS hands-on at point 4. says "Login to Galaxy as the admin user, and go to Admin → Data Tables → bwa_mem indexes" which doesn't make sense if you run bowtie2.
This step does work without bwa being installed.
From an attendee:
I have found something confusing in the tutorial https://training.galaxyproject.org/training-material/topics/admin/tutorials/cvmfs/tutorial.html. At "Hands-on: Installing CVMFS with Ansible", step 3 ("Edit the group variables file, group_vars/galaxyservers.yml") it says that these variables can be included in group_vars/all.yml, so I am not sure if I need to edit anything in group_vars/galaxyservers.yml.
The complete sentence is "Add the following lines to your group_vars/all.yml file, creating it if it doesn't exist", but above it says "Edit the group variables file, group_vars/galaxyservers.yml" (the latter seems wrong to me).
Running Jobs on Remote Resources with Pulsar - job_metrics_config_file option missing #2302
Add all this crap. I set up Galaxy for work and used it in anger..... I was shocked at how much was missing that I expected to be there. We should be setting many of these at some point during the week.
```yaml
# Perf
database_engine_option_server_side_cursors: true
slow_query_log_threshold: 5
enable_per_request_sql_debugging: true
nginx_x_accel_redirect_base: /_x_accel_redirect
# watchdog library too?
watch_tools: 'auto'
watch_job_rules: 'auto'
# Admin convenience
allow_path_paste: true
library_import_dir: /data/library
enable_quotas: true
cleanup_job: onerror
allow_user_deletion: true
allow_user_impersonation: true
# user convenience
show_welcome_with_login: true
expose_user_name: true
expose_dataset_path: true
expose_potentially_sensitive_job_metrics: true
# Other
outputs_to_working_directory: true
```
True, I was also comparing the outcome of the ansible-galaxy tutorial with what I have in production and noticed some of these major missing bits.
`templates/galaxy/config/object_store_conf.xml` in the Distributed Object Storage tutorial should be a `.xml.j2`.
In Running Jobs on Remote Resources with Pulsar the variable `galaxy_server_url` should be named `galaxy_server_hostname` or `galaxy_server_address` or something similar, since it's an FQDN (or IP) rather than a URL.
From the Slack:
Note for "Galaxy Monitoring with Reports". Step 4 should put the location of the reports app in the templates/nginx/galaxy.j2 file, not in group_vars/galaxyservers.yml
cgroup-tools missing from job metrics.
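For reference, Galaxy's job metrics config can enable the cgroup plugin, which relies on the `cgroup-tools` package on the compute node; a minimal sketch of `templates/galaxy/config/job_metrics_conf.xml.j2`:

```xml
<job_metrics>
  <!-- core plugin: wallclock time, cores allocated -->
  <core />
  <!-- cgroup plugin: memory/cpu usage; needs cgroup-tools installed -->
  <cgroup />
</job_metrics>
```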
tiaas
"We next need to configure this plugin in our job configuration (files/galaxy/config/job_conf.xml)": should this be templates/galaxy/config/job_conf.xml.j2 to match the rest of the training?
reports
tip box on how to secure reports
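One common approach for such a tip box would be HTTP basic auth in front of the reports location; a sketch for `templates/nginx/galaxy.j2` (the upstream port and htpasswd path are illustrative, not the training's actual values):

```
# Restrict the reports app to users in an htpasswd file
location /reports/ {
    auth_basic           "Galaxy Reports";
    auth_basic_user_file /etc/nginx/htpasswd-reports;
    uwsgi_pass           127.0.0.1:9001;
    include              uwsgi_params;
}
```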
GxIT training breaks Singularity training (the `container_resolvers_conf.xml` is invalid for Docker).
So what exactly is a "sensible" value for this? Currently I am using data managers for a select number of references. My biggest item as of now is a 32GB RNAStar index of HG38.
on my production machine I have a 100GB cache (todo: xref playbooks)
watch_tools: 'auto'
That's only needed if you dump tools into a directory and expect new tools to show up. I wouldn't enable this on a production instance.
Singularity and volume binding: when adding more object stores, it seems job_conf's `$singularity_defaults` is not populated with these new paths? Fix: add a `singularity_volumes` parameter to `job_conf.xml` to include the new data volume(s): https://github.com/galaxyproject/galaxy/blob/dev/lib/galaxy/config/sample/job_conf.xml.sample_advanced#L566 (Or make Galaxy add these automagically?)
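Based on the linked sample, the fix might look like this sketch in `job_conf.xml` (destination id and the extra path are illustrative; `$defaults` keeps Galaxy's standard bind mounts):

```xml
<destination id="singularity" runner="local">
  <param id="singularity_enabled">true</param>
  <!-- append the new object store path to the default volume binds
       (path illustrative) -->
  <param id="singularity_volumes">$defaults,/data/second_object_store:rw</param>
</destination>
```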
from @Slugger70 on email
Erlang and Rabbit have a very “interesting” relationship. Different versions of RabbitMQ are very dependent on a particular version of Erlang. I have to pin the versions to be installed in my playbooks as the defaults don’t always work.
can you share those pins back and let's get them in the playbooks so users don't have these issues? (if that's possible)
I'm happy to take on making some small changes to the ephemeris tutorial.
(1) Switch the installed tool back to bwa and choose the tool for the testing step from any of the installed tools (bwa, bowtie2, bam_filter etc).
(2) feedback comment: The flow of the tutorial feels awkward in places - you extract the workflow but then install a tool singly before going back to the extracted .yml to do a batch.
I'm wondering whether switching order of steps from what it currently is: "workflow-to-tools", "install one tool", "install workflow tools" to "install one tool", "workflow-to-tools", "install workflow tools" would help with this.
(3) feedback comments: "Not directly related to this tutorial but coming from the previous Galaxy setup tutorials, I'm left thinking - what happened to Ansible and the concept of reinstalling the entire Galaxy in one playbook?" and "It could be explained how to include these tasks in the ansible playbook (if possible) in the case of a full re-installation of Galaxy. Or maybe better separate the two steps..."
I remember feeling the same way when I first took galaxy admin courses. Now, having spent more time with galaxy I see tool management as something separate that I would not include in a set of infrastructure playbooks. Maybe to address this there could be a slide talking about the Galaxy API and how there are lots of things we want to be able to do outside of the main infrastructure setup.
Fantastic! That sounds great! I like the new ordering that's proposed
maybe add https://docs.galaxyproject.org/en/latest/admin/nginx.html#receiving-files-with-nginx too? it's better for performance
Chunked uploading fixes this for the UI. The cases where this would still be useful are scripted uploads and, if you set it up for the job files API (not sure if we documented this anywhere but the .org configs have it), Pulsar transfers. I think we would want a means for dynamically compiling the upload module before adding this as well since our nginx packages with the static module are not well maintained.
check that the file sending we added actually works
I used to have manual verification with this using wget/curl in the salt lake version of the tutorial, I'll try to dig it up and see if we can automate it.
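A sketch of what an automated version of that check could look like with the uri module (URL shape, dataset id, and API key variable are all illustrative; this just verifies a dataset download succeeds end to end through nginx):

```yaml
# Sketch: confirm dataset download (served via nginx X-Accel-Redirect)
# returns 200 through the proxy.
- name: Fetch a test dataset through the proxy
  uri:
    url: "https://{{ inventory_hostname }}/datasets/{{ test_dataset_id }}/display"
    headers:
      x-api-key: "{{ galaxy_api_key }}"
    status_code: 200
```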
Chunked uploading fixes this for the UI.
It's still not great for performance since the individual chunks still need to pass through the web handlers, the old upload module or https://github.com/pgaertig/nginx-big-upload are better for overall performance.
Chunked uploading fixes this for the UI
the performance was complete garbage at EU. Nginx would buffer it once, uwsgi would re-buffer it again to a different location, then the chunked module would reassemble. Swapping to nginx made a massive difference in web handler responsiveness during big uploads (since we were trashing our disk less too)
I think we would want a means for dynamically compiling the upload module before adding this as well since our nginx packages with the static module are not well maintained.
Oz has a local role (ubuntu-oriented) for this that we use alongside the galaxyproject.nginx role: https://github.com/usegalaxy-au/infrastructure/tree/master/roles/nginx-upload-module
@cat-bro awesome, thank you!
validation for nginx config in nginx role. @natefoo
galaxyproject/ansible-nginx#11
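Until the role grows this, a workaround sketch: validating a single vhost file with `nginx -t -c %s` does not work (it is not a complete config), so a full check before reload could be a plain task:

```yaml
# Sketch: validate the complete nginx configuration before reloading,
# so a broken vhost template fails the play instead of the reload.
- name: Check nginx configuration
  command: nginx -t
  changed_when: false
```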
The version of rabbit that we use needs to be set to 3.8.16
We need to override the version of rabbitmq that we install to the latest one (3.8.16) that matches the new default erlang (24.x) install in Ubuntu 20.04. Otherwise any `apt update && apt upgrade` will install the latest erlang no matter which version we have pinned in the playbook.
I will add the version to the Pulsar tutorial.
See: https://www.rabbitmq.com/which-erlang.html for details.
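One way to sketch the pin on the apt side, so upgrades cannot pull in an unsupported Erlang (the exact version pattern is illustrative; check the compatibility page above for the real constraint):

```
# /etc/apt/preferences.d/erlang — hold Erlang at a version
# RabbitMQ 3.8.16 supports, even across "apt upgrade"
Package: erlang*
Pin: version 1:24.*
Pin-Priority: 1001
```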
@Slugger70 I wonder if we should use one of the repos in the RabbitMQ install docs for Debian/Ubuntu to install an updated Erlang? They make it sound like it should install the correct Erlang for the selected RabbitMQ if you're using Cloudsmith (or maybe Packagecloud?), but maybe it doesn't work so magically as they imply.
Running Jobs on Remote Resources with Pulsar - job_metrics_config_file option missing #2302
WIP, please comment: galaxyproject/ansible-galaxy#133
If this looks like the way forward we can just default pulsar_job_metrics_plugins
to galaxy_job_metrics_plugins
after updating the Pulsar role to use the new syntax and write YAML configs instead of XML.
Some issues moved to https://github.com/galaxyproject/training-material/issues/2583
Misc
Validation
Role Changes
VM Environment
gat to the Helpers.md page
Training Updates
Tutorial Feedback
Missing Config Options
Diffs
Testing
I will not organise this again without a testing strategy. We had many occasions where updates required changes to earlier tutorials, and then you do not know whether things will work from scratch or only from your already-modified machine.
We need molecule tests end to end.
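A starting point for such a scenario might look like the following molecule.yml sketch (driver, image, and playbook path are illustrative choices, not a tested setup):

```yaml
# molecule/default/molecule.yml — end-to-end scenario sketch
driver:
  name: docker
platforms:
  - name: galaxyserver
    image: ubuntu:20.04
provisioner:
  name: ansible
  playbooks:
    converge: ../../galaxy.yml
verifier:
  name: ansible
```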