galaxyproject / training-material

A collection of Galaxy-related training material
https://training.galaxyproject.org
MIT License

New toolfactory tutorial for dev #2428

Closed (fubar2 closed 2 years ago)

fubar2 commented 3 years ago

(PR #2423 shifted to a branch and incorporating a fix to exclude docker/ html files during html linting - thanks @hexylena)

An up-to-date web version of this material is available: https://training.galaxy.lazarus.name/training-material/topics/dev/tutorials/tool-generators/tutorial.html

This tutorial is designed to help fill the gap between scripts that work on the command line with small test data samples and real Galaxy tools. It introduces a Galaxy tool that implements a form-driven code generator based on Galaxyxml and uses Planemo for tests. This allows Galaxy to function as a clunky but usable IDE for simple tools. This may appeal to some developers and allow them to quickly start producing simple, useful new tools while they absorb the details required to make them manually.

Scripts must be debugged with test data on a command line. There's little point in running the ToolFactory without a working script (well, see the tutorial...) and test data.

Scripts developed in interactive environments are one important and obvious use case. Another is when a new discipline starts building tools. The tutorial is suitable for programmers and informaticians new to Galaxy. It is not designed for experienced or full-time Galaxy developers, who will already be using planemo on the command line and do not need a code generator.

The reader is warned to check whether the tutorial might be useful for their situation, and that scripting skills are essential to make use of the ToolFactory and thus of the tutorial itself. No effort is made to teach the basics of tools or of command line scripting. This is a dev tutorial, but for a small market segment, including script-writing researchers who don't describe or think of themselves as developers.

fubar2 commented 3 years ago

There are a couple - do all? Is check-links-gh-pages, for example, in need of fixin'?

hexylena commented 3 years ago

Yes, do all! This is looking good, I'll have a go through maybe tomorrow for formatting/GTN stylistic stuff if that's OK with you, but then we can get this merged hopefully.

fubar2 commented 3 years ago

ooooh. really? did you really test it :)

fubar2 commented 3 years ago

ok - the Makefile HTML linter file exclusion is updated in 3 seemingly relevant locations, including that branchy make one.
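
As a rough illustration of the kind of exclusion described here, the sketch below lists the HTML files a linter would be given while skipping anything under a docker/ directory. The actual Makefile change is not shown in this thread, so the helper name list_html_to_lint and the use of find are assumptions made purely for illustration.

```shell
# Hypothetical helper (not taken from the GTN Makefile): print every .html
# file below the given directory, one per line and sorted, skipping any file
# whose path passes through a directory named "docker".
list_html_to_lint() {
    site_dir="$1"
    find "$site_dir" -type f -name '*.html' ! -path '*/docker/*' | sort
}
```

Calling list_html_to_lint with the built site directory would give the list of files to pass to whichever HTML linter is in use.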

fubar2 commented 3 years ago

Suggestions on formatting/style/layout/content/... will all be greatly appreciated. As would independent replications (n > 1) of the scripts running right. I am risk averse, so I'm terrified at the thought of dozens of deep and stupid bugs becoming obvious within hours of dozens of students trying it. Please point any new-to-Galaxy, script-competent potential users toward that temporary URL if you come across any. Some real student testing before wide release would be very helpful - I have no clue how it will work in practice with people of different backgrounds.

hexylena commented 3 years ago

@fubar2 going through it now. One question: you've got 4 options to try it out, how would you feel if I said "we recommend planemo" and ditch the other three that require additional setup? We should be teaching folks to use planemo anyway. I'd be fine if it was "run this container from docker/quay.io" too.

Any chance this will someday get upstreamed into the main branch of planemo?

Edit 1: Found quay.io/fubar2/toolfactory-galaxy-docker but it's slow as molasses, and doesn't seem to be migrating the db.

Edit 2: I'm currently running this. If it works, I'm going to just make the instructions use that, it's easier. (And if you want people to use it, it needs to be simple.)

git clone --recursive https://github.com/fubar2/planemo.git
pip install ./planemo
planemo tool_factory --port 9090 --host 0.0.0.0 --install_galaxy --conda_auto_install

hexylena commented 3 years ago

Would you be open to building the docker image a bit differently? Right now you've got this very custom image that doesn't leverage the huge amount of pre-existing work that could make your life so much easier. You could do this:

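# Start from the existing community Galaxy image
FROM bgruening/galaxy-stable:20.05
# Brand text shown in the Galaxy masthead
ENV GALAXY_CONFIG_BRAND=ToolFactory
# Copy the tool list into the image and install the tools it names
ADD tf.yaml $GALAXY_ROOT/tools.yaml
RUN install-tools $GALAXY_ROOT/tools.yaml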

where tf.yaml probably would look like

api_key: fakekey
galaxy_instance: http://localhost:8080
install_resolver_dependencies: True
install_tool_dependencies: False
tools:
- name: tool_factory_2
  owner: fubar2
  tool_panel_section_label: "ToolFactory"

(It wouldn't be as devoid of other tools as the planemo version currently is, but, that could be nice?)

fubar2 commented 3 years ago

Hi @hexylena - this is exactly what I need - thanks! Working while retired, and so in isolation, is weird. Feedback is welcomed.

how would you feel if I said "we recommend planemo" and ditch the other three that require additional setup? We should be teaching folks to use planemo anyway. I'd be fine if it was "run this container from docker/quay.io" too. .... Any chance this will someday get upstreamed into the main branch of planemo?

Agreed - it is too confusing as it is. Planemo is the best way to run it but my PR of 19 December has not garnered any response...

Edit 1: Found quay.io/fubar2/toolfactory-galaxy-docker but it's slow as molasses, and doesn't seem to be migrating the db.

There seems to be some race condition in early startup - fails irregularly - normally a restart works. When it does work it's really really handy. However, it is a train wreck in lots of ways so should probably be abandoned in current form and if redone, done properly - so although docker is tempting, no docker for now.

There's something fishy about a tool running planemo in Docker. It works fine outside a container, but in Docker the same code results in a broken tool environment after the first Planemo run - a gxformat2 version mismatch is reported. The only way I could work around it was to create a biodocker-style container running planemo. I think the time has come to abandon that ship and leave the weird broken corner case of planemo being called by a Galaxy tool - not many tools are going to do that, so there's probably little interest in a repair.

Edit 2: I'm currently running this. If it works, I'm going to just make the instructions use that, it's easier. (And if you want people to use it, it needs to be simple.)

git clone --recursive https://github.com/fubar2/planemo.git
pip install ./planemo
planemo tool_factory --port 9090 --host 0.0.0.0 --install_galaxy --conda_auto_install

Ok - simpler and working is better. I'll start ripping out most of the installation section so there's just one option. Pity about docker for the moment but it's beyond my limited fixin' capacities.

Going with a DIY planemo install, it would be best to have a local, disposable, private toolshed during any organised tutorials so students can try the round trip of installing the new tool back into the planemo Galaxy after generation. That is the main reason I like working with the Docker container, but given API keys and URLs it should work fine with an external toolshed in planemo.

Thanks again - an external perspective is very helpful.

fubar2 commented 3 years ago

Given that this is a dev tutorial, just installing the TF from the toolshed works if you have a throwaway non-docker Galaxy handy. Won't most of the target audience have one? If so, it's efficient.

I agree on the "we want them to learn to run planemo" - so the TF inside planemo is a quick route. OTOH, the normal route (install the tool, then load the history) is a good option to have and probably not too confusing.

I think I'll move all mention of the docker options to an appendix in case anyone cares?

fubar2 commented 3 years ago

oh dear. Something is very wrong with planemo at this moment. I updated from upstream and now I see "invalid option: --dev-wheels" followed by the help for run_tests.sh, which seems odd...

Testing using galaxy_root /tmp/tmp__jmwdkc/galaxy-dev
Testing tools with command [cd /tmp/tmp__jmwdkc/galaxy-dev && export GALAXY_VIRTUAL_ENV && if [ ! -e "$GALAXY_VIRTUAL_ENV" ]; then /home/ross/miniconda3/envs/mulled-v1-85e726de078108f71e6bc87c13e6848c4ab1b0b2789316f70d3bb37c6b18b1aa/bin/virtualenv -p /home/ross/miniconda3/envs/mulled-v1-85e726de078108f71e6bc87c13e6848c4ab1b0b2789316f70d3bb37c6b18b1aa/bin/python3 $GALAXY_VIRTUAL_ENV; echo "Created virtualenv"; fi && if [ -e "$GALAXY_VIRTUAL_ENV" ]; then . "$GALAXY_VIRTUAL_ENV"/bin/activate; echo "Activated a virtualenv for Galaxy"; echo "$VIRTUAL_ENV"; else echo "Failed to activate virtualenv."; fi && ./run_tests.sh $COMMON_STARTUP_ARGS --report_file /tmp/tmp__jmwdkc/job_working_directory/000/41/working/TF_run_report_tempdir/tacrev_planemo_test_report.html --xunit_report_file /tmp/tmp__jmwdkc/tmp/tmpxokc2u9x/xunit.xml --structured_data_report_file /tmp/tmp__jmwdkc/job_working_directory/000/41/working/tfout/tool_test_output.json functional.test_toolbox]
galaxy.util.commands WARNING: Passing program arguments as a string may be a security hazard if combined with untrusted input
Activated a virtualenv for Galaxy
/home/ross/.planemo/gx_venv_3
invalid option: --dev-wheels
'run_tests.sh -id bbb'                  for testing one tool with id 'bbb' ('bbb' is the tool id)
'run_tests.sh -sid ccc'                 for testing one section with sid 'ccc' ('ccc' is the string after 'section::')
'run_tests.sh -list'                    for listing all the tool ids
'run_tests.sh -api (test_path)'         for running all the test scripts in the ./lib/galaxy_test/api directory, test_path
                                    can be pytest selector
'run_tests.sh -integration (test_path)' for running all integration test scripts in the ./test/integration directory, test_path
                                    can be pytest selector
'run_tests.sh -toolshed (test_path)'    for running all the test scripts in the ./lib/tool_shed/test directory
'run_tests.sh -installed'               for running tests of Tool Shed installed tools
'run_tests.sh -main'                    for running tests of tools shipped with Galaxy
'run_tests.sh -framework'               for running through example tool tests testing framework features in test/functional/tools"
'run_tests.sh -framework -id toolid'    for testing one framework tool (in test/functional/tools/) with id 'toolid'
'run_tests.sh -data_managers -id data_manager_id'    for testing one Data Manager with id 'data_manager_id'
'run_tests.sh -unit'                    for running all unit tests (doctests and tests in test/unit)
'run_tests.sh -unit (test_path)'        for running unit tests on specified test path (use nosetest path)
'run_tests.sh -selenium'                for running all selenium web tests (in lib/galaxy_test/selenium)
'run_tests.sh -selenium (test_path)'    for running specified selenium web tests (use nosetest path)

This wrapper script largely serves as a point documentation and convenience -
most tests shipped with Galaxy can be run with nosetests/pytest/yarn directly.

fubar2 commented 3 years ago

This is very odd. ToolFactory works fine installed in a normal Galaxy but I cannot get it to run in planemo - even after downgrading to 0.74.2 (testing using serve rather than tool_factory) and I know that used to work. The run_tests.sh complaint about --dev-wheels is odd. For the moment, installing in a normal Galaxy is the only way I can get it to work, so I'll edit the other options out until I get this figured out.

hexylena commented 3 years ago

Given that this is a dev tutorial, just installing the TF from the toolshed works if you have a throwaway non-docker Galaxy handy. Won't most of the target audience have one? If so, it's efficient.

It raises the barrier to entry, and if you want people to use it, I think making it a bit error-proof by not assuming this would make it easier for folks to get started? If it's just "run this command" and you're good to go, it's optimal.

btw, no need to close/reopen every time! Just leave it open and keep working :)

I think I'll move all mention of the docker options to an appendix in case anyone cares?

an appendix sounds great!

fubar2 commented 3 years ago

It's late and this has been a strange few hours trying to figure out what is going on! I'll start again tomorrow. Even older versions of planemo just don't seem to do what they used to or something very odd has broken on my desktop....

hexylena commented 3 years ago

yeah, no worries! take the time you need to get it into shape.

fubar2 commented 3 years ago

I've patched my fork to remove "--dev-wheels" from COMMON_STARTUP_ARGS in planemo's call to run_tests.sh at the start of the test as: && ./run_tests.sh $COMMON_STARTUP_ARGS ... and everything seems to work now.
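
The effect of that patch can be sketched in shell, under some loud assumptions: the real change was made inside the planemo fork and is not shown in this thread, the helper name strip_dev_wheels is hypothetical, and --some-other-flag is a placeholder rather than a real planemo or Galaxy option. The idea is only to drop the one flag that run_tests.sh rejects while leaving every other argument alone.

```shell
# Hypothetical helper: print the given arguments on one line, separated by
# single spaces, with every literal "--dev-wheels" word removed.
strip_dev_wheels() {
    kept=""
    for arg in "$@"; do
        if [ "$arg" != "--dev-wheels" ]; then
            # Add a separating space only when something is already kept.
            kept="${kept:+$kept }$arg"
        fi
    done
    printf '%s\n' "$kept"
}

# Illustrative value only; this is NOT planemo's real default.
COMMON_STARTUP_ARGS="--dev-wheels --some-other-flag"
# Left unquoted on purpose so the string splits into separate words.
COMMON_STARTUP_ARGS="$(strip_dev_wheels $COMMON_STARTUP_ARGS)"
echo "$COMMON_STARTUP_ARGS"
```

With the variable cleaned this way, a later ./run_tests.sh $COMMON_STARTUP_ARGS ... call would no longer receive the rejected option.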

Looking back through the commits on the Galaxy code, it looks like some recent surgery on run_tests.sh has had unintended consequences for planemo running inside planemo, but no effect on calling planemo directly from the command line.

My head hurts but things seem to work, so I'll replace those other methods - they still need review and cleanup as @hexylena has started doing...

fubar2 commented 3 years ago

@hexylena

I'll defer to others on these but for the record:

  1. Using planemo in the tutorial to encourage new devs to use it is good - however, it's not persistent. All work is lost unless exported. I can see it's arguably a good thing for a tutorial - delete the folder, all gone. In actual use, the TF installed directly in a local dev Galaxy is far nicer because of persistence. It's a major bonus if you actually use it for something. With a local toolshed it's also a lot better because of the optional round trip straight into the tool menu. Maybe a tutorial on setting up a quick dev Galaxy for onboarding devs?

  2. It may be possible to do it much more simply, but one reason to prefer the more complex script on offer is that planemo will rebuild the Galaxy client every time you run planemo on the command line after installing with the simpler script - I tested with a venv added. Running the last line of the complex one restarts a clean planemo but without the frustrating delay of client building. Command line graphics get old fast IMHO. OTOH, a counter-argument might be that it's a tutorial, so it should be entirely transient.

    Edit 2: I'm currently running this. If it works, I'm going to just make the instructions use that, it's easier. (And if you want people to use it, it needs to be simple.)

    git clone --recursive https://github.com/fubar2/planemo.git
    pip install ./planemo
    planemo tool_factory --port 9090 --host 0.0.0.0 --install_galaxy --conda_auto_install

fubar2 commented 3 years ago

Planemo issue raised: https://github.com/galaxyproject/planemo/issues/1148. The workaround works - a sed script that patches run_tests.sh to accept and ignore the --dev-wheels parameter.