haesleinhuepf / git-bob-playground

BSD 3-Clause "New" or "Revised" License

[Programming question]: Galaxy workflows #123

Open haesleinhuepf opened 3 days ago

haesleinhuepf commented 3 days ago

Do you know what Galaxy workflows (.ga files) are? If I outline a rough image processing workflow, could you construct a galaxy workflow from this?

git-bob think about this

github-actions[bot] commented 3 days ago

This message was generated by git-bob (version: 0.14.1dev, model: claude-3-5-sonnet-20241022, log), an experimental AI-based assistant. It can make mistakes and has limitations. Check its messages carefully.

Hi! 👋

I know Galaxy workflows are JSON-formatted workflow files used with the Galaxy platform - a web-based platform for computational biomedical research. However, I'm not sufficiently familiar with their specific structure and requirements to reliably construct a .ga workflow file.

@haesleinhuepf - Could you help with guidance on Galaxy workflow construction?

Best, git-bob

haesleinhuepf commented 3 days ago

Here is an example: https://github.com/rmassei/imaging_KNIME_to_Galaxy/blob/main/src/test_data/01_nuclei_segmentation/workflows/2024_nuclei_segmentation_galaxy.ga

Let's construct a new galaxy workflow with just two steps:

git-bob think about this

github-actions[bot] commented 3 days ago

Based on the example, here's a Galaxy workflow file implementing a simple Gaussian blur and thresholding pipeline:


{
    "a_galaxy_workflow": "true",
    "format-version": "0.1",
    "name": "gaussian_blur_threshold",
    "steps": {
        "0": {
            "id": 0,
            "type": "data_input",
            "inputs": [{"name": "Input Image"}],
            "position": {"left": 0, "top": 0}
        },
        "1": {
            "id": 1,
            "tool_id": "toolshed.g2.bx.psu.edu/repos/imgteam/2d_simple_filter/ip_filter_standard/1.12.0+galaxy1",
            "type": "tool",
            "input_connections": {
                "input": {"id": 0, "output_name": "output"}
            },
            "tool_state": "{\"filter\": {\"filter_type\": \"gaussian\", \"size\": \"3.0\"}}",
            "position": {"left": 300, "top": 0}
        },
        "2": {
            "id": 2, 
            "tool_id": "toolshed.g2.bx.psu.edu/repos/imgteam/2d_auto_threshold/ip_threshold/0.18.1+galaxy3",
            "type": "tool",
            "input_connections": {
                "input": {"id": 1, "output_name": "output"}
            },
            "tool_state": "{\"th_method\": {\"method_id\": \"otsu\"}}",
            "position": {"left": 600, "top": 0}
        }
    }
}
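
A basic structural sanity check on a workflow like the one above can be sketched with Python's standard library alone. The helper name `check_step_connections` is hypothetical and is not part of Galaxy, gxformat2, or planemo. The check only confirms that each step key matches its `id` and that every `input_connections` entry points at a step defined in the file. It does not confirm that tool IDs, `tool_state` values, or output names are valid in Galaxy.

```python
import json

# Trimmed copy of the workflow above; only the fields the check needs are kept.
WORKFLOW_JSON = """
{
    "a_galaxy_workflow": "true",
    "format-version": "0.1",
    "name": "gaussian_blur_threshold",
    "steps": {
        "0": {"id": 0, "type": "data_input"},
        "1": {"id": 1, "type": "tool",
              "input_connections": {"input": {"id": 0, "output_name": "output"}}},
        "2": {"id": 2, "type": "tool",
              "input_connections": {"input": {"id": 1, "output_name": "output"}}}
    }
}
"""


def check_step_connections(workflow):
    """Return a list of problems found; an empty list means the check passed.

    Hypothetical helper for illustration, not an official validator.
    """
    problems = []
    steps = workflow.get("steps", {})
    known_ids = {step.get("id") for step in steps.values()}
    for key, step in steps.items():
        # The dictionary key and the step's own "id" should agree.
        if str(step.get("id")) != key:
            problems.append(f"step {key}: key does not match id {step.get('id')}")
        for name, conn in step.get("input_connections", {}).items():
            # A connection may be a single object or a list of objects.
            conns = conn if isinstance(conn, list) else [conn]
            for c in conns:
                if c.get("id") not in known_ids:
                    problems.append(
                        f"step {key}: input '{name}' refers to unknown step {c.get('id')}"
                    )
    return problems


workflow = json.loads(WORKFLOW_JSON)
problems = check_step_connections(workflow)
print(problems)
```

An empty list means only that the step graph is internally consistent, so a dedicated linter is still needed to check the file against Galaxy's own rules.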

haesleinhuepf commented 3 days ago

git-bob comment on ways to automatically check if this workflow is formally correct. Is there a python lib or CLI tool which does that?

github-actions[bot] commented 3 days ago

For checking if a Galaxy workflow file is formally correct, there are a couple of options:

  1. gxformat2 Python package: It has tools for validating Galaxy workflow formats

    from gxformat2.yaml import validate_workflow
    validate_workflow("workflow.ga")

  2. Galaxy's command line interface planemo:

    planemo lint workflow.ga

Both tools can help catch syntax errors and validate the workflow structure.

Do you want me to demonstrate how to use any of these validation tools in more detail?

haesleinhuepf commented 3 days ago

Ok, let's write a Jupyter notebook that saves the workflow above in a .ga file and afterwards uses gxformat2 to validate the workflow.

git-bob implement this!

haesleinhuepf commented 3 days ago

Ok, this didn't work.

Let's try to use planemo instead. The correct syntax for validating a workflow with it seems to be:

planemo workflow_lint path/to/workflow.ga

git-bob implement another notebook which writes the galaxy workflow above in a .ga file and validates it, this time using planemo CLI instead of gxformat2.
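
What such a notebook cell might look like is sketched below. It assumes planemo is installed and available on the PATH, and it uses the `planemo workflow_lint` invocation quoted in the comment above. The helper names `write_workflow` and `lint_with_planemo` are hypothetical, and how planemo reports problems (output text, exit code) is not assumed here, so the sketch only prints both.

```python
import json
import shutil
import subprocess
import tempfile
from pathlib import Path

# The workflow from the earlier comment, expressed as a Python dict.
# tool_state is stored as a JSON-encoded string, as in the .ga file.
workflow = {
    "a_galaxy_workflow": "true",
    "format-version": "0.1",
    "name": "gaussian_blur_threshold",
    "steps": {
        "0": {
            "id": 0,
            "type": "data_input",
            "inputs": [{"name": "Input Image"}],
            "position": {"left": 0, "top": 0},
        },
        "1": {
            "id": 1,
            "tool_id": "toolshed.g2.bx.psu.edu/repos/imgteam/2d_simple_filter/ip_filter_standard/1.12.0+galaxy1",
            "type": "tool",
            "input_connections": {"input": {"id": 0, "output_name": "output"}},
            "tool_state": json.dumps({"filter": {"filter_type": "gaussian", "size": "3.0"}}),
            "position": {"left": 300, "top": 0},
        },
        "2": {
            "id": 2,
            "tool_id": "toolshed.g2.bx.psu.edu/repos/imgteam/2d_auto_threshold/ip_threshold/0.18.1+galaxy3",
            "type": "tool",
            "input_connections": {"input": {"id": 1, "output_name": "output"}},
            "tool_state": json.dumps({"th_method": {"method_id": "otsu"}}),
            "position": {"left": 600, "top": 0},
        },
    },
}


def write_workflow(workflow, path):
    """Serialize the workflow dict to a .ga (JSON) file and return its path."""
    path = Path(path)
    path.write_text(json.dumps(workflow, indent=4), encoding="utf-8")
    return path


def lint_with_planemo(path):
    """Run `planemo workflow_lint` on path; return None if planemo is not installed."""
    if shutil.which("planemo") is None:
        return None
    return subprocess.run(
        ["planemo", "workflow_lint", str(path)],
        capture_output=True,
        text=True,
    )


# Write into a temporary directory so the sketch leaves the working tree untouched.
ga_path = write_workflow(workflow, Path(tempfile.mkdtemp()) / "gaussian_blur_threshold.ga")
result = lint_with_planemo(ga_path)
if result is None:
    print("planemo not found on PATH; skipping lint")
else:
    print(result.stdout)
    print(result.stderr)
    print("exit code:", result.returncode)
```

In a notebook the same call could be written as a shell escape (`!planemo workflow_lint ...`), but going through `subprocess` keeps the output and exit code available for later cells.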