galaxyproject / galaxy

Data intensive science for everyone.
https://galaxyproject.org

Workflow topology errors are not exposed beyond the logs #18168

Closed pcm32 closed 2 weeks ago

pcm32 commented 2 weeks ago

Describe the bug

When running workflows with collection operations, you can sometimes introduce errors in the filtering or other collection logic, so that collections of mismatched lengths end up being fed to a tool (imagine a tool receiving n datasets for plotting, but m labels for the plot title). The UI (at least in the versions of Galaxy I use, which might be somewhat behind) only gives you an invocation error for the workflow, without further details, and you usually need to go and discover the cause in the logs:

galaxy.workflow.run ERROR 2024-05-16 15:47:47,850 [pN:workflow_scheduler0,p:8,tN:WorkflowRequestMonitor.monitor_thread] Failed to execute scheduled workflow.
Traceback (most recent call last):
  File "/galaxy/server/lib/galaxy/workflow/run.py", line 42, in __invoke
    outputs = invoker.invoke()
  File "/galaxy/server/lib/galaxy/workflow/run.py", line 163, in invoke
    incomplete_or_none = self._invoke_step(workflow_invocation_step)
  File "/galaxy/server/lib/galaxy/workflow/run.py", line 232, in _invoke_step
    incomplete_or_none = invocation_step.workflow_step.module.execute(
  File "/galaxy/server/lib/galaxy/workflow/modules.py", line 1884, in execute
    collection_info = self.compute_collection_info(progress, step, all_inputs)
  File "/galaxy/server/lib/galaxy/workflow/modules.py", line 367, in compute_collection_info
    collection_info = self.trans.app.dataset_collection_manager.match_collections(collections_to_match)
  File "/galaxy/server/lib/galaxy/managers/collections.py", line 648, in match_collections
    return MatchingCollections.for_collections(collections_to_match, self.collection_type_descriptions)
  File "/galaxy/server/lib/galaxy/model/dataset_collections/matching.py", line 103, in for_collections
    matching_collections.__attempt_add_to_linked_match(
  File "/galaxy/server/lib/galaxy/model/dataset_collections/matching.py", line 58, in __attempt_add_to_linked_match
    raise exceptions.MessageException(CANNOT_MATCH_ERROR_MESSAGE)
galaxy.exceptions.MessageException: Cannot match collection types.

which is of course beyond what most users can do. It would be great if the workflow invocation error could be more specific so that the user can recover from this error by a change of parameters or in the design of the workflow.
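To make the failure mode concrete, here is a minimal standalone sketch of linked collection inputs. It is plain Python and not Galaxy's actual matching code; the names `match_linked` and `CollectionMismatch` are hypothetical, and it assumes flat list collections only. Elements are paired by position, so the lengths must agree, and the error says which input has how many elements:

```python
class CollectionMismatch(Exception):
    """Hypothetical error type for this sketch; not part of Galaxy."""


def match_linked(inputs):
    """Pair elements of several flat list collections by position.

    `inputs` maps an input name to a list of element identifiers.
    Raises CollectionMismatch if the lists differ in length.
    """
    lengths = {name: len(elements) for name, elements in inputs.items()}
    if len(set(lengths.values())) > 1:
        # Report every input and its size so the user can see what diverged.
        detail = ", ".join(
            f"{name} has {count} elements" for name, count in sorted(lengths.items())
        )
        raise CollectionMismatch(f"Cannot match collections: {detail}")
    names = sorted(inputs)
    # One dict per position, e.g. {"datasets": "a", "labels": "x"}.
    return [dict(zip(names, row)) for row in zip(*(inputs[name] for name in names))]
```

Calling `match_linked({"datasets": ["a", "b", "c"], "labels": ["x", "y"]})` raises an error naming both inputs and their sizes, which is the kind of detail the generic "Cannot match collection types." message leaves out.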

Galaxy Version and/or server at which you observed the bug

Galaxy Version: 22.05
Commit:

Browser and Operating System

Operating System: Windows, Linux, macOS
Browser: Firefox, Chrome, Chrome-based, Safari

To Reproduce

Steps to reproduce the behavior:

I'll try to provide a workflow, but basically it should contain a tool that receives collections of different sizes on two different inputs.

Expected behavior

A more descriptive error on the workflow invocation UI element, ideally pointing to the step or tool in the workflow (with its label, if available) where the issue occurs.
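As a rough illustration of the requested behavior, the sketch below wraps each step's execution so that any failure is re-raised with the step's position and label attached. This is not how Galaxy's scheduler is implemented; `StepError` and `run_steps` are hypothetical names, and a step is modeled simply as a `(label, callable)` pair:

```python
class StepError(Exception):
    """Hypothetical wrapper that records which workflow step failed."""

    def __init__(self, step_index, step_label, reason):
        self.step_index = step_index
        self.step_label = step_label
        self.reason = reason
        where = f"step {step_index + 1}"
        if step_label:
            where += f" ('{step_label}')"
        super().__init__(f"Workflow failed at {where}: {reason}")


def run_steps(steps):
    """Run steps in order; `steps` is a list of (label_or_None, callable)."""
    results = []
    for index, (label, execute) in enumerate(steps):
        try:
            results.append(execute())
        except Exception as exc:
            # Keep the original exception chained for the logs,
            # but surface a message the user can act on.
            raise StepError(index, label, str(exc)) from exc
    return results
```

With this shape, a failure in the second step labeled "Plot" would surface as "Workflow failed at step 2 ('Plot'): ..." instead of a bare invocation error, while the original traceback stays available through exception chaining.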

mvdbeek commented 2 weeks ago

We did that some time ago, together with the introduction of conditional workflow steps, in 23.0 I think?