galaxyproject / galaxy

Data intensive science for everyone.
https://galaxyproject.org

Create a warning if users will create > N jobs with a workflow execution #12051

Open bwlang opened 3 years ago

bwlang commented 3 years ago

It's possible to create huge numbers of jobs when using multiple collections as inputs. Users should probably be warned if they will trigger creation of more than some number of jobs (e.g. 100,000).
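To illustrate how the count can blow up, here is a minimal sketch for a single tool step mapped over several collections. It assumes that unlinked inputs are combined as a cross product and that linked inputs are paired element by element; `estimate_jobs` and its parameters are hypothetical names, not part of Galaxy's API.

```python
from math import prod


def estimate_jobs(collection_sizes, linked=False):
    """Estimate the jobs created by one tool step mapped over collections.

    Hypothetical helper, not Galaxy code.
    linked=False: assume every combination of elements runs (cross product).
    linked=True: assume elements are paired by position, one job per element.
    """
    if not collection_sizes:
        # No collection inputs: a single job.
        return 1
    if linked:
        if len(set(collection_sizes)) != 1:
            raise ValueError("linked collections must all have the same size")
        return collection_sizes[0]
    return prod(collection_sizes)


# Three unlinked collections of modest size already reach the example threshold.
print(estimate_jobs([100, 100, 10]))
print(estimate_jobs([50, 50], linked=True))
```

Under these assumptions, three unlinked collections of 100, 100 and 10 elements already yield 100,000 jobs, while two linked collections of 50 elements yield 50.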

mvdbeek commented 3 years ago

I'm afraid that's probably very difficult to implement for workflows, where this could happen at any step. One could insert a pause and review step in that case. For tool executions that might be more feasible.

bwlang commented 3 years ago

Yeah, maybe hard to predict... Perhaps it could be done just prior to submission via a server-side call with all the parameters that would actually execute the workflow? The whole workflow instantiation would be too expensive, but maybe significant savings are possible if one does not actually have to read all the tool XML, etc., and only has to figure out how many jobs will be needed.

mvdbeek commented 3 years ago

That would only be a partial solution; the number of jobs may depend on a value computed in a step.

bwlang commented 3 years ago

Hmm - yeah, there are a few tools that produce a variable number of outputs. I guess one could say "execution would produce at least N jobs... are you sure?". Maybe it's not worth the effort unless big sites are also running into these unintentional compute bombs :)
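The "at least N jobs" idea could be sketched roughly as follows. This assumes each workflow step has either a known job count or an unknown one (because it depends on a value computed at runtime); unknown steps contribute zero to the lower bound. `StepEstimate`, `lower_bound_jobs` and `warning_message` are hypothetical names for illustration only, and the 100,000 default is just the example threshold from the first comment.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class StepEstimate:
    label: str
    jobs: Optional[int]  # None when the count depends on a computed value


def lower_bound_jobs(steps: List[StepEstimate]) -> Tuple[int, bool]:
    """Return (lower bound on total jobs, whether any step is unknown)."""
    known = sum(s.jobs for s in steps if s.jobs is not None)
    has_unknown = any(s.jobs is None for s in steps)
    return known, has_unknown


def warning_message(steps: List[StepEstimate], threshold: int = 100_000) -> Optional[str]:
    """Build a confirmation prompt if the lower bound exceeds the threshold."""
    total, is_lower_bound = lower_bound_jobs(steps)
    if total <= threshold:
        return None
    qualifier = "at least " if is_lower_bound else ""
    return f"Execution would produce {qualifier}{total} jobs. Are you sure?"


steps = [
    StepEstimate("map over inputs", 100_000),
    StepEstimate("split (variable outputs)", None),
    StepEstimate("summarize", 5),
]
print(warning_message(steps))
```

Because unknown steps are ignored rather than guessed, the prompt can only under-report, which matches the "at least" wording but also means it stays silent when the explosion happens entirely in a runtime-dependent step.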