catalyst / moodle-tool_dataflows

A generic workflow and processing engine which can be configured to do a large variety of tasks.
GNU General Public License v3.0

flow step filter based on an expression aka grep #18

Open brendanheywood opened 2 years ago

brendanheywood commented 2 years ago

e.g. any series of expressions which results in a boolean that decides whether each row should be kept or not

e.g. let's filter out any rows where user.suspended = 1
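A minimal sketch of the idea, in Python purely for illustration (it is not the plugin's code). The names `filter_rows` and `keep` and the row layout are all hypothetical. Each row is tested against one boolean expression and is only passed on when the expression is true:

```python
from typing import Callable, Iterable, Iterator

Row = dict  # assumption: a row is a nested dict such as {"user": {"suspended": 0}}


def filter_rows(rows: Iterable[Row], keep: Callable[[Row], bool]) -> Iterator[Row]:
    """Yield only the rows for which the boolean expression `keep` is true."""
    for row in rows:
        if keep(row):
            yield row


rows = [
    {"user": {"id": 1, "suspended": 0}},
    {"user": {"id": 2, "suspended": 1}},
    {"user": {"id": 3, "suspended": 0}},
]

# Filter out any rows where user.suspended = 1, i.e. keep the rest.
kept = list(filter_rows(rows, lambda row: row["user"]["suspended"] != 1))
kept_ids = [row["user"]["id"] for row in kept]
```

Here `kept_ids` ends up holding the ids of the two users who are not suspended.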

brendanheywood commented 2 years ago

I've just realized this step is technically redundant, because anything you could do with a single filtering expression you can also do with the case step, which takes multiple filtering expressions. So what does everyone reckon?

1) do we still do this, to make it more obvious to flow authors when choosing a step, or
2) ditch it and treat this as a documentation task in the case step, and also tweak case so that it is valid to have only a single output

I'm leaning towards just improving case step
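To make the redundancy concrete, here is a rough sketch (Python, illustration only). The function names, the first-match routing, and the dropping of unmatched rows are all assumptions rather than the documented behaviour of the case step:

```python
from typing import Callable, Dict, Iterable, List, Tuple

Row = dict
Expression = Callable[[Row], bool]


def case_step(rows: Iterable[Row], cases: List[Tuple[str, Expression]]) -> Dict[str, List[Row]]:
    """Send each row to the first output whose expression is true.

    Assumption: rows that match no expression are dropped.
    """
    outputs: Dict[str, List[Row]] = {name: [] for name, _ in cases}
    for row in rows:
        for name, expression in cases:
            if expression(row):
                outputs[name].append(row)
                break
    return outputs


def filter_step(rows: Iterable[Row], expression: Expression) -> List[Row]:
    """A filter is just a case step with exactly one output."""
    return case_step(rows, [("kept", expression)])["kept"]


rows = [{"suspended": 0}, {"suspended": 1}, {"suspended": 0}]
active = filter_step(rows, lambda row: row["suspended"] == 0)
```

Under those assumptions the dedicated step adds no new capability, only a more obvious name.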

keevan commented 2 years ago

I prefer having a separate filter step:

It's similar reasoning to why you would split up large unwieldy methods into smaller portions, so they can be more easily understood and other parts reused for different purposes. I also feel some urge to change the terminology of "case step" to be a switch or branch step instead.

The alternative, of course, is to pack in only the functionality that dataflows needs, because everything else can be done from core steps. This falls into YAGNI territory, and is kind of where Golang comes from, in that the core language features are only those that are needed. That is great for keeping things simple, but you end up writing a lot of the same thing to do more complex, more expressive things. The cost, on the other side, is the tech debt of maintaining the expressive alternatives.

Maybe later down the track, in an ideal, more polished product, I would like to see some sort of filter across all step types that returns the steps relevant to the phrase or keyword entered, taking into account where the user wants the step to be added. This could initially be implemented simply with a list of defined tags per step type (switch, case, filter, s3, copy), which each step author should apply properly to highlight the use cases the step is relevant to.
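A very rough sketch of the simple tag-based version (Python, illustration only). The step type names and the `STEP_TAGS` mapping are made up for the example, and it ignores where in the flow the step would be added:

```python
from typing import Dict, List, Set

# Hypothetical step types and the tags their authors assigned to them.
STEP_TAGS: Dict[str, Set[str]] = {
    "flow_logic_case": {"switch", "case", "filter"},
    "flow_filter": {"filter"},
    "connector_s3": {"s3", "copy"},
}


def find_steps(keyword: str) -> List[str]:
    """Return the step types tagged with the keyword, sorted by name."""
    keyword = keyword.strip().lower()
    return sorted(name for name, tags in STEP_TAGS.items() if keyword in tags)


filter_matches = find_steps("filter")
s3_matches = find_steps("S3 ")
```

Searching for "filter" would surface both the case step and a dedicated filter step, which is the discoverability point.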

brendanheywood commented 2 years ago

+1 for rename to switch https://github.com/catalyst/moodle-tool_dataflows/issues/399

I think from the author's perspective the discoverability is the main thing. If they can see a step called either grep or filter, it's pretty obvious. So agreed, let's make it two steps