zorbash / opus

A framework for pluggable business logic components
MIT License

Parameterize steps #21

Closed saverio-kantox closed 4 years ago

saverio-kantox commented 4 years ago

Often the general behavior of a step is quite repetitive, so it would be nice to be able to reuse a single function of arity > 1 and pass the extra arg(s) to the step itself.

If you think this could be interesting, I can provide a pull request.

This is an example:

defmodule Mapper do
  @moduledoc """
  Converts a flat structure from CSV into deeply nested fields of MyStruct.
  Empty fields are skipped.
  """

  use Opus.Pipeline

  step :parse
  step :nest, with: &%{source: &1}
  step :create_output, with: &Map.put(&1, :output, %MyStruct{})
  step :copy, args: ["some_col", [:some, :nested, :path]]
  step :copy, args: ["some_col_2", [:other, :nested, :path]]
  step :copy, args: ["some_col_3", [:again, :nested, :path]]
  # possibly many more lines like this
  step :extract_output, with: &get_in(&1, [:output])

  def parse(text) do
    [headers | rows] = NimbleCSV.RFC4180.parse_string(text, skip_headers: false)
    Enum.map(rows, fn row -> headers |> Enum.zip(row) |> Map.new() end)
  end

  def copy(input, [from, to]) do
    case get_in(input, [:source | List.wrap(from)]) do
      nil -> input
      "" -> input
      value -> put_in(input, [:output | List.wrap(to)], value)
    end
  end
end
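
To make the behavior of `copy/2` concrete, here is a minimal sketch that runs it outside of any pipeline. It assumes a single-row input made of plain maps whose nested output keys already exist; the `CopySketch` module name, the column names, and the output shape are all hypothetical and not part of the proposal.

```elixir
defmodule CopySketch do
  # Same body as copy/2 in the example above.
  def copy(input, [from, to]) do
    case get_in(input, [:source | List.wrap(from)]) do
      nil -> input
      "" -> input
      value -> put_in(input, [:output | List.wrap(to)], value)
    end
  end
end

# Hypothetical single-row input; column and field names are illustrative.
# The nested output keys are pre-populated so that put_in/3 can reach them.
input = %{
  source: %{"first_name" => "Ada", "city" => ""},
  output: %{name: %{first: nil}, city: nil}
}

input
|> CopySketch.copy(["first_name", [:name, :first]])
|> CopySketch.copy(["city", [:city]])
|> IO.inspect()
```

The first call copies "Ada" into the nested output path, while the second leaves the input untouched because the source value is an empty string.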

zorbash commented 4 years ago

I'm a bit sceptical about this: most stage functions have an arity of 1, but for this use case (and I'm not yet sure how common it is) we'd end up with some functions of arity > 1.

For this particular example, given that "some_col", "some_col_2" and "some_col_3" are columns of a CSV file and have meaningful names, I'd prefer the step names to include that information. That way a reader can infer what a step does from its name, without reading the implementation.

defmodule Mapper do
  use Opus.Pipeline

  alias __MODULE__, as: Self

  step :parse
  step :nest, with: &%{source: &1}
  step :create_output, with: &Map.put(&1, :output, %MyStruct{})
  step :copy_address, with: &(Self.copy(&1, ["address", [:some, :nested, :path]])) 
  step :copy_first_name, with: &(Self.copy(&1, ["first_name", [:some, :nested, :path]])) 
  step :copy_last_name, with: &(Self.copy(&1, ["last_name", [:some, :nested, :path]])) 
  # possibly many more lines like this
  step :extract_output, with: &get_in(&1, [:output])

  def copy(input, [from, to]) do
    case get_in(input, [:source | List.wrap(from)]) do
      nil -> input
      "" -> input
      value -> put_in(input, [:output | List.wrap(to)], value)
    end
  end
end
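
The reason the `with:` form above works is that a capture such as `&Self.copy(&1, [...])` closes over the extra arguments and yields a function of arity 1. The sketch below illustrates this in plain Elixir, without Opus; `CaptureSketch` and its `run/2` helper are hypothetical stand-ins for a pipeline, and the input shape is assumed.

```elixir
defmodule CaptureSketch do
  # Same body as copy/2 in the example above.
  def copy(input, [from, to]) do
    case get_in(input, [:source | List.wrap(from)]) do
      nil -> input
      "" -> input
      value -> put_in(input, [:output | List.wrap(to)], value)
    end
  end

  # Hypothetical stand-in for a pipeline runner:
  # applies each arity-1 stage to the accumulated input, in order.
  def run(input, stages) do
    Enum.reduce(stages, input, fn stage, acc -> stage.(acc) end)
  end
end

# Each capture fixes the second argument, leaving a function of arity 1.
stages = [
  &CaptureSketch.copy(&1, ["address", [:address]]),
  &CaptureSketch.copy(&1, ["first_name", [:name, :first]])
]

# Hypothetical input with pre-populated output keys.
input = %{
  source: %{"address" => "1 Main St", "first_name" => "Ada"},
  output: %{address: nil, name: %{first: nil}}
}

input |> CaptureSketch.run(stages) |> IO.inspect()
```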

Keep in mind that if you want or have to use the same name for all the copy steps, you can:

defmodule Mapper do
  use Opus.Pipeline

  alias __MODULE__, as: Self

  step :parse
  step :nest, with: &%{source: &1}
  step :create_output, with: &Map.put(&1, :output, %MyStruct{})
  step :copy, with: &(Self.copy(&1, ["address", [:some, :nested, :path]])) 
  step :copy, with: &(Self.copy(&1, ["first_name", [:some, :nested, :path]])) 
  step :copy, with: &(Self.copy(&1, ["last_name", [:some, :nested, :path]])) 
  # possibly many more lines like this
  step :extract_output, with: &get_in(&1, [:output])

  def copy(input, [from, to]) do
    case get_in(input, [:source | List.wrap(from)]) do
      nil -> input
      "" -> input
      value -> put_in(input, [:output | List.wrap(to)], value)
    end
  end
end
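
If the list of copy steps grows long, one alternative that stays within arity-1 stages is to keep the column-to-path pairs in a table and fold over them in a single function. This is only a sketch of that idea, not something Opus provides: `MappingSketch`, `@mappings`, and `copy_all/1` are hypothetical names, and the input is assumed to be plain maps with the nested output keys already present. Such a function could presumably be registered as one step (for instance `step :copy_all`), at the cost of losing per-column step names.

```elixir
defmodule MappingSketch do
  # Hypothetical column-to-path table; the names are illustrative only.
  @mappings [
    {"address", [:address]},
    {"first_name", [:name, :first]},
    {"last_name", [:name, :last]}
  ]

  # Arity-1 function that applies copy/2 once per mapping entry.
  def copy_all(input) do
    Enum.reduce(@mappings, input, fn {from, to}, acc -> copy(acc, [from, to]) end)
  end

  # Same body as copy/2 in the example above.
  def copy(input, [from, to]) do
    case get_in(input, [:source | List.wrap(from)]) do
      nil -> input
      "" -> input
      value -> put_in(input, [:output | List.wrap(to)], value)
    end
  end
end
```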

zorbash commented 4 years ago

Closing due to inactivity.