elixir-toniq / norm

Data specification and generation
MIT License

maybe() and regex helper functions #30

Open asianfilm opened 4 years ago

asianfilm commented 4 years ago

In case they help others, here are a couple of "helper" functions that I've found useful.


  1. maybe()

I often want specs or schemas that also accept a nil value. Rather than littering my specs with spec(is_nil() or (...)), I have a maybe() helper function:

def maybe(spec), do: one_of([spec(is_nil()), spec])

Example:

def order(), do: spec(is_integer() and (&(&1 >= 0)))

@contract get_order() :: maybe(order())
def get_order(), do: Enum.random([nil, 0, 1, 2, 3])

Perhaps it's something that could be included in the library as a bit of syntactic sugar.
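To illustrate the idea outside of Norm, here is a minimal sketch that uses plain one-argument predicates instead of specs. MaybeSketch and its maybe/1 are hypothetical names used only for illustration; the real helper above composes Norm specs with one_of.

```elixir
defmodule MaybeSketch do
  # Hypothetical combinator over plain predicates (not Norm specs):
  # the returned function accepts nil or anything the predicate accepts.
  def maybe(pred) when is_function(pred, 1) do
    fn value -> is_nil(value) or pred.(value) end
  end
end

# Plain-predicate analogue of order() from the example above.
order? = fn v -> is_integer(v) and v >= 0 end
maybe_order? = MaybeSketch.maybe(order?)

IO.inspect(Enum.map([nil, 0, 3, -1, "x"], maybe_order?))
# prints [true, true, true, false, false]
```

The point is only that nil handling lives in one wrapper, so the wrapped predicate never has to mention it.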


  2. regex()

When using Regex.match? in specs, it's important to also check for is_binary(). Otherwise, when you send, say, a nil value you'll get a no function clause matching in Regex.match?/2 error with a stacktrace that only points to Norm's own code. (It will still crash when wrapped in my maybe() helper function.)

So that I don't forget any is_binary() clauses, I use my own regex() helper function(s).

def flip(p2, p1, func), do: apply(func, [p1, p2])
def regex(regex), do: spec(is_binary() and flip(regex, &Regex.match?/2))

Example:

def date(), do: regex(~r/^20\d\d-\d\d-\d\d$/)

Perhaps match() or match?() would be better naming than regex().
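The crash described above can be reproduced with the standard library alone. This is a small sketch, not Norm code: RegexGuardSketch, safe_match/1 and raises_without_guard?/2 are hypothetical names, and the only library call relied on is Regex.match?/2, which accepts binaries only.

```elixir
defmodule RegexGuardSketch do
  # Hypothetical helper: builds a predicate that is safe to call on any term,
  # because is_binary/1 short-circuits before Regex.match?/2 is reached.
  def safe_match(regex) do
    fn value -> is_binary(value) and Regex.match?(regex, value) end
  end

  # Shows what happens without the is_binary/1 check.
  def raises_without_guard?(regex, value) do
    Regex.match?(regex, value)
    false
  rescue
    FunctionClauseError -> true
  end
end

date_regex = ~r/^20\d\d-\d\d-\d\d$/
date? = RegexGuardSketch.safe_match(date_regex)

IO.inspect(date?.("2020-01-31"))
# prints true
IO.inspect(date?.(nil))
# prints false
IO.inspect(RegexGuardSketch.raises_without_guard?(date_regex, nil))
# prints true
```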

keathley commented 4 years ago

I'm considering building something like norm_contrib (which is probably a terrible name) that includes common helpers for things like URLs, UUIDs, etc. I think something like maybe and result might make the most sense in a library like that. I do think it probably makes sense to have first-class regex support in Norm. I've actually considered supporting this: spec(~r/foo/). I'm not sure if that's the right approach over using a dedicated function.

wojtekmach commented 4 years ago

Built-in support for regexes sounds very useful to me. Something I've been thinking about recently is similar support for ranges. With these two we could:

@contract rgb2hex(spec({0..255, 0..255, 0..255})) :: spec(~r/#[0-9A-F]{6}/)
def rgb2hex({r, g, b}) do

and this is pretty appealing to me. (But I'd still probably extract each spec into a separate function.)

Or maybe the solution is to use the Conformable protocol for these?

keathley commented 4 years ago

I like the idea of using conformable on ranges! We could probably support generation that way as well. Regular expressions aren't structs, are they? Because that would also work in this scenario. Although generation would probably be a mess.

wojtekmach commented 4 years ago

Regexes are in fact structs, that's the beautiful part :D

keathley commented 4 years ago

Perfect :)
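Since both regexes and ranges are structs, a protocol can dispatch on them directly. The following is a rough sketch of that mechanism only: Checkable and matches?/2 are hypothetical names, and this is not Norm's Conformable protocol, whose callbacks are not shown in this thread.

```elixir
defprotocol Checkable do
  @doc "Returns true when `value` satisfies `pattern` (hypothetical callback)."
  def matches?(pattern, value)
end

defimpl Checkable, for: Regex do
  # Guard with is_binary/1 so non-binary input is rejected instead of raising.
  def matches?(regex, value), do: is_binary(value) and Regex.match?(regex, value)
end

defimpl Checkable, for: Range do
  def matches?(range, value), do: is_integer(value) and value in range
end

# Both literals are structs, which is what makes the dispatch possible.
IO.inspect(is_struct(~r/foo/, Regex))
# prints true
IO.inspect(is_struct(0..255, Range))
# prints true

IO.inspect(Checkable.matches?(0..255, 128))
# prints true
IO.inspect(Checkable.matches?(~r/^#[0-9A-F]{6}$/, "#FFA500"))
# prints true
IO.inspect(Checkable.matches?(~r/^#[0-9A-F]{6}$/, nil))
# prints false
```

Generation is a separate question: a range maps naturally onto an integer generator, while producing strings that match an arbitrary regex is much harder, which is the "mess" mentioned above.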

asianfilm commented 4 years ago

The benefit of a norm_contrib (yeah, needs a better name) is going to come from having custom generators for common patterns. An example "plugin" for membership specs and generators:

defmodule Membership do
  # @behaviour NormPlugin
  use Norm

  def spec(members), do: Norm.spec(&(&1 in members))

  def take(members, count) do
    fn_reverse = fn {a, b} -> {b, a} end
    fn_convert = &rem(&1, length(members))

    lookup = members |> Enum.with_index() |> Enum.map(fn_reverse) |> Map.new()
    spec = Norm.spec(is_integer() and (&(&1 > 0)))

    spec |> gen() |> Enum.take(count) |> Enum.map(&Map.get(lookup, fn_convert.(&1)))
  end
end 

Even without a proper plugin system, a directory of user-contributed "plugins", or rather helper modules, would be useful immediately.

For example, to use the above as is:

iex> days_of_week = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]

iex> days_spec = Membership.spec(days_of_week)
#Norm.Spec<&(&1 in members)>

iex> days_of_week |> Enum.random() |> conform!(days_spec)
"Tue"

iex> days_of_week |> Membership.take(9)
["Mon", "Mon", "Wed", "Thu", "Fri", "Thu", "Mon", "Mon", "Sat"]

asianfilm commented 4 years ago

I was playing around with a more natural behavior for a plugin:

defmodule Membership do
  # @behaviour NormPlugin
  use Norm

  def spec(members), do: Norm.spec(&(&1 in members))

  def gen(members) do
    fn_reverse = fn {a, b} -> {b, a} end
    lookup = members |> Enum.with_index() |> Enum.map(fn_reverse) |> Map.new()
    int_spec = Norm.spec(is_integer())
    # Enum.with_index/1 is zero-based, so draw indices from 0..length - 1
    with_gen = with_gen(int_spec, StreamData.integer(0..(length(members) - 1)))

    Stream.map(Norm.gen(with_gen), &Map.get(lookup, &1))
  end
end

And then had the revelation that my take/2 above is an over-engineered variation on Enum.random/1! And this isn't much better.

Ideally, you'd want to define a with_gen/2 function in any plugin, but the built-in Norm.gen/1 won't have enough information to work with it.

And the bigger issue, and why perhaps this has to be integrated into Norm, is when generating data from (nested) schemas.

I'll resist the urge to re-write the above as a GenServer that remembers its members as state...
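For reference, the Enum.random/1 observation can be sketched without Norm or StreamData at all. MembershipSketch, member?/2 and stream/1 are hypothetical names; note that a plain stream like this gives up StreamData features such as shrinking.

```elixir
defmodule MembershipSketch do
  # Membership check, the plain-function analogue of the spec above.
  def member?(members, value), do: value in members

  # Infinite stream of random members, the analogue of take/2 and gen/1.
  def stream(members), do: Stream.repeatedly(fn -> Enum.random(members) end)
end

days = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
sample = days |> MembershipSketch.stream() |> Enum.take(9)

# Every generated value is a member, so this prints true.
IO.inspect(Enum.all?(sample, &MembershipSketch.member?(days, &1)))
```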

keathley commented 4 years ago

For your example I think we could get away with something like this

defmodule Membership do
  def spec(members) do
    s = Norm.spec(& &1 in members)
    g =
      members
      |> Enum.map(&StreamData.constant/1)
      |> StreamData.one_of()

    Norm.with_gen(s, g)
  end
end

s = Membership.spec([1, 2, 3])
values = s |> Norm.gen() |> Enum.take(5)

for i <- values do
  assert valid?(i, s)
end

keathley commented 4 years ago

I'd have to think more about nested schemas. I think they could follow a similar pattern but I'd have to play around with it more.

asianfilm commented 4 years ago

I like your code.

But I'm not sure how to use conform or contracts without sending the list of members each time.

An imperfect solution, if only because of the module population explosion:

defmodule Membership do
  defmacro __using__(_) do
    quote do
      use Norm

      def s(), do: Norm.spec(&(&1 in __MODULE__.members()))

      def gen() do
        Norm.with_gen(
          s(),
          __MODULE__.members()
          |> Enum.map(&StreamData.constant/1)
          |> StreamData.one_of()
        )
        |> gen()
      end
    end
  end
end

defmodule Days do
  use Membership

  def members(), do: ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
end

# Named MyCalendar here to avoid redefining Elixir's built-in Calendar module
defmodule MyCalendar do
  use Norm

  @contract favorite_day() :: Days.s()
  def favorite_day(), do: "Fri"
end

And with gen, take and conform:

iex> Days.gen() |> Enum.take(5) |> conform(coll_of(Days.s())) 
{:ok, ["Tue", "Thu", "Thu", "Tue", "Wed"]}

asianfilm commented 4 years ago

And here's example code for a schema with the same behavior/interface:

defmodule Todo do
  use Norm

  defstruct [:what, :when, :who]

  def s(),
    do:
      schema(%{
        what: spec(is_binary() and (&(String.length(&1) in 1..20))),
        when: Days.s(),
        who: coll_of(Team.s())
      })

  def gen(),
    do:
      Stream.repeatedly(fn ->
        %__MODULE__{
          what: Enum.random(["Wash dishes", "Grocery shop", "Watch movie", "Read book"]),
          when: Days.gen() |> Enum.take(1) |> List.first(),
          who: Team.gen() |> Enum.take(Enum.random(1..3)) |> MapSet.new()
        }
      end)
end

defmodule Membership do
  defmacro __using__(_) do
    quote do
      use Norm

      def s(), do: Norm.spec(&(&1 in __MODULE__.members()))

      def gen(),
        do:
          Norm.with_gen(
            s(),
            __MODULE__.members()
            |> Enum.map(&StreamData.constant/1)
            |> StreamData.one_of()
          )
          |> Norm.gen()
    end
  end
end

defmodule Days do
  use Membership

  def members(), do: ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
end

defmodule Team do
  use Membership

  def members(), do: ["Chris", "Stephen", "Wojtek"]
end

In use:

iex> Todo.gen() |> Enum.take(3) |> conform(coll_of(Todo.s()))
{:ok,
  [
    %Todo{what: "Wash dishes", when: "Thu", who: #MapSet<["Chris", "Stephen"]>},
    %Todo{what: "Watch movie", when: "Sat", who: #MapSet<["Wojtek"]>},
    %Todo{what: "Read book", when: "Fri", who: #MapSet<["Chris"]>}
  ]
}

asianfilm commented 4 years ago

And for nested schemas:

defmodule Todo do
  use Norm

  defstruct [:what, :when, :who]

  def s(),
    do:
      schema(%{
        what: spec(is_binary() and (&(String.length(&1) in 1..20))),
        when: Days.s(),
        who: coll_of(Person.s())
      })

  def gen(),
    do:
      Stream.repeatedly(fn ->
        %__MODULE__{
          what: Enum.random(["Wash dishes", "Grocery shop", "Watch movie", "Read book"]),
          when: Days.gen() |> Enum.take(1) |> List.first(),
          who: Person.gen() |> Enum.take(Enum.random(1..3)) |> MapSet.new()
        }
      end)
end

defmodule Person do
  use Norm

  defstruct [:name, :country]

  def s(),
    do:
      schema(%{
        name: spec(is_binary() and (&(String.length(&1) in 1..20))),
        country: Country.s()
      })

  def gen(),
    do:
      Stream.repeatedly(fn ->
        %__MODULE__{
          name: Enum.random(["Chris", "Stephen", "Wojtek"]),
          country: Country.gen() |> Enum.take(1) |> List.first()
        }
      end)
end

defmodule Membership do
  defmacro __using__(_) do
    quote do
      use Norm

      def s(), do: Norm.spec(&(&1 in __MODULE__.members()))

      def gen(),
        do:
          Norm.with_gen(
            s(),
            __MODULE__.members()
            |> Enum.map(&StreamData.constant/1)
            |> StreamData.one_of()
          )
          |> Norm.gen()
    end
  end
end

defmodule Days do
  use Membership

  def members(), do: ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
end

defmodule Country do
  use Membership

  # "France" is included so the list is consistent with the sample output below
  def members(), do: ["France", "Germany", "Italy", "Poland", "Philippines", "U.S.A."]
end

In use:

iex> Todo.gen() |> Enum.take(2)
[
  %Todo{
    what: "Wash dishes",
    when: "Sat",
    who: #MapSet<[
      %Person{country: "France", name: "Wojtek"}
    ]>
  },
  %Todo{
    what: "Grocery shop",
    when: "Tue",
    who: #MapSet<[
      %Person{country: "Poland", name: "Wojtek"},
      %Person{country: "U.S.A.", name: "Chris"}
    ]>
  }
]

asianfilm commented 4 years ago

And just to prove (to myself) that it (and Norm) works with algebraic data types with Algae:

defmodule Todo do
  use Norm

  defstruct [:what, :when, :who]

  def s(),
    do:
      schema(%{
        what: spec(is_binary() and (&(String.length(&1) in 1..20))),
        when: Days.s(),
        who: coll_of(Person.s())
      })

  def gen(),
    do:
      Stream.repeatedly(fn ->
        %__MODULE__{
          what: Enum.random(["Wash dishes", "Grocery shop", "Watch movie", "Read book"]),
          when: Days.gen() |> Enum.take(1) |> List.first(),
          who: Person.gen() |> Enum.take(Enum.random(1..3)) |> MapSet.new()
        }
      end)
end

defmodule Person do
  use Norm

  import Algae

  alias Algae.Maybe

  defsum do
    defdata Student do
      name :: String.t()
      school :: String.t()
    end

    defdata Programmer do
      name :: String.t()
      languages :: MapSet.t()
      university :: Maybe.Just.t() | Maybe.Nothing.t()
    end
  end

  def s(),
    do:
      schema(%{
        name: spec(is_binary() and (&(String.length(&1) in 1..20))),
        languages: coll_of(Language.s()),
        school: School.s(),
        university: spec(&(Maybe.from_maybe(&1, else: nil) in University.members()))
      })

  def gen(),
    do:
      Stream.repeatedly(fn ->
        Enum.random([
          %Person.Student{
            name: Enum.random(["Sabrina", "Harvey", "Prudence"]),
            school: School.gen() |> Enum.take(1) |> List.first()
          },
          %Person.Programmer{
            name: Enum.random(["Chris", "Stephen", "Wojtek"]),
            languages: Language.gen() |> Enum.take(Enum.random(1..3)) |> MapSet.new(),
            university: University.gen() |> Enum.take(1) |> List.first() |> Maybe.from_nillable()
          }
        ])
      end)
end

defmodule Membership do
  defmacro __using__(_) do
    quote do
      use Norm

      def s(), do: Norm.spec(&(&1 in __MODULE__.members()))

      def gen(),
        do:
          Norm.with_gen(
            s(),
            __MODULE__.members()
            |> Enum.map(&StreamData.constant/1)
            |> StreamData.one_of()
          )
          |> Norm.gen()
    end
  end
end

defmodule Days do
  use Membership

  def members(), do: ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
end

defmodule Language do
  use Membership

  def members(), do: ["Elixir", "Haskell", "Cobol", "Elm"]
end

defmodule School do
  use Membership

  def members(), do: ["Academy of the Unseen Arts", "Baxter High School"]
end

defmodule University do
  use Membership

  def members(), do: [nil, "UMIST", "University of Southern California"]
end

In use:

iex> Todo.gen() |> Enum.take(2) |> IO.inspect() |> conform(coll_of(Todo.s()))
[
  %Todo{
    what: "Watch movie",
    when: "Thu",
    who: #MapSet<[
      %Person.Programmer{
        languages: #MapSet<["Elixir", "Elm"]>,
        name: "Stephen",
        university: %Algae.Maybe.Just{just: "UMIST"}
      }
    ]>
  },
  %Todo{
    what: "Watch movie",
    when: "Sun",
    who: #MapSet<[
      %Person.Student{name: "Prudence", school: "Academy of the Unseen Arts"},
      %Person.Programmer{
        languages: #MapSet<["Elixir", "Elm", "Haskell"]>,
        name: "Chris",
        university: %Algae.Maybe.Nothing{}
      }
    ]>
  }
]

{:ok,
 [
   %Todo{
     what: "Watch movie",
     when: "Thu",
     who: [
       %Person.Programmer{
         languages: ["Elixir", "Elm"],
         name: "Stephen",
         university: %Algae.Maybe.Just{just: "UMIST"}
       }
     ]
   },
   %Todo{
     what: "Watch movie",
     when: "Sun",
     who: [
       %Person.Student{name: "Prudence", school: "Academy of the Unseen Arts"},
       %Person.Programmer{
         languages: ["Elixir", "Elm", "Haskell"],
         name: "Chris",
         university: %Algae.Maybe.Nothing{}
       }
     ]
   }
 ]}

PS: It seems that conform is converting the MapSets to lists when schemas are nested: in the conformed result above, who and languages come back as plain lists.