zoedsoupe / peri

Elixir library for declarative data description and validation
MIT License
87 stars 2 forks source link

Notes/requests on `:cond` and `:dependent` types #2

Closed morfert closed 4 months ago

morfert commented 4 months ago

Reference

Link to the implementations of :dependent and :cond. Copied below for convenience.

defp validate_field(val, {:cond, condition, true_type, else_type}, data) do
  if condition.(data) do
    validate_field(val, true_type, data)
  else
    validate_field(val, else_type, data)
  end
end

defp validate_field(val, {:dependent, field, condition, type}, data) do
  dependent_val = get_enumerable_value(data, field)

  with :ok <- condition.(val, dependent_val) do
    validate_field(val, type, data)
  end
end

Notes

condition function

The condition function is present for both types but is expected to return a boolean for :cond types but an :ok atom for :dependent types. This a bit inconsistent. The :cond type could be updated to the following as an example.

defp validate_field(val, {:cond, condition, true_type, else_type}, data) do
  with :ok <- condition.(data) do
    validate_field(val, true_type, data)
  else
    _ -> validate_field(val, else_type, data)
  end
end

:cond type condition callback seems strange

Forgive for misunderstanding but from looking at the source code it seems that data is the full data that the user passes in (probably a map) and val is the value of the specific field that is being validated. Assuming as such should'nt the condition function be given val as the argument and not data? The following is an update of my previous update example to implement this.

defp validate_field(val, {:cond, condition, true_type, else_type}, _data) do
  with :ok <- condition.(val) do
    validate_field(val, true_type, data)
  else
    _ -> validate_field(val, else_type, data)
  end
end

:dependent type is very restrictive

Not sure if :dependent should be updated or new type added but a type that succinctly allows multiple dependencies to be accessed and compared would be nice. Below is an example (I am updating :dependent type but could be a new type) of a type that I would like to see.

# Instead of `field`, `condition` and `type` their is now only a `callback`, this is similar to a `:custom` type
defp validate_field(val, {:dependent, callback}, data) do
  # Unlike `condtion` from before and even the `:custom` type the callback takes the full `data` not the field value, the callback can depend on multiple fields now
  evaluation = callback(data)

  # `callback` returns a tuple of `:ok` and `type` so the callback can generate a type based on the input data even
  with {:ok, type} <- evaluation do
    validate_field(val, type, data)
  end
end

The following is an example of this new :dependent type.

defschema :profile, %{
  height: {:required, integer},
  width: {:required, integer},
  square: {:dependent, &check_if_square/1}
}

defp check_if_square(%{ height: h, width: w }), do
  case h === w do
    true -> {:ok, {:required, :boolean}}
    false -> {:ok, :boolean}
  end
end

End

I hope I did not seem too critical, I have been looking at various validation libraries and none have all the features I desire. Even so Peri seems the closest since it has nested types, custom types and supports union types using the :oneof type. I have a need of a :dependent type that lets me read multiple fields from the data to ascertain what type the field should be. Ultimately, :custom type should probably also give the full data as the second argument so that the field can be validated using multiple fields. Good job and thank you for the library.

zoedsoupe commented 4 months ago
  1. condition function

i don't think so. the :cond exists to validate if a field value passes on a boolean condition, to apply type parsing in the true or else branch. let's see an example:

defmodule TaskSchemas do
  import Peri

  defschema :task, %{
    title: {:required, :string},
    status: {:required, :string},
    completion_date: {:cond, fn data -> data[:status] == "completed" end, :date, nil}
  }
end

the completion_date schema definition can be read as: "if the status field has the value of 'completed', then the field completion_date should be parsed as a :date, otherwise (else) it should be treated as nil".

does that make sense? now let's see the idea of the :dependent type:

defmodule UserSchemas do
  import Peri

  defschema :user_registration, %{
    username: {:required, :string},
    password: {:required, :string},
    password_confirmation: {:dependent, :password, &validate_confirmation/2, :string}
  }

  # if confirmation has the same value of password, the validation is ok
  defp validate_confirmation(password, password), do: :ok

  defp validate_confirmation(_confirmation, _password) do
    {:error, "confirmation should be equal to password", []}
  end
end

now you can read the password_confirmation schema definition as: "the password_confirmation field should be parsed as a string only if it passes the validation callback, which in this case the password and confirmation values should match each other".

  1. :cond type condition callback seems strange

yeah, following the idea of the "if then else" type parsing of the :cond, it should receive the actual value of the field and not necessarily the whole data. but this wouldn't restrict the :cond type? (genuine question)

  1. :dependent type is very restrictive

hmm, i do agree with this idea of getting the whole data for the :dependent callback. i think then that it's better to have a new version of :dependent that receives a 1 rarity callback and can return :ok or {:error, template, context} as it sibling {:dependent, field, condition, type} defined the callback. so your example should look like:

defschema :profile, %{
  height: {:required, integer},
  width: {:required, integer},
  square: {:dependent, &check_if_square/1}
}

defp check_if_square(%{ height: h, width: w }), do
  case h === w do
    true -> 
      :ok
    false -> 
      {:error, "width should be equal to height (%{height}, got: %{width}", width: w, height: h}
  end
end

what do you think about this idea?

morfert commented 4 months ago

Response

1. condition function

Well, in both :dependent and :cond the return from condition seems to behave only as a boolean so maybe dependent should return a boolean, only if its not updated/ altered.

2. :cond type condition callback seems strange

Yes, changing it to just val would be more restrictive and I do prefer receiving the full data. I just thought it was inconsistent considering the :dependent type implementation.

3. :dependent type is very restrictive

:cond based schema

defmodule CondSchema do
  import Peri

  defschema(:details, %{
    email: {:required, :string},
    country: {:required, :string}
  })

  defschema(:info, %{
    name: {:required, :string},
    provide_details: {:required, :boolean},
    details:
      {:cond,
       fn %{data: %{provide_details: pd}} ->
         pd
       end, get_schema(:details), nil}
  })
end
Test 1
CondSchema.info(%{name: "some_name", provide_details: false})
{:ok, %{name: "some_name", provide_details: false}}

As expected.

Test 2
CondSchema.info(%{name: "some_name", provide_details: true})
{:ok, %{name: "some_name", provide_details: true}}

The condition function would return true so details: should be of schema :details but is accepted even though details: is absent.

Test 3
CondSchema.info(%{name: "some_name", provide_details: false, details: nil})
{:ok, %{name: "some_name", details: nil, provide_details: false}}

As expected.

Test 4
CondSchema.info(%{name: "some_name", provide_details: true, details: nil})
{:ok, %{name: "some_name", details: nil, provide_details: true}}

Since provide_details: is true the details: should be of schema :details but nil is accepted.

Test 5
CondSchema.info(%{name: "some_name", provide_details: true, details: "cow"})
{:error,
 [
   %Peri.Error{
     path: [:details],
     key: :details,
     content: %{
       actual: "\"cow\"",
       expected: %{email: {:required, :string}, country: {:required, :string}}
     },
     message: "expected type of %{email: {:required, :string}, country: {:required, :string}} received \"cow\" value",
     errors: nil
   }
 ]}

As expected. This shows that the condition is only evaluated if details: is present and not nil.

Test 6
CondSchema.info(%{name: "some_name", provide_details: true, details: %{email: "some_email", country: "some_country"}})
{:ok,
 %{
   name: "some_name",
   details: %{email: "some_email", country: "some_country"},
   provide_details: true
 }}

As expected.

Test 7
CondSchema.info(%{name: "some_name", provide_details: true, details: %{email: "some_email", country: 5}})
{:error,
 [
   %Peri.Error{
     path: [:details],
     key: :details,
     content: nil,
     message: nil,
     errors: [
       %Peri.Error{
         path: [:details, :country],
         key: :country,
         content: %{actual: "5", expected: :string},
         message: "expected type of :string received 5 value",
         errors: nil
       }
     ]
   }
 ]}

As expected. We get the nested error.

Conclusion

Even though the condition function takes the full data as argument, meaning it can depend on anything, it is only evaluated when details: is present and not nil even though it depends on provide_details: which is required.

New hypothetical :ok only :dependent based schema

defmodule OkOnlyDependentSchema do
  import Peri

  defschema(:info, %{
    name: {:required, :string},
    provide_email: {:required, :boolean},
    provide_country: {:required, :boolean},
    details: {:dependent, &verify_details/1}
  })

  defp verify_details(%{data: data}) do
    %{provide_email: pe, provide_country: pc, details: details} = data

    provide = {pe, pc}
    error = {:error, "some_error", []}

    case provide do
      # Effective Schema
      # defschema(:details, %{email: {:required, :string}, country: {:required, :string}})
      {true, true} ->
        case details do
          %{email: email, country: country} when is_binary(email) and is_binary(country) -> :ok
          _ -> error
        end

      # Effective Schema
      # defschema(:details, %{email: {:required, :string}})
      {true, false} ->
        case details do
          %{email: email} when is_binary(email) -> :ok
          _ -> error
        end

      # Effective Schema
      # defschema(:details, %{country: {:required, :string}})
      {false, true} ->
        case details do
          %{country: country} when is_binary(country) -> :ok
          _ -> error
        end

      # Effective Schema
      # defschema(:details, nil)
      {false, false} ->
        case details do
          nil -> :ok
          _ -> error
        end
    end
  end
end
Case 1
OkOnlyDependentSchema.info(%{name: "some_name", provide_email: true, provide_country: false, details: %{email: 48}})

This will give an error with the path being [:details].

Conclusion

Using the proposed new :dependent type it would be even more powerful than the current :custom type but without changing the return errors type you cannot have errors with custom paths. In Case 1 above I would like the error path to be [:details, :email] which is where the error is.

New proposed type based :dependent schema

defmodule TypeDependentSchema do
  import Peri

  # Using the following implementation of dependent
  defp validate_field(val, {:dependent, callback}, data) do
    evaluation = callback.(data)

    with {:ok, type} <- evaluation do
      validate_field(val, type, data)
    end
  end

  defschema(:email_details, %{email: {:required, :string}})

  defschema(:country_details, %{country: {:required, :string}})

  defschema(:details, Map.merge(get_schema(:email_details), get_schema(:country_details)))

  defschema(:info, %{
    name: {:required, :string},
    provide_email: {:required, :boolean},
    provide_country: {:required, :boolean},
    details: {:dependent, &verify_details/1}
  })

  defp verify_details(%{data: data}) do
    %{provide_email: pe, provide_country: pc} = data

    provide = {pe, pc}

    case provide do
      {true, true} -> {:ok, get_schema(:details)}
      {true, false} -> {:ok, get_schema(:email_details)}
      {false, true} -> {:ok, get_schema(:country_details)}
      {false, false} -> {:ok, nil}
    end
  end
end
Case 1
TypeDependentSchema.info(%{name: "some_name", provide_email: true, provide_country: false, details: %{email: 48}})
Conclusion

Now the error path should be [:details, :email] which is what I want.

Conclusion

Ultimately, I would like to have arbitrary conditions that can rely on multiple properties from the input to determine validity along with the ability to have my errors on the properties with accurate paths. The OkOnlyDependentSchema implementation would mean losing the ability to have nested errors which is also lost in :custom types. This could be remedied by changing the return error type to have an option to define the path but the TypeDependentSchema implementation would allow concise reuse of defschema in my opinion. You could even return {:ok, {:custom, &callback/1}} to have custom validations.

Notes

data argument

I had incorrectly assumed the data argument was the raw input. Turns out it is the input wrapped in a %Peri.Parser{} struct.

End

Sorry for the long article. Thank you for the response.

zoedsoupe commented 4 months ago

1. condition function

ok i agree in general, however i don't see an use case where the value is updated on the condition function, as stated on "only if its not updated/ altered." as you've said.

2. :cond type condition callback seems strange

yeah, so i'll update the dependent type condition function to receive the whole data, however is important to note that the data received on that point of the execution is only the "data" available for that nest level. example:

defschema :user, %{
  profile: %{
    username: :string,
    email: {:dependent, :username, fn data -> is_nil(data.username) end, {:string, {:regex, ~r/@/}},
  }
  age: {:required, :integer}
}

i can see 2 possible problems here: as i know you cannot access the age field on the validation of the dependent data for profile nested structure. that occurs because the validate_schema/2 schema is recursive, so each nest level is process in a separate way.

also, i think the dependent type differs on the cond type as whereas the cond type you want to validate a boolean, in the dependent type (in some use cases), you would want to return a custom error message, which is not possible if the condition function returns boolean. maybe would be solved with the custom error message feature, but that's a future discussion.

3.1 :dependent type is very restrictive

i played with your schema definition for CondSchema, and i notice that as all fields are by default "optional" the cond type will act as optional unless you pass it inside the required type. maybe we should refactor the cond and dependent types to always be required by default? let's see, if we define the CondSchema as:

defmodule CondSchema do
  import Peri

  defschema(:details, %{
    email: {:required, :string},
    country: {:required, :string}
  })

  defschema(:info, %{
    name: {:required, :string},
    provide_details: {:required, :boolean},
    details:
      {:required, {:cond,
       fn %{data: %{provide_details: pd}} ->
         pd
       end, get_schema(:details), nil}}
  })
end

now if you try the "failed" test case:

CondSchema.info(%{name: "some_name", provide_details: true, details: nil})
{:error,
 [
   %Peri.Error{
     path: [:details],
     key: :details,
     content: %{},
     message: "is required",
     errors: nil
   }
 ]}

Now we receive the "expected" result.

3.2 New proposed type based :dependent schema

hmm interesting use case! i agree that this version of dependent type is more powerful than the custom type. and i already imagined a bunch of different use cases it would be unlock by this type definition. so yeah, i'll sure implement this one

zoedsoupe commented 4 months ago

so the proposed changes may be ( i would like to receive you feedback):

  1. the :cond type should still the same:
    • return true or false
    • the condition function should receive the data
    • this should be treated as required by default
  2. the :dependent type should now:
    • receive the data on the condition function
    • should return :ok or {:error, template, context}
    • this type should also be treated as required by default
  3. implement the new {:dependent, callback}, whereas:
    • the callback should be a 1 arity function and will receive the data
    • it should return {:ok, type} or {:error, template, context}
    • this type should also be treated as required by default

WDYT?

zoedsoupe commented 4 months ago

FYi multiple fields dependency is available on the main branch here

morfert commented 4 months ago

so the proposed changes may be ( i would like to receive you feedback):

1. the `:cond` type should still the same:

* return `true` or `false`

* the condition function should receive the data

* this should be treated as required by default

2. the `:dependent` type should now:

* receive the data on the condition function

* should return `:ok` or `{:error, template, context}`

* this type should also be treated as required by default

3. implement the new `{:dependent, callback}`, whereas:

* the callback should be a 1 arity function and will receive the data

* it should return `{:ok, type}` or `{:error, template, context}`

* this type should also be treated as required by default

WDYT?

Yeah, I think this all sounds great. I would prefer that data is the full input data for all cases (I tried master and realized that {:dependent, callback} only gets the nested data). Getting the full input may require more extensive refactoring though so I am happy with the new features so far. 👍

zoedsoupe commented 4 months ago

Getting the full input may require more extensive refactoring

yeah, i'll try to tackle that on this week

zoedsoupe commented 4 months ago

hey @morfert, try the last commit here. i thought it would need a more deep refactor but actually i found a very simple solution!

morfert commented 4 months ago

Yeah, the commit works, I have access to the full input. Thanks.

zoedsoupe commented 4 months ago

perfect! so I just launched a new version that ship with these fixes! ty again