elixir-ecto / ecto

A toolkit for data mapping and language integrated query.
https://hexdocs.pm/ecto
Apache License 2.0
6.18k stars 1.43k forks source link

`put_assoc/3` apparently ignores `unique_constraint` on a deeply nested entity if the passed entity is valid #4532

Open lucavenir opened 2 days ago

lucavenir commented 2 days ago

Elixir version

Elixir 1.16.2 (compiled with Erlang/OTP 26)

Database and Version

PostgreSQL 16.1

Ecto Versions

3.12.4

Database Adapter and Versions (postgrex, myxql, etc)

postgrex 0.19.1

Current behavior

I've lost quite some time on this, but in the hope of helping someone out there, I want to double check this is expected behavior. The following paper cut seems not to be addressed in the forums, nor in the docs (AFAIK!), so maybe it's worth posting it here.

TLDR

I have a 2-steps validation scenario, in which I need to map an intermediate embedded schema to my entity/db schema, as suggested in the docs.

The docs mention mapping the schema into a map (the to_profile bit of that page), but that would require me to cast that map into my db schema again, and I'd like to avoid that; I would avoid casting on internal parameters: indeed, this is internal data, which is validated already, so why would I need casting again!

So, instead of a plain map, I tried building my internal data via put_assoc (after reading this), but I got hurt with a unique_constraint validation, which apparently isn't applied if the inputted schema is valid.

This might make sense at first, because unique_constraint is run only when hitting the db, but it really doesn't click with me if, indeed, hitting a Repo.insert raises instead of returning a {:error, _} changeset (?).

Context

Given that a phone_number already exists on my db, I'm writing a test that expects {:error, _} when creating a %User with the same phone number.

The user creation flow is the following:

  1. The inputted data is validated against an intermediate embedded schema.
  2. The phone number given in the intermediate schema must be validated via an external service (OTP)
  3. I sign up my account on an external auth service, using the info from the intermediate schema (firebase in this case)
  4. Finally, I combine the above (intermediate+OTP+firebase information), so that I can build a User.changeset
  5. I can finally commit via Repo.insert

The flow above is why I've found the intermediate embedded schema is so useful: I can validate a user email, password, and so on, before ever touching external services, or even worse my db.

To clarify, this is my User schema:

defmodule MyApp.Accounts.Users.User do
  use Ecto.Schema
  import Ecto.Changeset

  @primary_key false

  schema "accounts_users" do
    field :referral_code, Ecto.UUID, writable: :insert

    belongs_to :account, MyApp.Accounts.Account, primary_key: true

    timestamps(type: :utc_datetime)
  end

  # changesets...
end

Now as you can see User is a weak entity; its parent's schema, Account, is the following:

defmodule MyApp.Accounts.Account do
  use Ecto.Schema
  import Ecto.Changeset

  @roles [ ...  ]

  schema "accounts" do
    field :role, Ecto.Enum, values: @roles
    field :firebase_uid, :string, writable: :insert
    field :deactivated_at, :utc_datetime

    has_one :account_info,
            MyApp.Accounts.AccountsInfo.AccountInfo,
            on_replace: :update

    has_one :user, MyApp.Accounts.Users.User
    # ... other has_one relationships

    timestamps(type: :utc_datetime)
  end

  @doc false
  def changeset(account, attrs) do
    account
    |> cast(attrs, [:firebase_uid, :role, :deactivated_at])
    |> cast_assoc(:account_info)
    |> validate_required([:firebase_uid, :role])
    |> unique_constraint(:firebase_uid)
  end
end

Here's the intermediate (embedded) schema.

defmodule MyApp.Accounts.Account.Create do
  use Ecto.Schema
  import Ecto.Changeset

  alias MyApp.PhoneNumbers.PhoneNumber

  embedded_schema do
    field :email, :string
    field :password, :string
    field :phone_number, PhoneNumber # Ecto custom type
    field :first_name, :string
    field :last_name, :string
  end

  def changeset(create, attrs) do
    create
    |> cast(attrs, [:email, :password, :phone_number, :first_name, :last_name])
    |> validate_required([:email, :password, :phone_number, :first_name, :last_name])
    |> validate_length(:password, min: 12)
    |> validate_format(:email, ~r/^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$/)
  end
end

Finally, Account has_one AccountInfo, another weak entity. This is where I store some of the intermediate data. Here's its schema:

defmodule MyApp.Accounts.AccountsInfo.AccountInfo do
  use Ecto.Schema
  import Ecto.Changeset

  alias MyApp.PhoneNumbers.PhoneNumber

  @primary_key false

  schema "accounts_info" do
    field :phone_number, PhoneNumber
    field :first_name, :string
    field :last_name, :string

    belongs_to :account, MyApp.Accounts.Account,
      on_replace: :mark_as_invalid,
      primary_key: true

    timestamps(type: :utc_datetime)
  end

  @doc false
  def changeset(account_info, attrs) do
    account_info
    |> cast(attrs, [
      :phone_number,
      :first_name,
      :last_name
    ])
    |> validate_required([
      :phone_number,
      :first_name,
      :last_name
    ])
    |> unique_constraint(:phone_number)  # !!
  end
end

Paper cuts

The above unique_constraint is what gave me a hard time today.

Indeed, as I wrote up above, I'll eventually need to map Account.Create into a User. To do so, I've tried the following, as mentioned in the cheatsheets:

# given `%{}Account.Create = create` manually build the account
account = %Account{
  firebase_uid: fuid,
  role: :user,
  account_info: %AccountInfo{
    phone_number: create.phone_number,
    first_name: create.first_name,
    last_name: create.last_name
  }
}

# put the account into my user
%User{}
|> change()
|> put_assoc(:account, account) 
|> put_change(:referral_code, Ecto.UUID.generate())

But when I run the above changeset on a Repo.insert, I get:

  ** (Ecto.ConstraintError) constraint error when attempting to insert struct:

  * "unique_accounts_info_phone_number" (unique_constraint)

  If you would like to stop this constraint violation from raising an
  exception and instead add it as an error to your changeset, please
  call `unique_constraint/3` on your changeset with the constraint
  `:name` as an option.

  The changeset has not defined any constraint. (!?)

This is a classic Ecto error: it says I'm violating a constraint without setting a changeset rule, so it defaults to a raise. Which is weird, because you can totally read the unique constraint changeset, above.

I doubled checked both in my migrations and my local test database, and I correctly see unique_accounts_info_phone_number in my DB. AFAIK this is okay, because Ecto defaults to "#{table}_#{prop}_index" as its naming convention; thus, my unique_contraint should be okay.

Luckily, the last part of the message was very informative (The changeset has not defined any constraint), but it made me think: how could it be Ecto can't see any constraint!?

I can easily solve this, by mapping my intermediate schema into another intermediate plain map, and then cast it to my entity, as if it was external data. Indeed the following correctly triggers {:error, ...}, mentioning that the phone has already been taken.

attrs = %{
  "account" => %{
    "firebase_uid" => fuid,
    "role" => :user,
    "account_info" => %{
      "phone_number" => create.phone_number,
      "first_name" => create.first_name,
      "last_name" => create.last_name
    }
  }
}

%User{}
|> cast(attrs, [])
|> cast_assoc(:account, required: true)

I think the problem is that since an unique_constraint is checked only against a Repo operation, and since put_assoc is putting an apparently valid entity into the changeset, Ecto skips the unique_constraint entirely.

Is my hypothesis correct? But most importantly, is this expected behavior?

Expected behavior

I'm not sure, hence this issue.

greg-rychlewski commented 2 days ago

Hi @lucavenir ,

If I understand correctly, you are doing this

account = %Account{
  firebase_uid: fuid,
  role: :user,
  account_info: %AccountInfo{
    phone_number: create.phone_number,
    first_name: create.first_name,
    last_name: create.last_name
  }
}

# put the account into my user
%User{}
|> change()
|> put_assoc(:account, account) 
|> put_change(:referral_code, Ecto.UUID.generate())

And then calling Repo.insert on the result. So you are passing your associations as structs rather than changesets. Ecto won't know about the constraint unless you call this code and then pass the resulting changeset:

def changeset(account_info, attrs) do
    account_info
    |> cast(attrs, [
      :phone_number,
      :first_name,
      :last_name
    ])
    |> validate_required([
      :phone_number,
      :first_name,
      :last_name
    ])
    |> unique_constraint(:phone_number)  # !!
  end
greg-rychlewski commented 2 days ago

The constraint information is stored on the changeset struct after calling the unique_constraint function. Then the error from Postgres is compared against it. So passing a plain %AccountInfo struct won't have that information.

lucavenir commented 2 days ago

First of all, I'm so sorry if I've made you waste time. It wasn't my intention. I've read the docs, like almost every page, especially the "mapping" part and the "cast vs put" parts. I thought there was something not working on Ecto's end!

But instead I missed this bit on put_assoc/4 inline docs:

The associated data may be given in different formats:

  • a map or a keyword list [..] the given map (or struct) is not validated in any way and will be inserted as is.
  • changesets [..] they are treated as the canonical data and the associated data currently stored in the association is either updated or replaced
  • structs [..] they are treated as the canonical data and the associated data currently stored in the association is replaced

So yeah, that's the problem. It looks that mapping from an intermediate schema to a concrete schema must pass from an intermediate map / keyword list, else the changeset will short circuit a struct.

Nonetheless, I think we might exploit this issue as a documentation change proposal.

At the end of this paragraph, I read:

Note we have used MyApp.Repo.insert_all/2 to add data to both "accounts" and "profiles" tables directly. We have chosen to bypass schemas altogether. However, there is nothing stopping you from also defining both Account and Profile schemas and changing to_account/1 and to_profile/1 to respectively return %Account{} and %Profile{} structs. Once structs are returned, they could be inserted through the usual MyApp.Repo.insert/2 operation. Doing so can be especially useful if there are uniqueness or other constraints that you want to check during insertion.

The bold quoted part confuses me the most: it's suggested to return structs, so that an insert operation could check for uniqueness. Which is exactly what I've tried.

josevalim commented 2 days ago

Very good feedback! We should improve those sections for clarity and mention that they could be further wrapped in their own changesets.

lucavenir commented 2 days ago

I'd be up for a PR on this, but I fear my lack of expertise might get in the way.

greg-rychlewski commented 2 days ago

You don't have to worry about that :). We will collaborate with you after the PR is submitted.

But it's entirely you're choice. If you are interested then please go for it:). If not then we will take it up.