byroot / activerecord-typedstore

ActiveRecord::Store but with type definition
MIT License
437 stars 57 forks source link

Updates to one JSON column overwrite other unrelated columns #98

Closed robingram closed 1 year ago

robingram commented 1 year ago

I've spent a couple of days trying to track down an issue with a sidekiq job that completes successfully but database writes that should have occurred in the job are not happening.

We started noticing this shortly after upgrading to version 1.5.1 and think we've tracked the problem down to a change of behaviour that was introduced in that update.

The model contains multiple JSON columns that use typed_store and we are running a couple of jobs that would update different columns. Examining the queries that are run when one of the JSON columns is updated we can see that all other empty JSON columns are also updated to nil even though they are not part of the query.

I put together a simple example using this class:

class TypedExample < ApplicationRecord
  typed_store :pet_metadata, coder: ActiveRecord::TypedStore::IdentityCoder do |p|
    p.string :pet_name
    p.string :pet_type
    p.integer :pet_age
  end

  typed_store :car_metadata, coder: ActiveRecord::TypedStore::IdentityCoder do |p|
    p.string :car_make
    p.integer :car_year
  end

  typed_store :house_metadata, coder: ActiveRecord::TypedStore::IdentityCoder do |p|
    p.string :house_address
    p.integer :house_price
  end
end

and this example record:

#<TypedExample:0x00000001087756d8
 id: 1,
 pet_metadata: {"pet_age"=>3, "pet_name"=>"Fido", "pet_type"=>"Dog"},
 car_metadata: {},
 house_metadata: {},
 created_at: Wed, 16 Nov 2022 11:00:59.633118000 NZDT +13:00,
 updated_at: Wed, 16 Nov 2022 11:01:37.537524000 NZDT +13:00>

See this example updating fields of the house_metadata column:

ex1.update(house_address: nil)

TypedExample Update (2.6ms)  UPDATE "typed_examples" SET "pet_metadata" = $1, "updated_at" = $2,
"car_metadata" = $3, "house_metadata" = $4 WHERE "typed_examples"."id" = $5
[["pet_metadata", "{\"pet_age\":3,\"pet_name\":\"Fido\",\"pet_type\":\"Dog\"}"],
["updated_at", "2022-11-15 22:12:00.834036"], ["car_metadata", nil], ["house_metadata", nil],
["id", 1]]

Clearly all JSON columns are updated even though the query only targeted house_metadata.

In 1.5.0, only columns that already had a value were updated so side effects would be less widespread.

ex1.update(house_address: nil)

TypedExample Update (0.8ms)  UPDATE "typed_examples" SET "pet_metadata" = $1, "updated_at" = $2
WHERE "typed_examples"."id" = $3  [["pet_metadata", "{\"pet_age\":3,\"pet_name\":\"Fido\",\"pet_type\":\"Dog\"}"],
["updated_at", "2022-11-15 22:16:28.628396"], ["id", 1]]

We believe that this has introduced a race condition where another job has updated a different JSON column but this has also updated other JSON columns to nil.

Can you give any background on why updating one JSON column should have any effect on other columns? It seems like dangerous behaviour.

In the meantime we are rolling back to 1.5.0 and putting in other mitigations to try to avoid similar race conditions.

byroot commented 1 year ago

It's all explained in the PR that implemented the fix: https://github.com/byroot/activerecord-typedstore/pull/95/files#diff-6879ddf3a0f944a6507f6eb35e4c6f08698f51ffdee5c342e452070f07e7ec80R85-R88