DmitryTsepelev / store_model

Work with JSON-backed attributes as ActiveRecord-ish models
MIT License
1.04k stars 85 forks source link

StoreModel Stringifying json when saving. #150

Closed jmarsh24 closed 1 year ago

jmarsh24 commented 1 year ago

Has anybody used the store model gem to store json in a field on a model? I have generated a store_model struct and I'm attempting to saving to a metadata field on video.

The struct has a youtube and music with nested json fields. When it's stored to the database it stringifies the json of the nested fields on it's own. I can save it using update_column and store model can read it just fine either way. But I was wondering if I'm doing something wrong. 🤔

I have included the logs of my rails console session. Where I have the structs, show that it's correct json. Save the field and read the field without typecast (the way it's stored in the database and it's stringified)

link to gist with the behavior

https://gist.github.com/jmarsh24/57b983f8591d0868bf4ffe8f46618217

The repo where I construct the store_model class.

https://github.com/jmarsh24/tangotube/blob/main/app/models/external_video_import/crawler.rb

DmitryTsepelev commented 1 year ago

I never tried to use it with json. Looks like there's something to be fixed in the nested encoder. Could you please try to reproduce the issue with the unit test in this repo?

trumanshuck commented 1 year ago

I recently ran in to this same issue and was able to reproduce with a couple of tests. Here are a couple of pull requests against a fork that illustrate the functionality failing and working:

This fails also in v2.0.0, so it's presumably something about serialization that changed in the major version. This becomes an issue because writing jsonb sql queries fails when the nested json is stringified -- e.g.,

  select * from products where configuration->'suppliers' @> '[{"title": "The Supplier"}]'::jsonb;

would not return any rows in the first example, but should in the second. Note that this only works in PostgreSQL; I'm not sure how to check for a value in a json array in sqlite. Lemme know if there's any further info that I can provide!

mweitzel commented 1 year ago

Wow, thank you for pointing to a safe version to revert and lock to until this is resolved. This is a huge regression that I almost just shipped.

jmarsh24 commented 1 year ago

@DmitryTsepelev You said that you have never tried to use it with json? Isn't the whole point of the gem to created add type safety to a stores json field? 🤔

Is there any proposed fix for this issue? I would love to shadow or pair with somebody to fix this issue.

balbesina commented 1 year ago

I used RawJson object to wrap Type::Json#serialize result and patch on StoreModel::Types::One#serialize and StoreModel::Types::Many#serialize.
after v1.6.2 #serialize of storemodel/types feeds ActiveSupport::JSON.encode with value.values_for_database. this affects json fields and nested storemodel models - they are twice jsonified, deeper nested - triple jsonified and so on. serialize and deserialize become asymmetric and we cant use pg json(b) features on store_model attributes.

class B
  include StoreModel::Model
  attribute :b_json, :json
end

class A
  include StoreModel::Model
  attribute :a_json, :json
  attribute :b_nested, B.to_type
end

a_type = StoreModel::Types::One.new(A)
a = a_type.deserialize(
  '{"a_json": {"a":"json"}, "b_nested": {"b_json": {"b": "json"}}}'
)

puts a_type.serialize(a)
# => {"a_json":"{\"a\":\"json\"}","b_nested":"{\"b_json\":\"{\\\"b\\\":\\\"json\\\"}\"}"}

if we use as_json instead of values_for_database its ok. it was broken by #146. instead of fixing lockbox they shifted problem from the sick head to a healthy one.

matid commented 1 year ago

I wanted to point out we’ve also experienced this with nested StoreModels, since they’re represented as JSON under the hood.

So this is also broken in the current release:

class A
  include StoreModel::Model

  attribute :a, :string
end

class B
  include StoreModel::Model

  attribute :a, A.to_type
end

The nested A model will get serialized to a string, instead of being stored as native JSON in the database.

jmarsh24 commented 1 year ago

Does anybody have experience making PR's to the gem? We could pair to try and solve this issue.

morgangrubb commented 1 year ago

The problem isn't making a PR, it's that the lockbox encrypted strings are fundamentally incompatible with jsonb queries and that change needs to be rolled back or made opt-in.

On Thu, Jul 27, 2023, 3:22 AM Justin Marsh @.***> wrote:

Does anybody have experience making PR's to the gem? We could pair to try and solve this issue.

— Reply to this email directly, view it on GitHub https://github.com/DmitryTsepelev/store_model/issues/150#issuecomment-1653331125, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABPY6LWBPPWOCNKQBOQYVDXSI6P7ANCNFSM6AAAAAAZV5H2NU . You are receiving this because you are subscribed to this thread.Message ID: @.***>

mweitzel commented 1 year ago

@morgangrubb an encrypted field is not query-able, of course.

The issue is that nested models receive multiple serialization passes and the inner objects are not query-able. Instead of a single-pass flat json object, nested models and json content are serialized multiple times. So what should be structured json is now stored as doubly or triply escaped strings. A fix for this is compatible with the encrypted fields.

vsuhachev commented 1 year ago

In my opinion MR #146 incorrectly implements the Rails Attributes API. The serialize and deserialize functions are meant to be encode into and decode from the DB representation the attributes at top level of ActiveRecord model, but nested structures should not be serialized/deserialized using DB representation. Before #146 the as_json is used to encode nested structures and it is a proper way IMO..

Also see comments in ActiveModel::Type::Value

      # Converts a value from database input to the appropriate ruby type. The
      # return value of this method will be returned from
      # ActiveRecord::AttributeMethods::Read#read_attribute. The default
      # implementation just calls Value#cast.
      #
      # +value+ The raw input, as provided from the database.
      def deserialize(value)

      # Casts a value from the ruby type to a type that the database knows how
      # to understand. The returned value from this method should be a
      # +String+, +Numeric+, +Date+, +Time+, +Symbol+, +true+, +false+, or
      # +nil+.
      def serialize(value)

Also #146 solution is incomplete because all others except StoreModel::Types::One worked with old behavior.

morgangrubb commented 1 year ago

@DmitryTsepelev Care to weigh in on how to address this issue? Right now any 2.x version completely breaks database-backed json datatype queries which was a major selling point prior.

DmitryTsepelev commented 1 year ago

Sorry guys I don't know why github thinks that I should not receive any notifications about replies.

You said that you have never tried to use it with json? Isn't the whole point of the gem to created add type safety to a stores json field?

Every time I used JSONB.

Rushing to the PR right now

DmitryTsepelev commented 1 year ago

Released 2.1.1 with the fix, thank you guys for your patience 😅

mweitzel commented 1 year ago

Huge reminder to folks who had this issue. If you did not lock to 1.6.2, and updated these models, the 2.1.1 release will not resolve persisted values. You'll need to perform a backfill to address the affected models.

I think, for the most part, the nested models will still inflate properly.

For example, with a given Foo model and nested StoreModel attributes:

class Foo < ApplicationRecord
  class NestedStore
    include StoreModel::Model
    attribute ...
  end
  class Store
     attribute :nested, NestedStore.to_type # this is the field which would have been doubly-serialized
     attribute ...
  end
  attribute :store, Store.to_type
end

the following backfill did correct the shape of data in the json field

# Foo.in_batches..
foo.store_will_change! && foo.save

But again, please test for your individual use-cases.

Thanks for cutting a release, @DmitryTsepelev. A large reason I left this comment is so it could be referenced in the changelog.