rails-api / active_model_serializers

ActiveModel::Serializer implementation and Rails hooks
MIT License

Key translation mutates JSON attributes on serialized models #1834

Closed bdmac closed 4 years ago

bdmac commented 8 years ago

Expected behavior vs actual behavior

https://github.com/rails-api/active_model_serializers/commit/c533d1a7fe202a904af178cd22c54cc14383bab0 added support for key transformation, e.g. to snake case or dasherized keys. Nice and useful; however, it appears to use deep_transform_keys!. I'm not 100% sure, but I have a feeling this is causing serialized models with JSON attributes (e.g. Postgres JSON columns) to be mutated when they should not be.

In short, serializing a model object with JSON columns leaves that model object dirty, with unsaved (and obviously unexpected) changes.

Expected behavior is NOT to mutate model attributes during serialization.

Steps to reproduce

Set up a model with a JSON column. Serialize an instance of that model. Notice that the JSON attribute on the model is now dirty/has changes.

> model = Model.first
# Assume model.schema is a JSON attribute that is included in the serializer for Model
> ActiveModel::SerializableResource.new(model).to_json
> model.schema_changed?
true
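
The underlying mechanism can be sketched in plain Ruby. The stand-in below is a simplified reimplementation of ActiveSupport's Hash#deep_transform_keys! (not actual AMS code), and the hashes are made up; it only illustrates how an in-place transform reaches back into a shared JSON column value:

```ruby
# Minimal stand-in for ActiveSupport's Hash#deep_transform_keys!,
# written out only to illustrate the aliasing problem (not AMS code).
def deep_transform_keys!(hash, &block)
  hash.keys.each do |key|
    value = hash.delete(key)
    deep_transform_keys!(value, &block) if value.is_a?(Hash)
    hash[block.call(key)] = value
  end
  hash
end

# The serialized attributes hash holds a reference to the model's JSON
# column value, not a copy, so transforming in place mutates the model.
model_json_column = { "science_fiction" => true } # hypothetical column value
attributes = { "name" => "My Model", "schema" => model_json_column }

deep_transform_keys!(attributes) { |k| k.tr("_", "-") }

model_json_column # => { "science-fiction" => true }, i.e. the model was mutated
```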

Environment

ActiveModelSerializers Version (commit ref if not on tag): 0.10.2

Output of ruby -e "puts RUBY_DESCRIPTION": ruby 2.3.0p0 (2015-12-25 revision 53290) [x86_64-darwin14]

OS Type & Version: OSX El Capitan

Integrated application and version (e.g., Rails, Grape, etc): Rails

Additional helpful information

This bug is not present in versions of AMS prior to this commit: https://github.com/rails-api/active_model_serializers/commit/c533d1a7fe202a904af178cd22c54cc14383bab0

bdmac commented 8 years ago

A second part of this bug: if a model includes a JSON attribute, I do not think it would be expected behavior for that JSON attribute to ALSO go through the same transformations as the model/response itself. I mean, maybe, but it seems like that should at least be something you can opt in to or out of.

bdmac commented 8 years ago

I can fix the mutation problem by simply not using the bang version of deep_transform_keys! in ActiveModelSerializers::KeyTransform.

I'm not sure how to make it so a JSON attribute of a serialized object is not transformed...

bdmac commented 8 years ago

Eek, yeah. I took a look at KeyTransform and it seems like it would be pretty damn hard to make it not transform JSON model attributes, since it receives the entire hash structure and recursively transforms all the keys at once. It does not have access to the underlying/original models, so it can't even check whether the nested JSON it's dealing with comes from a JSON attribute/column or not...
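
Something like this, just to illustrate the ambiguity (invented example data, not actual AMS internals):

```ruby
# Both shapes below are plain nested hashes by the time KeyTransform sees
# them; nothing marks "schema" as a JSON column whose keys must be kept.
from_association = { "author" => { "first_name" => "Ada" } } # keys SHOULD be transformed
from_json_column = { "schema" => { "first_name" => "Ada" } } # keys should NOT be

from_association["author"].class == from_json_column["schema"].class # => true
```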

bf4 commented 8 years ago

@bdmac we shouldn't be mutating the model, no matter what. Is that specific to using a JSON field? I'd be happy to work with you on a PR for that.

bdmac commented 8 years ago

@bf4 I believe so, yes, because we use the mutating version of deep_transform_keys, and the hash representing the JSON column still shares a reference with what was originally pulled from the model. E.g., at some point when building up the hash representation of a model with a JSON attribute, we are just doing hash[attribute_name] = model.read_attribute(attribute_name).

Two possibilities I can think of to solve the initial problem:

  1. Don't use the ! version in KeyTransform. This would have some memory overhead.
  2. Figure out where exactly we pull the JSON values out of the model for serialization, and dup them there. Continue using the ! version in KeyTransform, since we would then be safely dissociated from the model attribute.
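
Option 1 can be sketched like this. The function below is a simplified stand-in for the non-bang deep_transform_keys, not the actual ActiveSupport/AMS implementation, and the data is made up:

```ruby
# Non-destructive variant: build a new hash at every level instead of
# mutating in place, so the model's JSON column value stays untouched.
def deep_transform_keys(hash, &block)
  hash.each_with_object({}) do |(key, value), result|
    value = deep_transform_keys(value, &block) if value.is_a?(Hash)
    result[block.call(key)] = value
  end
end

model_json_column = { "a_key" => { "a_subkey" => 1 } }
transformed = deep_transform_keys(model_json_column) { |k| k.tr("_", "-") }

transformed       # => { "a-key" => { "a-subkey" => 1 } }
model_json_column # => { "a_key" => { "a_subkey" => 1 } }, untouched
```

The memory overhead comes from allocating a fresh hash per level rather than reusing the originals.
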
bf4 commented 8 years ago

@bdmac fwiw, the key transform is a performance hit you might resolve better on the client...

bdmac commented 8 years ago

@bf4 that may be true but transforming is the default behavior of the JsonApi adapter in AMS.

bf4 commented 8 years ago

@bdmac yeah, in retrospect I don't think it was a good decision. I have it set to :unaltered in my own app. Picking good defaults is hard

bdmac commented 8 years ago

I mean, maybe it was a bad decision, but the JSON API spec calls for dasherized keys, so even if you did not automatically dasherize, the end user would need to dasherize anyway. We were dasherizing prior to attempting the update to 0.10.2 (which includes the KeyTransform stuff), just in a way that did not mutate (or recursively dasherize everything).

bdmac commented 8 years ago

@bf4 FWIW I've been able to work around this problem. I turned off key transformation, as you mentioned, by setting the key_transform config to :unaltered. Here is our initializer now, for reference:

class DasherizedJsonApiAdapter < ActiveModel::Serializer::Adapter::JsonApi
  private

  def resource_identifier_for(serializer)
    hash = super
    hash[:type] = hash[:type].to_s.underscore.dasherize
    hash
  end

  def resource_object_for(serializer)
    hash = super
    if hash[:attributes].present?
      hash[:attributes].transform_keys!{|key| key.to_s.underscore.dasherize}
    end
    hash
  end

  def relationships_for(serializer)
    super.transform_keys!{|key| key.to_s.underscore.dasherize}
  end
end

ActiveModel::Serializer.config.adapter = DasherizedJsonApiAdapter
ActiveModel::Serializer.config.key_transform = :unaltered

We were already/previously using the custom DasherizedJsonApiAdapter adapter since pre-release versions of 0.10.x did not have support for key transform / dasherized keys. I was hoping to get rid of our custom adapter by just setting the adapter to :json_api when we upgraded to 0.10.2 but that brought up all these issues...

snovity commented 8 years ago

Had the same issue; implemented an alternative solution in an initializer:

module ActiveModelSerializers::KeyTransform

  LAST_NODES = [
    [:data, :attributes],
    [:included, Array, :attributes]
  ]

  def self.modest_dash(value, path = [])
    if Hash === value
      value.transform_keys! do |key|
        key_value = value[key]
        if (Hash === key_value || Array === key_value) && !LAST_NODES.include?(path)
          modest_dash(key_value, path + [key.to_sym])
        end
        dash(key)
      end
    elsif Array === value
      path += [Array]
      value.each { |e| modest_dash(e, path) }
    else
      dash(value)
    end
  end

end

ActiveModelSerializers.config.key_transform = :modest_dash
ActiveModelSerializers.config.adapter = :json_api

Not sure, but maybe it would make sense to give KeyTransform access to the current serializer object.

remear commented 8 years ago

@bdmac Regarding json attributes, https://github.com/rails-api/active_model_serializers/issues/1834#issuecomment-231464439.

Are you saying it's currently changing the case of keys in the value of each JSON attribute of the model? KeyTransform wasn't originally designed to do that. I use jsonb fields in my Rails API applications and that hasn't been my experience. If you're seeing the jsonb value's keys being altered, would you provide an example?

Possible culprit: https://github.com/rails-api/active_model_serializers/blob/a1826186e556b4aa6cbe2a2588df8b2186e06252/lib/active_model_serializers/adapter/json_api/deserialization.rb#L157-L161

bdmac commented 8 years ago

@remear yes, that's exactly what I'm saying. I know it wasn't designed to, and I included the commit SHA that broke this and indicated above what needed to be changed as well (https://github.com/rails-api/active_model_serializers/issues/1834#issuecomment-231464895).

The problem has moved slightly as KeyTransform has been refactored, but looking at it, I would think it is still present. There are several places in there that use deep_transform_keys! (e.g. https://github.com/rails-api/active_model_serializers/blob/master/lib/active_model_serializers/key_transform.rb#L62). This method recursively and destructively transforms all hash keys using the provided function, without duping.

When the model being serialized has a JSON column, in memory that column is a hash nested under the rest of the model's hash structure to be serialized. Calling deep_transform_keys! mutates the keys in that hash, and since the hash was not duped, it actually winds up mutating the keys of the JSON hash on the original model object, which leaves the model in a dirty state.

It's possible this was fixed somewhere else in AMS by duping model attributes that are hashes, but I can't say. Aside from my initial repro steps above, I'm guessing you could repro this by doing something like ActiveModelSerializers::KeyTransform.camel(model.as_json) and then checking your model's JSON attribute, or just calling model.changes.

remear commented 8 years ago

Right. To me, the mutating part of this issue is a separate concern from transforming the keys of a model's JSON attribute value, and I don't want that problem to get forgotten when the mutation issue gets fixed.

Just so we're on the same page, I'm talking about a structure like:

create_table "companies" ... do |t|
  t.jsonb    "json_things"
end

company.json_things = {
  "rating": 8,
  "a_key": {
    "a_subkey": {
      "another_key": "value"
    }
  }
}

I would expect nothing in the value of json_things to be transformed: rating, a_key, a_subkey, another_key, and all their respective values would be unaltered. I haven't personally experienced key alteration for json fields as described above. I'm currently using 6e609c1218c38265b14382bf50be974e25f4cb9b and haven't had time to check if a newer version breaks this behavior. Therefore, I'm looking for additional verification that users are seeing these transformations, and that the problem exists in the latest master.

bdmac commented 8 years ago

Right, gotcha. I cannot confirm what happens in the latest master, just what I was using at the time I reported this (we're on a fork at this point for other reasons). At that point in time, a KeyTransform like dasherize was deep-transforming all the way down the model's hash structure, into the json_things key and below. It was mutating these keys, which was bad from both a "that's not what I expect in my serialized output" perspective and a "shit, you actually changed my company.json_things attribute on my model" perspective.

A cursory glance at what's in master for KeyTransform still shows it using deep_transform_keys!, which was the problem, since by the time it gets there you wind up with something like:

initially_serialized_output = {
  "name": "My Company",
  "json_things": {
    "rating": 8,
    "a_key": {
      "a_subkey": {
          "another_key": "value"
        }
    }
  }
}
ActiveModelSerializers::KeyTransform.camel_lower(initially_serialized_output)
=> {
  "name": "My Company",
  "jsonThings": {
    "rating": 8,
    "aKey": {
      "aSubkey": {
          "anotherKey": "value"
        }
    }
  }
}

Try that on your current version of AMS and then update to master and try that same test. The version we're forked from did not even have a KeyTransform class.

snovity commented 8 years ago

@remear I can confirm that KeyTransform indeed affects PostgreSQL JSON attributes, and they are returned in the payload with modified keys. I'm using AMS version 0.10.2. To fix it, I've created my own version of the deep transform that doesn't go deeper on certain key paths in the resulting JSON object (not sure if it is a good solution).

NullVoxPopuli commented 8 years ago

@snovity so a potential bug fix for this would be to limit the key transform to a whitelist of the known serializer attribute/relationship keys?

bdmac commented 8 years ago

Yes, that would work, but it needs a refactor because KeyTransform has no knowledge of the serializer where the attributes and relationships were defined. It's also complicated with includes, because that touches multiple serializers at different levels.

NullVoxPopuli commented 8 years ago

Maybe KeyTransform could be passed a list of things to transform, or a list of things not to recurse over?

snovity commented 8 years ago

Skipping/whitelisting certain paths would be adapter-dependent, which complicates this solution. For JSON API, treating [:data, :attributes] and [:included, Array, :attributes] as last nodes seems to work well.

NullVoxPopuli commented 8 years ago

@snovity if you want to make a PR for this, we'd be more than happy to iterate over it :-)

remear commented 8 years ago

limit key transform to a white list of the known serialized (attributes / relationships)'s keys

Gross. Seems like this would also lead to a lot more complexity. E.g., how do you know what they are? What if the attributes are redefined in the serializer? Then come the if/unless/only/except/my-dog-doesnt-like-this options.

joelpresence commented 7 years ago

Hi,

Any updates on this? This is about to impact us in production ... We have some genres that we store on a user like:

# user.genres is a Postgres jsonb field
user.genres = { science_fiction: true, romance: true, horror: false }

and we do it like this for two reasons:

  1. So that we can do easy Postgres where clauses like select * from users where genres->>'science_fiction' = 'true' (the ->> operator returns text, hence the quoted literal), and
  2. So that we can use values like science_fiction as Ruby symbols.

If the JSONAPI adapter in AMS changes science_fiction to science-fiction, it really messes things up, because either we need to constantly translate back and forth between Ember and Rails, or we need to change Rails to use dasherized versions natively. We simply cannot change all our values like science_fiction to science-fiction in Ruby, because it would mean we couldn't easily use these values as Ruby symbols, enum keys, etc.

This is a violation of the Principle of Least Surprise in my humble (and respectful) opinion. AMS should leave keys in my jsonb fields alone when serializing.

Do we have an ETA for a fix? Or is there an easy work around I can use?

Thanks! And thanks for all of your hard work on AMS. :-)

bdmac commented 7 years ago

@joelpresence we had monkey-patched this, as I mentioned early on, I think by just having https://github.com/rails-api/active_model_serializers/blob/master/lib/active_model_serializers/key_transform.rb call deep_transform_keys without the bang. That was an older pre-release version of 0.10, though, so I don't know if that still applies.

Ultimately transforming the keys from underscores to dasherized for jsonapi turned out to be too costly from a performance POV so we actually wound up just NOT transforming the keys in AMS and instead adjusting our Ember stack to work with underscored attributes.

NullVoxPopuli commented 7 years ago

re: performance: you can always use this: https://travis-ci.org/NullVoxPopuli/case_transform-rust-extensions

bdmac commented 7 years ago

Steve, is that a drop-in upgrade that AMS will just start using automatically? It looks like you'd still have to make some updates to KeyTransform to swap in the methods provided by that lib?

That said, the best performance hit is no performance hit, so unaltered/passthru is still the best option IMO, since it's trivial to get Ember to handle this on the client, which kinda makes the transforms an automatic distributed job!

NullVoxPopuli commented 7 years ago

It's not quite a drop-in. I had a PR to use CaseTransform the gem instead of KeyTransform the module, but it was coupled with a bunch of other crap, including some of @beauby's old JSON API stuff. So I need to make a new PR to enable case_transform-rust-extensions as a drop-in enhancement.

bdmac commented 7 years ago

👍

silentlight commented 6 years ago

Hi,

I am using Rails 5 and AMS 0.10.5, and I have the same problem with JSON attributes being mutated after serialization. Has this problem been fixed?

bf4 commented 6 years ago

@silentlight PRs welcome

jfine commented 6 years ago

Just ran into this as well. My quick (but dirty) fix was to...

class FooBarController < ApplicationController
  def show
    render json: @foo_bar, key_transform: :unaltered
  end
end

class FooBarSerializer < ActiveModel::Serializer
  type "foo-bars"
  attribute :not_dasherized, key: "not-dasherized"
end

...most important thing was having solid request specs.

rafaelcgo commented 6 years ago

Currently I have some non-db-backed data (ActiveModel) that I'm nesting inside my model. This additional data still has untransformed JSON API keys. I think it's related. Is there any way I can make the adapter transform the keys at all levels of nesting?

  - payment-subscription was transformed (1st-level attribute)
  - last_invoice wasn't (2nd-level attribute)
  - due_date wasn't (3rd-level attribute)

class ContractSerializer < ActiveModel::Serializer
  attributes :kind, :payment_subscription

  def payment_subscription
    Payment::SubscriptionSerializer.new(object.payment_subscription) if object.payment_subscription
  end
end

class Payment::SubscriptionSerializer < ActiveModel::Serializer
  attributes :status, :last_invoice

  def last_invoice
    Payment::InvoiceSerializer.new(object.last_invoice) if object.last_invoice
  end
end

class Payment::InvoiceSerializer < ActiveModel::Serializer
  attributes :id, :due_date
end

Response:


{
  "data"=>
  {
    "id"=>"aa67c98c-d81f-5a9c-b0bc-26caa0051aea",
    "type"=>"contracts",
    "attributes"=>
    {
      "kind"=>"robbery_and_theft",
      "payment-subscription"=>
      {
        "status"=>"protected_but_cancelled",
        "last_invoice"=>
        {
          "id"=>nil,
          "due_date"=>"2018-06-05"
        }
      }
    }
  }
}
bf4 commented 6 years ago

@rafaelcgo I think what you bring up is a new issue. AMS certainly exposes the ability to customize your case transforming, but I don't recall whether there are examples of using it to transform attribute members whose values are JSON objects.
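
One manual approach, just a sketch (the helper name and the idea of transforming inside the attribute method are illustrative, not an AMS API), is to dasherize the nested payload yourself, since the adapter only transforms top-level attribute names:

```ruby
# Hypothetical helper: recursively dasherize the keys (and only the keys)
# of an already-serialized nested payload. Plain Ruby, no ActiveSupport.
def deep_dasherize_keys(value)
  case value
  when Hash
    value.each_with_object({}) do |(k, v), result|
      result[k.to_s.tr("_", "-")] = deep_dasherize_keys(v)
    end
  when Array
    value.map { |e| deep_dasherize_keys(e) }
  else
    value
  end
end

payload = {
  "status" => "protected_but_cancelled",
  "last_invoice" => { "id" => nil, "due_date" => "2018-06-05" }
}

deep_dasherize_keys(payload)
# => { "status"       => "protected_but_cancelled",
#      "last-invoice" => { "id" => nil, "due-date" => "2018-06-05" } }
```

A serializer attribute method could then return deep_dasherize_keys(nested_serializer.as_json) instead of the raw serializer instance, though that does nothing about the mutation problem discussed above.
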

rafaelcgo commented 6 years ago

@bf4 Yep, I couldn't find a way to manipulate those. If anyone has any tips on how to do this, I could open a new issue and submit a PR.