Netflix / fast_jsonapi

No Longer Maintained - A lightning fast JSON:API serializer for Ruby Objects.
Apache License 2.0
5.07k stars 425 forks source link

Different underlying models used in benchmarks skew results #65

Open aarongough opened 6 years ago

aarongough commented 6 years ago

The benchmarks use underlying models that are fundamentally different. The models used by fast_jsonapi in the benchmarks are plain Ruby classes, whereas the models used in the ams parts of the benchmarks inherit from ActiveModelSerializers::Model.

This difference skews the benchmarks significantly in favor of fast_jsonapi, after updating the models for the fast_jsonapi parts of the benchmark to also inherit from ActiveModelSerializers::Model several of the benchmark specs fail:

Failures:

  1) FastJsonapi::ObjectSerializer when comparing with AMS 0.10.x and with includes and meta should serialize 25 records atleast 25 times faster than AMS
     Failure/Error: expect { our_serializer.serialized_json }.to perform_faster_than { ams_serializer.to_json }.at_least(speed_factor).times
       expected given block to perform faster than comparison block by at_least 25 times, but performed faster by 14.87 times
     # ./spec/lib/object_serializer_performance_spec.rb:105:in `block (4 levels) in <top (required)>'

  2) FastJsonapi::ObjectSerializer when comparing with AMS 0.10.x and with includes and meta should serialize 250 records atleast 25 times faster than AMS
     Failure/Error: expect { our_serializer.serialized_json }.to perform_faster_than { ams_serializer.to_json }.at_least(speed_factor).times
       expected given block to perform faster than comparison block by at_least 25 times, but performed faster by 14.78 times
     # ./spec/lib/object_serializer_performance_spec.rb:105:in `block (4 levels) in <top (required)>'

  3) FastJsonapi::ObjectSerializer when comparing with AMS 0.10.x and with includes and meta should serialize 1000 records atleast 25 times faster than AMS
     Failure/Error: expect { our_serializer.serialized_json }.to perform_faster_than { ams_serializer.to_json }.at_least(speed_factor).times
       expected given block to perform faster than comparison block by at_least 25 times, but performed by the same time
     # ./spec/lib/object_serializer_performance_spec.rb:105:in `block (4 levels) in <top (required)>'

Classes used in benchmarks:

AMS benchmarks:

class AMSActor < ActiveModelSerializers::Model
  attr_accessor :id, :name, :email
end

fast_jsonapi benchmarks:

class Actor
  attr_accessor :id, :name, :email
end
christophersansone commented 6 years ago

@aarongough Hmm. If I understand correctly, I'm not sure I agree. You cannot pass Actor into an AMS serializer and have it work. AMS requires the model to be a descendant of ActiveModelSerializers::Model (or IIRC, at the very least have a read_attribute_for_serialization method), and this is what AMS recommends. fast_jsonapi is not bound by that limitation.

It would not be fair to penalize fast_jsonapi by adding extra baggage that is required by the other serializer. If it can work with plain old Ruby objects, isn't that an advantage that causes a real-world performance improvement?

aarongough commented 6 years ago

@christophersansone I don't disagree at all that being able to serialize other objects is an advantage, but my impression is that the main use-case for fast_jsonapi is serializing ActiveRecord objects, so the benchmarks should represent that.

Also, if the speed gain comes from a place other than the library under test itself that should still be isolated and removed when we're specifically trying to benchmark the library...

christophersansone commented 6 years ago

@aarongough Ahh okay thanks for the clarification. So your belief is that, if we are simply serializing plain old Ruby objects then fast_jsonapi is 25x, but if we are serializing ActiveRecord models, then the performance difference will be less?

Good catch and good theory. My opinion is that simply forcing the fast_jsonapi test models to descend from ActiveModelSerializers::Model does not test this theory adequately. Really, if we want to test the performance difference with ActiveRecord models, then we should test the performance difference with ActiveRecord models! :smile:

As it stands, my opinion is that the existing tests should be left as is, because this is an accurate apples-to-apples benchmark of the ability to serialize POROs.

Would you be able to create a dummy project that tested this theory with ActiveRecord, and provide the results?

shishirmk commented 6 years ago

@aarongough We may need to clean up and remove the json string serialization performance tests altogether or make them information only. The gem controls only the ruby hash creation time, after that its just multi json and the chosen engine.

We could use the same models to ensure that we are doing a fairer comparison. Do you mind creating a separate rspec just for this?

Here's a write up about the performance methodology in case you are curious about how we ended up getting the performance gain. https://github.com/Netflix/fast_jsonapi/blob/master/performance_methodology.md

aarongough commented 6 years ago

I don't have a strong opinion one way or the other, as unfortunately I will not be able to use fast_jsonapi for the project that I was evaluating it for due to the structure of responses we need...

However I do think that it's important that benchmarks are apples-to-apples as much as possible, especially in cases like this where a big part of the performance gain is coming from outside the code under test! In this case at least the fix is easy, here is a patch to make it a fairer comparison:

diff --git spec/shared/contexts/movie_context.rb spec/shared/contexts/movie_context.rb
index 8a64881..4e04f41 100644
--- spec/shared/contexts/movie_context.rb
+++ spec/shared/contexts/movie_context.rb
@@ -3,7 +3,7 @@ RSpec.shared_context 'movie class' do
   # Movie, Actor Classes and serializers
   before(:context) do
     # models
-    class Movie
+    class Movie < ActiveModelSerializers::Model
       attr_accessor :id, :name, :release_year, :actor_ids, :owner_id, :movie_type_id

       def actors
@@ -28,11 +28,11 @@ RSpec.shared_context 'movie class' do
       end
     end

-    class Actor
+    class Actor < ActiveModelSerializers::Model
       attr_accessor :id, :name, :email
     end

-    class MovieType
+    class MovieType < ActiveModelSerializers::Model
       attr_accessor :id, :name
     end
christophersansone commented 6 years ago

@aarongough But ActiveModelSerializers::Model != ActiveRecord::Base.

If we want to test performance of ActiveRecord models, they should descend from ActiveRecord::Base. In theory, then we wouldn't even need different model classes... we could use the same model instances and pass them through each serializer for this particular benchmark.

aarongough commented 6 years ago

@christophersansone ultimately it doesn't matter what the underlying model is, as long as it's the same across all benchmarks, that's the important thing.

shishirmk commented 6 years ago

While I dont think that it is causing huge discrepancies in performance numbers. I agree with the above comment of @aarongough. In order to get the same underlying model for all tests we should refactor the movie context and ams_movie context files so they are defining serializers to one set of underlying classes.