rails-api / active_model_serializers

ActiveModel::Serializer implementation and Rails hooks
MIT License

Caching doesn't improve performance #1586

Open bf4 opened 8 years ago

bf4 commented 8 years ago

Expected behavior vs actual behavior

Expected: configuring a cache and using the AMS serializer cache method should improve rendering performance.

Actual: performance decreases AND more objects are allocated.

Steps to reproduce

current master: git checkout fa0bc95.

 bin/bench
caching on: caching serializers: gc off 606.0970710386515/ips; 1853 objects
caching off: caching serializers: gc off 526.5338285238549/ips; 1853 objects
caching on: non-caching serializers: gc off 709.8031139840541/ips; 1390 objects
caching off: non-caching serializers: gc off 746.4513428127035/ips; 1390 objects
Benchmark results:
{
  "commit_hash": "fa0bc95",
  "version": "0.10.0.rc4",
  "benchmark_run[environment]": "2.2.3p173",
  "runs": [
    {
      "benchmark_type[category]": "caching on: caching serializers: gc off",
      "benchmark_run[result][iterations_per_second]": 606.097,
      "benchmark_run[result][total_allocated_objects_per_iteration]": 1853
    },
    {
      "benchmark_type[category]": "caching off: caching serializers: gc off",
      "benchmark_run[result][iterations_per_second]": 526.534,
      "benchmark_run[result][total_allocated_objects_per_iteration]": 1853
    },
    {
      "benchmark_type[category]": "caching on: non-caching serializers: gc off",
      "benchmark_run[result][iterations_per_second]": 709.803,
      "benchmark_run[result][total_allocated_objects_per_iteration]": 1390
    },
    {
      "benchmark_type[category]": "caching off: non-caching serializers: gc off",
      "benchmark_run[result][iterations_per_second]": 746.451,
      "benchmark_run[result][total_allocated_objects_per_iteration]": 1390
    }
  ]
}
CACHE_ON=false bin/bench
caching on: caching serializers: gc off 664.8712562099971/ips; 1853 objects
caching off: caching serializers: gc off 613.6203762167032/ips; 1853 objects
caching on: non-caching serializers: gc off 752.267454951568/ips; 1390 objects
caching off: non-caching serializers: gc off 692.4981276214933/ips; 1390 objects
Benchmark results:
{
  "commit_hash": "fa0bc95",
  "version": "0.10.0.rc4",
  "benchmark_run[environment]": "2.2.3p173",
  "runs": [
    {
      "benchmark_type[category]": "caching on: caching serializers: gc off",
      "benchmark_run[result][iterations_per_second]": 664.871,
      "benchmark_run[result][total_allocated_objects_per_iteration]": 1853
    },
    {
      "benchmark_type[category]": "caching off: caching serializers: gc off",
      "benchmark_run[result][iterations_per_second]": 613.62,
      "benchmark_run[result][total_allocated_objects_per_iteration]": 1853
    },
    {
      "benchmark_type[category]": "caching on: non-caching serializers: gc off",
      "benchmark_run[result][iterations_per_second]": 752.267,
      "benchmark_run[result][total_allocated_objects_per_iteration]": 1390
    },
    {
      "benchmark_type[category]": "caching off: non-caching serializers: gc off",
      "benchmark_run[result][iterations_per_second]": 692.498,
      "benchmark_run[result][total_allocated_objects_per_iteration]": 1390
    }
  ]
}

Numbers vary somewhat across runs, but the differences are consistent.

Environment

OS Type & Version:

N/A

Additional helpful information

Cache developments since then:

However:

{
  "commit_hash": "43312fa^",
  "version": "0.10.0.rc3",
  "benchmark_run[environment]": "2.2.2p95",
  "runs": [
    {
      "benchmark_type[category]": "caching on: caching serializers: gc off",
      "benchmark_run[result][iterations_per_second]": 687.045,
      "benchmark_run[result][total_allocated_objects_per_iteration]": 1426
    },
    {
      "benchmark_type[category]": "caching off: caching serializers: gc off",
      "benchmark_run[result][iterations_per_second]": 688.588,
      "benchmark_run[result][total_allocated_objects_per_iteration]": 1426
    },
    {
      "benchmark_type[category]": "caching on: non-caching serializers: gc off",
      "benchmark_run[result][iterations_per_second]": 849.889,
      "benchmark_run[result][total_allocated_objects_per_iteration]": 1084
    },
    {
      "benchmark_type[category]": "caching off: non-caching serializers: gc off",
      "benchmark_run[result][iterations_per_second]": 769.596,
      "benchmark_run[result][total_allocated_objects_per_iteration]": 1084
    }
  ]
}
{
  "commit_hash": "43312fa",
  "version": "0.10.0.rc3",
  "benchmark_run[environment]": "2.2.2p95",
  "runs": [
    {
      "benchmark_type[category]": "caching on: caching serializers: gc off",
      "benchmark_run[result][iterations_per_second]": 635.297,
      "benchmark_run[result][total_allocated_objects_per_iteration]": 1519
    },
    {
      "benchmark_type[category]": "caching off: caching serializers: gc off",
      "benchmark_run[result][iterations_per_second]": 601.3,
      "benchmark_run[result][total_allocated_objects_per_iteration]": 1519
    },
    {
      "benchmark_type[category]": "caching on: non-caching serializers: gc off",
      "benchmark_run[result][iterations_per_second]": 782.07,
      "benchmark_run[result][total_allocated_objects_per_iteration]": 1113
    },
    {
      "benchmark_type[category]": "caching off: non-caching serializers: gc off",
      "benchmark_run[result][iterations_per_second]": 771.094,
      "benchmark_run[result][total_allocated_objects_per_iteration]": 1113
    }
  ]
}

So maybe we should take a look at the usage in bulk_cache_fetcher.

And with more objects it gets worse:

BENCH_STRESS=true bin/bench

Benchmark results:

{
  "commit_hash": "e03c5f5",
  "version": "0.10.0.rc4",
  "benchmark_run[environment]": "2.2.3p173",
  "runs": [
    {
      "benchmark_type[category]": "caching on: caching serializers: gc off",
      "benchmark_run[result][iterations_per_second]": 164.688,
      "benchmark_run[result][total_allocated_objects_per_iteration]": 10755
    },
    {
      "benchmark_type[category]": "caching off: caching serializers: gc off",
      "benchmark_run[result][iterations_per_second]": 143.719,
      "benchmark_run[result][total_allocated_objects_per_iteration]": 10755
    },
    {
      "benchmark_type[category]": "caching on: non-caching serializers: gc off",
      "benchmark_run[result][iterations_per_second]": 232.669,
      "benchmark_run[result][total_allocated_objects_per_iteration]": 6690
    },
    {
      "benchmark_type[category]": "caching off: non-caching serializers: gc off",
      "benchmark_run[result][iterations_per_second]": 211.71,
      "benchmark_run[result][total_allocated_objects_per_iteration]": 6690
    }
  ]
}

Possibly related

Flamegraph

diff --git a/Gemfile b/Gemfile
index 3791eef..7be3d53 100644
--- a/Gemfile
+++ b/Gemfile
@@ -39,6 +39,8 @@ gem 'tzinfo-data', platforms: (@windows_platforms + [:jruby])
 group :bench do
   # https://github.com/rails-api/active_model_serializers/commit/cb4459580a6f4f37f629bf3185a5224c8624ca76
   gem 'benchmark-ips', require: false, group: :development
+  gem 'rack-mini-profiler', require: false
+  gem 'flamegraph'
 end

 group :test do
diff --git a/test/benchmark/app.rb b/test/benchmark/app.rb
index ae110ec..ffbc8cc 100644
--- a/test/benchmark/app.rb
+++ b/test/benchmark/app.rb
@@ -54,6 +54,14 @@ end

 require 'active_model_serializers'

+begin
+    require 'rack-mini-profiler'
+rescue LoadError # rubocop:disable Lint/HandleExceptions
+else
+  require 'flamegraph'
+  # just append ?pp=flamegraph
+end
+
 # Initialize app before any serializers are defined, for running across revisions.
 # ref: https://github.com/rails-api/active_model_serializers/pull/1478
 Rails.application.initialize!
bf4 commented 8 years ago

Apparently @joaomdmoura had already discussed this in https://github.com/rails-api/active_model_serializers/issues/1020. I missed this since the issue title was 'Understanding caching', but the contents were that caching made things worse. So, this has been a known issue since July 2015. Sigh.

bf4 commented 8 years ago

On interpreting Flamegraphs http://community.miniprofiler.com/t/how-to-deal-with-information-overload-in-flamegraphs/437?u=sam

beauby commented 8 years ago

Note: this benchmark is faulty, since some legacy AMS idiosyncrasy made it so that the cached serializer actually did twice the work. Could you re-run, @bf4?

bf4 commented 8 years ago

Sure. Related to my reference to https://github.com/rails-api/active_model_serializers/pull/1478 above, I'd like to remove per-serializer cache_store configuration. I just don't see the benefit for the complexity it adds.

bf4 commented 8 years ago

For updated benchmarks, see #1698

zaaroth commented 8 years ago

@zaaroth: @bf4 I picked this issue since it might be related to performance boosts. Are there any plans for something along the lines of ActiveRecord's preload for serializers? I see a lot of has_one relations that could be streamlined into a single call instead of multiple fetches. I also see some collection serializations not using it in my tests (0.10.0 release, not master).
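
For context, a minimal sketch of the usual workaround, plain ActiveRecord eager loading in the controller (model and serializer names are hypothetical):

class UsersController < ApplicationController
  def index
    # Eager-load the associations the serializers will touch, so AMS
    # doesn't issue one query per record (the classic N+1 pattern).
    users = User.includes(:organizations)
    render json: users, each_serializer: UserSerializer
  end
end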

bf4 commented 8 years ago

@zaaroth There are improvements in master, which I just released in 0.10.1. Would love to discuss on the amserializers.herokuapp.com Slack. Thanks!

GCorbel commented 7 years ago

Is there any update on this issue? Does caching improve performance now?

beauby commented 7 years ago

@GCorbel Caching will only improve performance when serializing objects with computation-heavy attributes. I'm not aware of its current status, though last I heard @bf4 had fixed it.
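
As an illustration, the kind of serializer where a cache hit should pay off looks roughly like this (the class name and the expensive method are made up):

class ReportSerializer < ActiveModel::Serializer
  # On a cache hit the stored attributes hash is returned and the
  # expensive method below is never called.
  cache key: 'report', expires_in: 1.hour

  attributes :id, :summary

  def summary
    # Assumed to be computation-heavy (aggregation, formatting, etc.);
    # cheap attributes rarely justify the cache-key overhead.
    object.entries.map(&:description).join('; ')
  end
end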

stephendolan commented 7 years ago

@bf4 Is this still an issue? It's still listed as a warning in the caching guide in master, but I can't seem to find any updates.

bf4 commented 7 years ago

It's much better than when I first made the issue, but I'm not yet satisfied to close it until I have some benchmarks.


mustela commented 7 years ago

Wondering if this is still an issue; I can't find much information about it.

bf4 commented 7 years ago

@mustela Basically, I haven't yet found an app I can test performance against in a way that makes me comfortable changing the language in the docs. The perf is much improved since I made the issue, but between life and lacking a good test example, I just haven't followed up.

mustela commented 7 years ago

@bf4 Would you mind describing the app you are looking for? Maybe we can help with that.

bf4 commented 7 years ago

@mustela Probably the simplest thing to extract would be a setup with

metaskills commented 7 years ago

I created a simple app to exercise the issue I was facing. https://github.com/customink/amstest

bf4 commented 7 years ago

@metaskills Thanks so much for this! Added an issue there https://github.com/customink/amstest/issues/1

mrsweaters commented 7 years ago

I can say that in my case caching worked like a charm. It saved me about 93% of serialization time. Using AMS 0.10.5. I'm serializing a lot of data though.

Before Caching:

[screenshot: Skylight.io trace, before caching]

After Caching:

[screenshot: Skylight.io trace, after caching]

(Images are from Skylight.io)

bf4 commented 7 years ago

@mrsweaters Fantastic! Are you able to describe, in general terms, the nature of what you're serializing so that I can model it? Like DB tables, fields, indices, associations, number of items, how you've configured your serializers, etc.?

mrsweaters commented 7 years ago

@bf4 I had to temporarily disable caching, unfortunately, because CarrierWave can't be serialized. Once I find a workaround I'll try to summarize my situation.

beauby commented 7 years ago

@mrsweaters Do you have overridden attributes that are costly to compute?

harbirg commented 7 years ago

@bf4 - I see how caching improves the situation in the test app provided by @metaskills, where the controller is busy doing some computation.

However, in my case, where I try to serialize 10,000 records or so, it is still faster to regenerate the JSON than to fetch it from Memcached or Redis. The sample app I used for this test was pretty straightforward: a model with 5 attributes and no relationships. Is this expected?

metaskills commented 7 years ago

However, in my case, where I try to serialize 10,000 records or so, it is still faster to regenerate the JSON than to fetch it from Memcached or Redis

I saw this too. Basically it was due to excessive caching of children and poor support for Russian-doll strategies and/or read_multi. We solved it first by caching at the top layer only, then by moving to JBuilder and a solution with read_multi support.
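
A rough sketch of that "top layer only" approach, wrapping the fully rendered payload in a single Rails.cache.fetch (the controller and key scheme are hypothetical):

class PostsController < ApplicationController
  def index
    posts = Post.all
    # One cache entry for the whole response, keyed on the newest record,
    # so child serializers never hit the cache store individually.
    payload = Rails.cache.fetch(['posts/index', posts.maximum(:updated_at)]) do
      ActiveModelSerializers::SerializableResource.new(posts).as_json
    end
    render json: payload
  end
end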

harbirg commented 7 years ago

I do see "[active_model_serializers] Cache read_multi: [...entries from Dalli/Memcached...]" in my output, so I'm assuming this means it's performing a read_multi. Perhaps the outstanding issue is something like a Russian-doll caching strategy, where AMS would cache both the individual entries and the entire response.

mustela commented 7 years ago

Hey everyone,

I'm trying to understand how caching works in AMS. I'm not sure if this is a bug or just how it works, but I've made a simple Rails API with a basic configuration: https://github.com/mustela/ams-cache.

The schema is simple: User => Memberships => Organizations

Cache is enabled, and the serializers and the controller return the user plus the organizations.
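
Condensed, the setup looks roughly like this (simplified; the exact attributes in the repo may differ):

class AbstractSerializer < ActiveModel::Serializer
  cache expires_in: 1.hour # every serializer below inherits caching
end

class UserSerializer < AbstractSerializer
  attributes :id, :name
  has_many :organizations
end

class OrganizationSerializer < AbstractSerializer
  attributes :id, :name
end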

So basically when I request curl -X GET localhost:3000/users/1

Started GET "/users/1" for ::1 at 2017-04-19 23:29:07 +0200
Processing by UsersController#show as */*
  Parameters: {"id"=>"1"}
  User Load (0.6ms)  SELECT  "users".* FROM "users" WHERE "users"."id" = 1 LIMIT 1
[active_model_serializers]   Organization Load (0.7ms)  SELECT "organizations".* FROM "organizations" INNER JOIN "memberships" ON "organizations"."id" = "memberships"."organization_id" WHERE "memberships"."user_id" = 1 ORDER BY "organizations"."name" ASC
[active_model_serializers] Rendered UserSerializer with ActiveModelSerializers::Adapter::Attributes (5.12ms)
Completed 200 OK in 7ms (Views: 4.9ms | ActiveRecord: 1.3ms)

This is the response I get every time I call that endpoint. I understand that the user is read to create the cache key, but the organizations are not being cached.

Postgres transaction log, for every request:

db_1               | LOG:  execute <unnamed>: SELECT  "users".* FROM "users" WHERE "users"."id" = 1 LIMIT 1
db_1               | LOG:  execute <unnamed>: SELECT "organizations".* FROM "organizations" INNER JOIN "memberships" ON "organizations"."id" = "memberships"."organization_id" WHERE "memberships"."user_id" = 1 ORDER BY "organizations"."name" ASC
db_1               | LOG:  execute <unnamed>: SELECT  "users".* FROM "users" WHERE "users"."id" = 1 LIMIT 1
db_1               | LOG:  execute <unnamed>: SELECT "organizations".* FROM "organizations" INNER JOIN "memberships" ON "organizations"."id" = "memberships"."organization_id" WHERE "memberships"."user_id" = 1 ORDER BY "organizations"."name" ASC
db_1               | LOG:  execute <unnamed>: SELECT  "users".* FROM "users" WHERE "users"."id" = 1 LIMIT 1
db_1               | LOG:  execute <unnamed>: SELECT "organizations".* FROM "organizations" INNER JOIN "memberships" ON "organizations"."id" = "memberships"."organization_id" WHERE "memberships"."user_id" = 1 ORDER BY "organizations"."name" ASC
db_1               | LOG:  execute <unnamed>: SELECT  "users".* FROM "users" WHERE "users"."id" = 1 LIMIT 1
db_1               | LOG:  execute <unnamed>: SELECT "organizations".* FROM "organizations" INNER JOIN "memberships" ON "organizations"."id" = "memberships"."organization_id" WHERE "memberships"."user_id" = 1 ORDER BY "organizations"."name" ASC

In Redis, the keys are being saved:

localhost:6379> keys *
1) "organizations/1-20170419212733959570/attributes/a74db0c5f71a4f9513eb81e760b03d2c"
2) "users/1-20170419212733980160/attributes/adcc32fd6ac06e7f189307a4bf1300e2"

Also, the cache key prefix I set here is not being used at all. As you can see, the Redis keys don't include that prefix.

So I'm wondering if anyone could explain what should be cached and what shouldn't.

Thanks!

harbirg commented 7 years ago

@mustela I think the reason caching isn't working on the User model is that its dependencies are not cached. If I'm right, you would need to add cache to the serializers for memberships and organizations as well.

mustela commented 7 years ago

@harbirg They do, unless there is another way to specify that. All the serializers inherit from https://github.com/mustela/ams-cache/blob/master/app/serializers/abstract_serializer.rb#L2

mustela commented 7 years ago

@bf4 The app I published has (I think) all the things you are mentioning. If you are familiar with Docker, you should be able to run the app easily. I can also generate more records or anything else you need. I would really love to understand how the cache works in AMS.

Thanks

harbirg commented 7 years ago

@mustela - I forked your repro here: https://github.com/harbirg/ams-cache. I switched over to Memcached as I did not have Redis installed, but that should not matter. Per the logs, I see that the User and Organization models are cached and read back with a cache hit. Are you not seeing the same behaviour? Also, reading back the cache was slightly slower than regenerating the JSON response, likely because it's only one user request; I put up some benchmark results.

If you review the responses with caching enabled and disabled, the DB is accessed for both the User and Organization models first in either case. With caching, it likely checks whether the cache entry is stale; if it is not, it reads back the Memcached version. In the non-caching case, it regenerates the JSON response after the DB access.

mustela commented 7 years ago

Thanks @harbirg, your tests are correct; that's what I'm seeing, and as you can see, using caching is much slower than not using it. I'm trying to put more tests/benchmarks in place to help here.

ledhed2222 commented 7 years ago

I just noticed something that seems fishy here. I have a resource and am using the Json adapter. I set things up in the model so that I preemptively write the cache when changes are made, so that the next user request always reads from the cache. After source-diving, it appeared that AMS adds the adapter name to the end of the cache key, which makes sense. So, for example, it might look like this:

resource/id-updated_at/json

I found, however, that I would still get cache misses, because AMS is actually trying to read from:

resource/id-updated_at/attributes

all the time. This is happening because the Json adapter is a subclass of Attributes. Json's implementation of serializable_hash is:

def serializable_hash(options = nil)
  options = serialization_options(options)
  serialized_hash = { root => Attributes.new(serializer, instance_options).serializable_hash(options) }
  serialized_hash[meta_key] = meta unless meta.blank?

  self.class.transform_key_casing!(serialized_hash, instance_options)
end

Attributes then calls the serializer's serializable_hash method, passing itself as the adapter:

# attributes:7
serialized_hash = serializer.serializable_hash(instance_options, options, self)

#serializer:356
def serializable_hash(adapter_options = nil, options = {}, adapter_instance = self.class.serialization_adapter_instance)
  adapter_options ||= {}
  options[:include_directive] ||= ActiveModel::Serializer.include_directive_from_options(adapter_options)
  resource = attributes_hash(adapter_options, options, adapter_instance)
  relationships = associations_hash(adapter_options, options, adapter_instance)
  resource.merge(relationships)
end

#serializer: 385
def attributes_hash(_adapter_options, options, adapter_instance)
  if self.class.cache_enabled?
    fetch_attributes(options[:fields], options[:cached_attributes] || {}, adapter_instance)
  elsif self.class.fragment_cache_enabled?
    fetch_attributes_fragment(adapter_instance, options[:cached_attributes] || {})
  else
    attributes(options[:fields], true)
  end
end

#caching:220
def fetch_attributes(fields, cached_attributes, adapter_instance)
  key = cache_key(adapter_instance)
  cached_attributes.fetch(key) do
    fetch(adapter_instance, serializer_class._cache_options, key) do
      attributes(fields, true)
    end
  end
end

All this means, as far as I can tell, that the cache is always reading from the Attributes version. I wonder if that could be the/an issue?
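
For reference, the key construction in caching.rb looks roughly like this (paraphrased from the 0.10-era source, not verbatim):

# lib/active_model/serializer/concerns/caching.rb (paraphrased)
def cache_key(adapter_instance)
  parts = []
  parts << object_cache_key            # e.g. "users/1-20170419212733980160"
  parts << adapter_instance.cache_key  # "attributes", since Json delegates
                                       # the actual work to an Attributes instance
  parts << serializer_class._cache_digest unless serializer_class._skip_digest?
  expand_cache_key(parts)              # joins the parts into the final key
end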

ledhed2222 commented 7 years ago

Looks like I can ameliorate the above issue in one of two ways:

1) using the Attributes adapter explicitly
2) using the Attributes adapter de facto by writing the cache with ActiveModelSerializers::SerializableResource.new(resource).as_json, which, again, actually writes to the attributes version

bf4 commented 7 years ago

@ledhed2222 Great catch! Do you think you could write a failing test for it? This was probably introduced later (by me).

Json's implementation of serializable_hash is:

def serializable_hash(options = nil)
  options = serialization_options(options)
-  serialized_hash = { root => Attributes.new(serializer, instance_options).serializable_hash(options) }
+  serialized_hash = { root => super(options) }

should be sufficient. Really, the two adapters should be unified again, IMHO.

ledhed2222 commented 7 years ago

Yeah, I can work on that at some point, @bf4. Thinking about this further, that issue definitely doesn't have anything to do with cache performance (obvious in hindsight).

I will say that I've personally encountered this now. I replaced an existing route that did not use AMS, with a cache hit rate of ~0.66 and an average request time of 250 ms, with an AMS implementation. My AMS implementation has a cache hit rate of ~0.86, but the average request time actually degraded to about 450 ms.

I decided to continue using AMS for serialization, but to hit the cache myself much more directly:

# Map each resource's cache key to its (not-yet-rendered) serializer.
serializers = user.resources.reduce({}) do |memo, resource|
  s = ActiveModelSerializers::SerializableResource.new(resource, options: options)
  memo[s.object_cache_key] = s
  memo
end

# One round-trip for all keys; the block runs only for cache misses.
results = Rails.cache.fetch_multi(*serializers.keys) do |missing_key|
  serializers[missing_key].as_json
end.values

render json: results

By skipping AMS for the cache-lookup and render steps, I got my average response time down to 190 ms. So, anecdotally, this looks like an issue of AMS doing too much work just to compute cache keys and read them, as indicated by the first poster.

neohunter commented 3 years ago

So, as I understand it, the problem lies here:

https://github.com/rails-api/active_model_serializers/blob/0-10-stable/lib/active_model/serializer/concerns/caching.rb

That causes cache computation to be more expensive than not using the cache at all, right? I guess it's because AMS attempts to cache based on the attribute set?

Questions:

1) According to the cache_enabled? and perform_caching? methods in that file, disabling the cache should be as simple as setting:

ActiveModelSerializers.config.perform_caching = false

in the environment's configuration file or in an initializer; that should even override the cache option defined in a serializer.

Wouldn't it be a good idea to add this to the README until this issue gets fixed?

2) To fix this, we would need to change the code in that file that checks whether a cache entry exists, probably to a simpler, more straightforward approach, but this requires further discussion.

3) And I have one strong doubt: is it always slower when using the cache, or only when using fragment caching with only or except? I don't see how it could be slower if it caches the whole serialized object. (A fragment-caching sketch follows at the end of this comment.)

4)

I can say that in my case caching worked like a charm. It saved me about 93% of serialization time. Using AMS 0.10.5. I'm serializing a lot of data though.

Before Caching:

[screenshot: Skylight.io trace, before caching]

After Caching:

[screenshot: Skylight.io trace, after caching]

(Images are from Skylight.io)

@mrsweaters Are you sure the after-caching case is faster? Why does it have more allocations? More allocations isn't better, is it? I wonder if the first result is 1569ms or 1.5ms.
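
Regarding question 3 above, a minimal sketch of the fragment-caching case (attribute names are hypothetical, and the split-and-merge behavior is my reading of fetch_attributes_fragment, not a measured claim):

class UserSerializer < ActiveModel::Serializer
  # With only:/except:, AMS splits the attributes into a cached half and a
  # non-cached half and merges the two on every render, so there is extra
  # work even on a cache hit.
  cache only: [:name], expires_in: 1.hour

  attributes :id, :name, :email
end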