FEATURE REQUEST: Use Rails.cache to cache api_formatted objects

coneybeare commented 13 years ago

The idea is that I would be able to fine tune a config block in the model to use memcache, expiration time, etc…

When the renderer renders the model instance as json, xml or whatever, the rendered element gets cached so that if it appears again on some other api call, it is there.

For simple models this is not necessary and could even slow things down, but on more complex models, where the api is calling methods that may induce joins or database calls, this can be a life saver.

I think that some automatic key should be generated such as apitemplate#{template}model#{class.name}id#{instance.id} and use that to query.

What do you think?

fabrik42 commented 13 years ago

Hey Matt,

a couple thoughts on this:

I think usually caching should be a task for controllers, not models. So we should think about the way to enable/use caching together with api templates (in the template declaration or the renderer).

But generally I think this is a good idea. We could start with equipping the Hash returned by model.as_api_response with a cache_key method (used by Rails to determine the cache key). Somehow like this:

response_hash = api_attributes.to_response_hash(self)
response_hash.instance_eval do
  def cache_key
    "#{class.to_s}_#{api_template.to_s}_{id}"
  end
end

bennytheshap commented 13 years ago

I don't know if github allows this, but I'd totally vote for it.

coneybeare commented 13 years ago

I like this, as long as we are inserting the cache key to each record, not just the entire returned structure. This is because someone might have list1, list2, and list3 with common elements, and those common element should only be fetched once. I am on Vacation now, but can work with you on this when I get back in a few weeks

fabrik42 commented 13 years ago

I fear this will get tricky as acts_as_api is not meant to be responsible for gathering the data, just rendering it.

But it's a really cool idea, please let me know when you have time to work on it so I can join you?

coneybeare commented 13 years ago

You brought up some good points that are making me second guess trying to put this in. For myself, I would probably just hack it to get it to work for my situation, not necessarily keeping the M and C (of MVC) separate. But putting this kind of hack into a published gem doesn't seem like a good idea :(

coneybeare commented 13 years ago

Thought a little about this. While caching complete methods and returns should definitely be controller based, I don't think there is necessarily anything wrong with a model caching itself. This is only similar how rails loads an ActiveRecord into memory instead of working directly out of the DB the entire time. Thus, putting logic into a model to cache itself, and only itself, for ease of reading later isn't so bad.

I propose we add an option to api_accessible called :cache which when set to true (default is false) would make the as_api_response check the Rails.cache first and return the value if hit, else, generate the response and cache using the specified key format. If we want to get fancy, we can give an option for a cache_key_prefix to eliminate collisions.

Thoughts?

fabrik42 commented 13 years ago

Sounds good. Maybe something like this:

You can define default caching behaviour as you proposed in the model

api_accessible :v1_default, :cache => 10.minutes do |t| 
  t.add :id
  t.add :name
end

But I think you should also be able to control the cache in the controller

format.json { render_for_api :v1_default, :json => @users, :root => :users, :api_cache => 2.minutes }

E.g. to clear the cache on #update

format.json { render_for_api :v1_default, :json => @users, :root => :users, :api_cache => false }

benedikt commented 13 years ago

I don't think caching should be part of acts_as_api at all. You could always cache the bare objects or use action/page caching...

bennytheshap commented 13 years ago

Where this gets tricky is with associations. Any ideas about how to handle that?

On Jun 6, 2011, at 9:06 AM, fabrik42 reply@reply.github.com wrote:

Sounds good. Maybe something like this:

You can define default caching behaviour as you proposed in the model
api_accessible :v1_default, :cache => 10.minutes do |t|
 t.add :id
 t.add :name
end
But I think you should also be able to control the cache in the controller
format.json { render_for_api :v1_default, :json => @users, :root => :users, :api_cache => 2.minutes }
E.g. to clear the cache on #update
format.json { render_for_api :v1_default, :json => @users, :root => :users, :api_cache => false }
Reply to this email directly or view it on GitHub: https://github.com/fabrik42/acts_as_api/issues/23#comment_1309355

coneybeare commented 13 years ago

@benedikt When looking at this from an api perspective, let me give you a use case for when this is particularly useful for me.

I have a search api for my app ambiance which is searching sounds, rendering an api response that calls in ratings (millions of rows join table). A search for "Rai" and "Rain" will turn up many of the same results, and are usually called right after each other as a predictive search. These results that turn up greatly benefit from caching the response and is essentially action caching, but at a level that makes sense for an API call.

After optimizing the DB calls to the max and profiling the code to remove any issues, the single bottle neck is the join call to get the rating field for the api. I am left with 2 options:

1) Introduce data redundancy in the tables by caching the rating field in the sound 2) Caching a response as indicated above so the next search call for this model will not have to do the same join on the ratings, only to produce the same result.

I like number 2 better. What else would you suggest?

coneybeare commented 13 years ago

@bennytheshap I don't see how it would matter as the caching is only caching the generated json or xml for an object. Nothing about how the xml or json is generated would change at all. As with any caching scheme, you will probably need to implement logic to detroy/update cached values when certain events trigger. Can you explain your concern a little more?

bennytheshap commented 13 years ago

My concern is that if I have a class A that is associated with class B (and elements of B are included in the API response for A), then if A's cache time is 10 minutes and B's is 5 minutes, should the 5 minutes win? Could A cache itself in parts so that it can avoid re serializing most of itself but then re-json B and stick it in?

On Mon, Jun 6, 2011 at 9:28 AM, coneybeare < reply@reply.github.com>wrote:

@bennytheshap I don't see how it would matter as the caching is only caching the generated json or xml for an object. Nothing about how the xml or json is generated would change at all. As with any caching scheme, you will probably need to implement logic to detroy/update cached values when certain events trigger. Can you explain your concern a little more?

Reply to this email directly or view it on GitHub: https://github.com/fabrik42/acts_as_api/issues/23#comment_1309493

benedikt commented 13 years ago

@coneybeare Try to store the result of the rating field in an instance variable so it gets serialized during the caching process. Then after you get the collection of sound objects from database iterate them and Rails.cache.fetch while making sure the rating gets calculated once in the block of Rails.cache.fetch.

fabrik42 commented 13 years ago

While I really understand the need of caching, I'm still not 100 percent convinced if acts_as_api is the right place to do this. As I mentioned before, acts_as_api is not meant to be responsible for collecting data, just rendering it.

I like the idea in general, but I fear that we implement something that won't fit the needs of any other user of the lib. As you can see in this thread, there are a lot of different caching strategies :) So maybe we should leave it up to the developer as caching a response from acts_as_api should be pretty simple.

@bennytheshap: I don't think this would be a good idea, because a) it would get pretty complicated and b) acts_as_api is orm agnostic. It does not know about associations, it just handles ruby methods. And caching every sub template on its own could lead to lots overhead (esp. if you don't have any other use for it).

coneybeare commented 13 years ago

@fabrik42 I agree that not everyone will benefit from this. Perhaps, instead of caching within the gem, we provide model hooks before and after the rendering. This way, users can do whatever they want without affecting other users of the gem.

For example, we can provide a before hook that asks if rendering should proceed. In that callback, I would check the cache for a key for the model instance. If found, I return it and the rendering is not necessary. If nil, the rendering continues and an after hook is called, letting the user cache or do whatever after the rendering has completed.

Thoughts on this approach?

On Jun 6, 2011, at 11:03 AM, fabrik42reply@reply.github.com wrote:

While I really understand the need of caching, I'm still not 100 percent convinced if acts_as_api is the right place to do this. As I mentioned before, acts_as_api is not meant to be responsible for collecting data, just rendering it.

I like the idea in general, but I fear that we implement something that won't fit the needs of any other user of the lib.

@bennytheshap: I don't think this would be a good idea, because a) it would get pretty complicated and b) acts_as_api is orm agnostic. It does not know about associations, it just handles ruby methods. And caching every sub template on its own could lead to lots overhead (esp. if you don't have any other use for it).

Reply to this email directly or view it on GitHub: https://github.com/fabrik42/acts_as_api/issues/23#comment_1309712

bennytheshap commented 13 years ago

I like it.

On Mon, Jun 6, 2011 at 10:28 AM, coneybeare < reply@reply.github.com>wrote:

@fabrik42 I agree that not everyone will benefit from this. Perhaps, instead of caching within the gem, we provide model hooks before and after the rendering. This way, users can do whatever they want without affecting other users of the gem.

For example, we can provide a before hook that asks if rendering should proceed. In that callback, I would check the cache for a key for the model instance. If found, I return it and the rendering is not necessary. If nil, the rendering continues and an after hook is called, letting the user cache or do whatever after the rendering has completed.

Thoughts on this approach?

On Jun 6, 2011, at 11:03 AM, fabrik42reply@reply.github.com wrote:

While I really understand the need of caching, I'm still not 100 percent convinced if acts_as_api is the right place to do this. As I mentioned before, acts_as_api is not meant to be responsible for collecting data, just rendering it.

I like the idea in general, but I fear that we implement something that won't fit the needs of any other user of the lib.

@bennytheshap: I don't think this would be a good idea, because a) it would get pretty complicated and b) acts_as_api is orm agnostic. It does not know about associations, it just handles ruby methods. And caching every sub template on its own could lead to lots overhead (esp. if you don't have any other use for it).

Reply to this email directly or view it on GitHub: https://github.com/fabrik42/acts_as_api/issues/23#comment_1309712

Reply to this email directly or view it on GitHub: https://github.com/fabrik42/acts_as_api/issues/23#comment_1309867

coneybeare commented 13 years ago

@benedikt What you suggest still doesn't solve the problem of rendering the same model multiple times. This is the issue we are trying to solve here, and where to put the logic or hooks to make this happen. I only gave a single example of how performance might be increased in a certain scenario. At least for my app, ruby-prof tells me that the majority of the time spent in an API call is in the rendering of the model, and thus I seek ways to minimize this across thousands of calls.

benedikt commented 13 years ago

@coneybeare You're right. But based on what you described above I assume it's not the rendering itself that is slow, but the method calls to your model performed by the rendering process?

coneybeare commented 13 years ago

@benedikt Yes that is true, it is the method calls. I guess it is preference… I would rather cache the entire rendered structure than cache the individual slow parts of it separately. To me, it seems cleaner, executes less code, uses less code to write and opens up fewer places to introduce error

coneybeare commented 13 years ago

I've got something working locally, but I have never implemented a callback method from a gem before so I ask the community if there is a better way. Basically, there are two default methods that can be overwritten in each model to do stuff with

      # Creates the api response of the model and returns it as a Hash.
      # Will raise an exception if the passed api template is not defined for the model
      def as_api_response(api_template)

        if response_hash = prerendered_api_response(api_template.to_s)
          return response_hash 
        end

        api_attributes = self.class.api_accessible_attributes(api_template)
        raise ActsAsApi::TemplateNotFoundError.new("acts_as_api template :#{api_template.to_s} was not found for model #{self.class}") if api_attributes.nil?

        api_response_rendered api_template.to_s, api_attributes.to_response_hash(self)        
      end

      protected

      # Callback to overwrite if rendering should proceed or not.
      # The input is the api_template name
      # Return nil if as_api_response should render, otherwise return the prerendered content
      def prerendered_api_response api_template
        nil
      end

      # Callback to overwrite if you want to do something with the response_hash before returning.
      # The input is the api_template name and the rendered_hash
      # Manipulate or do something with it, then return it when done
      def api_response_rendered api_template, api_response
        api_response
      end

I use it like this (not part of the gem):

  def prerendered_api_response api_template
    if ExampleServer::Application.config.action_controller.perform_caching && api_template
      Rails.cache.read "api_response_#{self.class.to_s}_#{id}_#{api_template.to_s}"
    end
  end

  def api_response_rendered api_template, api_response
    if ExampleServer::Application.config.action_controller.perform_caching && api_response
      Rails.cache.write "api_response_#{self.class.to_s}_#{id}_#{api_template.to_s}", api_response, :expires_in => 1.day
    end
    api_response
  end

fabrik42 commented 13 years ago

We also could try to implement it using the active model callbacks. (http://api.rubyonrails.org/classes/ActiveModel/Callbacks.html#method-i-define_model_callbacks)

But the problem here is that they don't support custom arguments to be passed to the callback functions.

coneybeare commented 13 years ago

@fabrik42 Yeah, i think we need the args because there can be many different api_templates. We also need a way to prevent execution of the method and I am not sure that can be accomplished with the active model callbacks. This is your decision here… I wan't to know which way you think is best before i give you a pull request

fabrik42 commented 13 years ago

Ok, a couple thoughts:

I'd call it before_api_response and after_api_response, even though it is not a real callback but more kind of filter I think it's still easier to understand + "render" in a model does not sound so good imho ;)
I would not predefine the methods. Instead I would use respond_to? to check if they exist.
I'd move the possible ActsAsApi::TemplateNotFoundError exception before the "before" method, to prevent that something is returned even though the api template does not exist anymore. In this example acts_as_api would deliver results when the cache exists while the api template was already removed.
If all of this works, would you mind to write a wiki entry about your caching?

coneybeare commented 13 years ago

I can write it up. The before_api_response is a little misleading though because if it returns anything other than nil, that content will be served instead of the real render. Perhaps we add three methods:

before_api_response: just a hook before, no stopping of execution
existing_api_response (or similarly named): Method that returns nil to proceed, anything else to serve that as well
after_api_response: just a hook before, no stopping of execution

fabrik42 commented 13 years ago

You're right about the potential mislead of the method names combined with this behaviour.

I think it should be before_api_response -> Only hook around_api_response -> Like the around callbacks or ActiveModel. You can do stuff before and after the execution of as_api_response and you have to yield it yourself (so you can also return something else in fact). after_api_response -> Only hook

coneybeare commented 13 years ago

I have been hacking away at it but I think the around documentation is pretty poorly written. I asked a SO question and will see what people say: http://stackoverflow.com/questions/6299001/how-can-i-use-custom-callbacks-on-an-active-model

fabrik42 commented 13 years ago

I think it should work something like this:

def as_api_response(api_template)
  api_attributes = self.class.api_accessible_attributes(api_template)
  raise ActsAsApi::TemplateNotFoundError.new("acts_as_api template :#{api_template.to_s} was not found for model #{self.class}") if api_attributes.nil?

  before_api_response(api_template) if respond_to? :before_api_response

  response_hash = if respond_to? :around_api_response
    around_api_response api_template do
      api_attributes.to_response_hash(self)
    end
  else
    api_attributes.to_response_hash(self)
  end

  after_api_response(api_template) if respond_to? :after_api_response

  response_hash
end

Then you could implement a around_api_response method like this:

def around_api_response(api_template)
  # do something
  # e.g. return something else instead

  # call the real as_api_response
  resp = yield

  # do something else
  return resp  
end

This is pretty similar to the usage of rails around callbacks

coneybeare commented 13 years ago

I figured I could do it this way, but was trying to use the official ActiveSupport way so as to support flexibility of different method names, lambdas, procs and blocks. I will dig in and figure it out.

On Jun 9, 2011, at 1:35 PM, fabrik42reply@reply.github.com wrote:

You're right about the potential mislead of the method names combined with this behaviour.

I think it should be before_as_api_response -> Only hook around_as_api_response -> Like the around callbacks or ActiveModel. You can do stuff before and after the execution of as_api_response and you have to yield it yourself (so you can also return something else in fact). after_as_api_response -> Only hook

Reply to this email directly or view it on GitHub: https://github.com/fabrik42/acts_as_api/issues/23#comment_1335806

fabrik42 commented 13 years ago

I already did this, but the problem is that the official callbacks don't support parameters.

For me it seems like run_callbacks(kind, *args, &block) once was meant to accepts args https://github.com/rails/rails/blob/master/activesupport/lib/active_support/callbacks.rb#L80

But the __define_runner method creates the callback methods in a way that they only accept one parameter (for per-key conditions) https://github.com/rails/rails/blob/master/activesupport/lib/active_support/callbacks.rb#L372

So if we want to use the official Callback methods we can't pass parameters. Maybe we need to save them in an instance var? I don't like it, but it would solve the problems...

coneybeare commented 13 years ago

Submitted a pull request

fabrik42 / acts_as_api

FEATURE REQUEST: Use Rails.cache to cache api_formatted objects #23