stimulusreflex / stimulus_reflex

Build reactive applications with the Rails tooling you already know and love.
https://docs.stimulusreflex.com
MIT License
2.28k stars 172 forks source link

ActiveRecord queries and associations are cached, leading to non-idempotent Reflexes #446

Closed leastbad closed 3 years ago

leastbad commented 3 years ago

Bug Report

Original pretense:

If you run a page morph reflex multiple times, any lazy-loaded associations in the controller action are cached, even if the data is mutated in the Reflex action. This has the unfortunate effect of making it seem like the data isn't changing, even though it is.

To Reproduce

class TasksController < ApplicationController
  def index
    @tasks = current_user.tasks
  end
end
class ExampleReflex < ApplicationReflex
  def toggle
    task = Task.find(element.dataset[:id])
    task.update(completed: (task.completed? ? false : true ))
  end
end
<% @tasks.each do |task| %>
  <%= check_box_tag(dom_id(task, :toggle), nil, task.completed?, data: { reflex: "click->Example#toggle", id: task.id }) %>
<% end %>

The issue can be mitigated by forcing the controller to re-query the association:

class TasksController < ApplicationController
  def index
    @tasks = current_user.tasks.order(:id)
  end
end

I had a hunch that this is happening because we memoize the request and controller in reflex.rb, but changing ||= to = didn't seem to make a difference.

Updated understanding

Thanks to the research done by @jonsgreen, we have a deeper understanding of what the real issue is here: ActiveRecord associations are cached by request. This works great for ActionDispatch::Request and terrible for ActionCable::Connection because it means Reflexes are not necessarily idempotent.

It seems as though there are multiple caches which need to be busted:

Query cache: https://apidock.com/rails/ActiveRecord/ConnectionAdapters/QueryCache/clear_query_cache

Association cache: https://github.com/rails/rails/blob/main/activerecord/lib/active_record/associations.rb#L328-L331

Memoized collection ids: https://github.com/rails/rails/blob/main/activerecord/lib/active_record/associations/collection_association.rb#L53

As of right now, we still don't know how to solve this in a generic way. You can force an association to reload by making it do something or force a reload, but that might be sub-optimal:

current_user.reload

Adding to the potential confusion is the fact that in some (many?) cases it could be highly desirable for the caching to persist! After all, if the API is understood well enough to program "with the grain", this could be an excellent way to speed up Reflex operations. The issue is really that it's not well understood and that it feels different compared to the classic request/response cycle.

Expected behavior

In theory, every page morph should run its own distinct controller action, without side effects from previous page morphs.

In reality, we have three choices:

It would be nice to discuss this further in the comments.

Versions

StimulusReflex

External tools

Browser

hopsoft commented 3 years ago

This is likely due to referencing objects delegated to the ActionCable connection, perhaps current_user etc...

The socket connection is persistent between reflexes; however, when a reflex message is pushed to the server from the client over the socket, the following things happen.

  1. A new reflex instance is created to respond to the message
  2. A new request is created from within the reflex instance
  3. A new controller is created from within the reflex instance

This means that the reflex, request, and controller instances are not shared between reflex invocations. The only thing shared across reflexes is the persistent connection and anything hanging off of it.

You can think of reflex handling as similar to a web request in that the environment is reconstructed to respond. There is no shared state other than the socket connection, so ActiveRecord query cache should not be an issue for anything not hanging on the connection.

julianrubisch commented 3 years ago

I was just trying to wrap my head around this when @hopsoft nailed it 👏

Anyway, now that it’s mentioned here I think I’ve met this in the past in some of my apps, but never bothered me too much.

As for mitigation strategies, I’d go for providing a manual method of flushing things, if possible. The question is, if we can get hold of all instances that are hanging off Connection, else I’d just document it

hopsoft commented 3 years ago

My vote is for documentation. Perhaps we can show folks how to initialize things like current_user on the reflex instance or maybe recommend a reload for anything hanging on the connection in a before_reflex callback.

leastbad commented 3 years ago

This makes a lot of sense! Let me say it back in my own words so I can be confident that we're all on the same page...

The only objects which persist across multiple reflexes (and presumably ActionCable callbacks, which I've never seen used in the wild) are relations + associations that are part of the Connections identified_by Set.

If a Reflex - or a controller called by a Page Reflex - happens to do queries or access associations, they won't be persisted unless they use current_user or some other accessor that has been delegated to the connection.

I think this means we should document this as a "the more you know..." item, and offer some mitigation strategies, but not actually change the implementation. @jonsgreen does this make sense to you?

jonsgreen commented 3 years ago

Thanks everyone for digging into this issue. I think that just adding to the documentation including an example of how to bust the cache with a current_user.reload would be more than sufficient.

I suppose if this becomes a common problem that people start reporting more frequently then as others have suggested the library could by default and/or with configuration reload all ActiveRecord objects that are part of the identified_by Set defined here: https://github.com/rails/rails/blob/main/actioncable/lib/action_cable/connection/identification.rb#L21

leastbad commented 3 years ago

Believe me, I thought about it!

However, now that we know what's happening and how to fix it, I sort of like that we can shift from seeing this as a bug to a feature. It means that the most frequently referenced accessor has the option to take advantage of cached results.

It's entirely reasonable to consider this a performance optimization that you can opt-out of at any time.

I have updated the docs: https://docs.stimulusreflex.com/rtfm/reflex-classes#queries-and-associations-are-cached