ageweke / fortitude

Views Are Code: use all the power of Ruby to build views in your own language.
MIT License
73 stars 16 forks source link

Widget caching #54

Open ajb opened 8 years ago

ajb commented 8 years ago

A slightly more updated version of #39

ageweke commented 8 years ago

@ajb — this is great!

I’ve been spending a lot of time thinking about this, and I’m currently leaning towards the idea of making caching a separate gem that layers on top of Fortitude. The reasons are subtle, but basically boil down to caching being complex (bugs are nasty, exactly what cache keys should be can be tricky, it’s easy to flood your cache if you’re not careful, etc.) and not wanting to have that load placed on Fortitude’s core.

One question I have for you: the @virtual_path stuff seems like the only part of this that wouldn’t work very cleanly in a separate gem. (This is exactly the sort of thing that around_content is for!) Can you elaborate a little more on what that’s used for? In your code it looks like it’s required to be set, and is passed properly, but nothing in the diff actually uses it — I assume that means Rails magically uses it under the covers for caching somehow?

ajb commented 8 years ago

Yeah, you've got it. Rails looks at the virtual_path in order to find the template file to digest: https://github.com/rails/rails/blob/92703a9ea5d8b96f30e0b706b801c9185ef14f0e/actionview/lib/action_view/helpers/cache_helper.rb#L204

ageweke commented 8 years ago

OK, I just went spelunking down quite the rabbit-hole. I have a question for you, and then a broader discussion:

Question: What specific case was failing for you that caused you to add the virtual_path stuff into this PR? As far as I can tell, virtual_path already gets set correctly in almost all cases; the only one where it doesn’t is when using render :widget (or possibly also render :inline) from a controller. The consequence of not having virtual_path set there is that if you actually change the code in the widget you’re rendering with render :widget, things won’t cache-bust properly, and you might still get the old cache contents. But perhaps there’s another case that I’m totally missing.

Discussion: I just discovered a giant piece of madness in Rails I didn’t even realize existed — this code. In short, it tries to use regular expressions on ERb template files to build a dependency tree for a view. The goal is to add the hashes of a view template, and every template that view template calls, recursively, to the cache digest used in the cache call. The net result is that if you use Rails’ fragment caching, and you change either the underlying view or any of the views it calls, recursively, the cache gets busted (because the key changes) and things work properly. This can matter both in development mode (most common) but also in production, if you’re (a) using a persistent store like Memcache and (b) don’t in any other way invalidate it on each deploy.

There are a couple of problems with this. Generic to Rails, this kind of “dependency tracking” is very far from perfect, since it’s using regular expressions to parse code. It’ll catch common patterns like changing a partial you call from your template, but it definitely won’t catch things like calls to helpers that are defined elsewhere, and so on. Specific to Fortitude, in order to have this sort of dependency tracking work properly, someone would need to write code that tried to parse Fortitude widgets to determine dependencies. This is probably a significantly more difficult task than it is for ERb. (Ruby syntax is more flexible than ERb, there are more ways to invoke other widgets than rendering partials, Fortitude code is going to be more highly factored — which is part of the whole point — and so on.) I think I read something about Ruby exposing its own AST to other Ruby code more easily in recent versions, so maybe that could be used? — but it still seems like quite a task.

Currently, I’m disinclined to try to tackle that whole bag of worms. I think the result would be something that would “mostly work” — which, personally, is something I kind of hate in software, because it’d work often enough you’d tend to forget about it, but, when it failed, the kinds of bugs it causes can be really hard to track down, and you wouldn’t stop to think about this as a potential cause for quite some time. The ROI just doesn’t seem to be there. It’s much cleaner just to tell people “don’t turn on caching in development, or, if you do, you have to clear the cache manually when you change templates; in production, make sure you bust the cache on every deploy”. That’s honestly something you really need to do to get 100% correct results using ERb fragment caching anyway.

Coming back to your original proposal — what I’d love to do is to sort out any remaining virtual_path issues and put that into Fortitude core. This would then make it really easy to build a fortitude-caching gem on top of the core that would add any kind of syntax anyone desires to make widgets cacheable automatically.

ajb commented 8 years ago

I agree with you about dependency tracking, and not trying to implement it for Fortitude. I think that would quickly turn into a losing battle for the reasons you mention.

I'll look into your virtual_path question, but I remember it being vital for everything that I implemented.

ajb commented 8 years ago

I'm having a lot of trouble getting my test suite to work properly, as well as getting this PR up-to-date with master, but I'm pretty convinced that the virtual_path stuff needs to be in here. If it's not set in Fortitude, where else would it be set?

ageweke commented 8 years ago

Re: test suite — let me know if I can help. I saw the issue you open/closed around bundler exec, but happy to help with anything else.

Re: virtual_path — it gets set on the ActionView::Template here. Locally, I just put a puts in CacheHelper#fragment_name_with_digest to print out the virtual_path it was using. The only way I ended up with a nil virtual_path is if I was using render :widget directly from a controller. (These were all tests with Rails 5.0.0.1.)

There is something slightly tricky that goes on here — it doesn’t change the outcome, but it’s worth understanding.

Under ERb, due to all the weird instance-variable copying that happens, @virtual_path will be set directly in a view (e.g., <pre><%= @virtual_path %></pre> will show you what it’s set to), because your ERb code gets compiled up into a method that gets splatted onto an instance of an anonymous subclass of ActionView::Base.

Under Fortitude, your widget class is much “cleaner” — it’s a completely separate object, and thus @virtual_path will be nil. However, when you call helpers (like cache), they get delegated back to that instance of an anonymous subclass of ActionView::Base, and so have access to things like @virtual_path, and thus should “just work”.

I’m not aware of any place where this actually breaks something, but it’s worth noting in case it makes a difference…

ajb commented 8 years ago

Gotcha!

I suspect then, that the issue is that I'm calling include ActionView::Helpers::CacheHelper inside of lib/fortitude/widget/caching.rb, which means that when cache is called, it's no longer delegating to the anonymous subclass, and doesn't have access to @virtual_path. Would that make sense?

ageweke commented 8 years ago

Ah, excellent. That does make sense.

You shouldn’t actually need to include CacheHelper at all — helpers should be available to you automatically in any Fortitude widget.

If you remove that line, does it actually break? If so, how?

ajb commented 8 years ago

I will have to get back to you on this, once I fix these broken tests :)

On Mon, Oct 17, 2016 at 11:51 AM, Andrew Geweke notifications@github.com wrote:

Ah, excellent. That does make sense.

You shouldn’t actually need to include CacheHelper at all — helpers should be available to you automatically in any Fortitude widget.

If you remove that line, does it actually break? If so, how?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ageweke/fortitude/pull/54#issuecomment-254298168, or mute the thread https://github.com/notifications/unsubscribe-auth/ABNiLXU63mTLEhev9VVGgcybotSV_ovgks5q08OZgaJpZM4KBDUQ .

Adam Becker (951) 9-BECKER @AdamJacobBecker

ajb commented 8 years ago

OK, I've got my dev environment back to normal, and I just rebased this branch on top of master.

As we expected, everything works fine when I remove include ActionView::Helpers::CacheHelper.

The only question left is: when a widget is rendered via render widget:, and that widget has the cachable directive, what is the expected behavior? Should it throw an error? If so, I'm not sure how to check for the absence of @virtual_path, since only the helpers have access to it.

ageweke commented 8 years ago

OK, awesome! Glad to hear it.

Re: render :widget and cacheable…it’s really up to you. Without @virtual_path set, Rails caching does still work, and works fine; the only difference seems to be that since it doesn’t know what view you’re rendering, it doesn’t do the view-digesting magic that it otherwise does. The net result of this is that it’ll cache, but, if you change any of the code in the widget for render :widget, it won’t bust the cache, and you’ll get the old, cached view data instead.

If it were me, I’d simply accept this behavior as fine, and document it, since it’s a pretty edge case — and probably mostly applies to development mode, where view caching is disabled by default anyway.

The alternative would be to check for the presence of @virtual_path and do something different (raise an error?) if it’s not set. The easiest way to do that would be to have a caching gem add its own internal Rails helper (_fortitude_caching_get_virtual_path) that simply returns @virtual_path (this would need to be in a legit Rails helpers file, not something mixed into a Fortitude class). You could also do it by saying invoke_helper :instance_variable_get, "@virtual_path" from within a widget, which actually ought to do the same thing.

You probably have a better idea than I do what the right path is, though — you’re using render :widget much more than I ever have.

In fact, if you have a few, I’d be curious to know what causes you to use render :widget instead of separate view files — always useful to understand better what people’s use cases are.

ajb commented 8 years ago

With regard to busting the cache, I you're assuming that the cache will usually be manually cleared during a deploy? (Correct me if I'm wrong.)

With Rails fragment caching, I believe the recommended behavior is not to do that, since you want to have as warm of a cache as possible.

As for render widget:, we don't actually use it, although we do use Widget::Foo.to_html a bit. (Although now that Rails 5 has ApplicationController.render, we could use that, which would populate @virtual_path correctly.)

My main concern is that we make some mistake and use cachable when @virtual_path is nil, and we don't realize that the view isn't being digested until we start seeing stale views in production.

I think raising an error would be appropriate, and in the case that you're aware of the fact that the template cannot be digested, and want to cache anyway, you can pass skip_digest: true as an option.

I'll take a shot at creating an external gem for this.

ajb commented 8 years ago

Hmm, I'm getting an undefined method error for the cache method when rendering a widget directly, via Views::Foo.new.to_html. I can't square this with https://github.com/ageweke/fortitude/pull/43, which seems to add the Rails helpers to every view. Any idea what's up?

I was expecting this to work:

screen shot 2016-10-18 at 2 39 05 pm
ageweke commented 8 years ago

Re: your last comment — #43 doesn't actually add the Rails helpers to every view. All it does is change it so that if Fortitude’s automatic_helper_access setting is false, you should still have access to built-in Rails helpers in your widgets — just not user-defined ones.

The issue you’re running into is one I think you’ve run into in the past. In short, the only place you can use Rails helpers is in an actual view or partial, or something else rendered via a controller action. What you’re doing with Fortitude is equivalent to doing this with ERb:

puts ERB.new("Helper is: <%= distance_of_time_in_words_to_now(Time.now - 1_000_000) %>").result(binding)

If you do that, even running in a Rails console, you’ll see it fail…for the exact same reasons as it does in Fortitude. Here’s what’s going on:

Rails helpers — despite many appearances otherwise — don’t actually exist as global methods all over Rails. Instead, what happens is that when Rails gets ready to render a view, it creates this magic “context” object — an instance of an anonymous subclass of ActionView::Base, which I’m going to call the “view rendering object”. Depending on various settings (the helper_method method on a controller, the include_all_helpers flag, and so on), it then mixes in various modules full of helpers into this view rendering object. It then uses the view rendering object as the “binding” for ERb code — that is, your ERb view code gets compiled into the view rendering object and run from there. That’s why ERb views have access to those helpers. Fortitude is slightly different in that your Fortitude widget is a separate class, but there’s method_missing magic to delegate helpers automatically to the view rendering object.

Critically, it’s important to note that there is no such thing as a “global” or “generic” view rendering object, nor could there be. There are at least two reasons for this: (a) exactly which helpers end up in that object depends on the exact controller/action you’re executing; and (b) some helpers require additional context to be set on that object — like @controller, for example — to work properly. In particular, all the helpers that generate URLs (like the really-old url_for, and the newer _path, _url methods) require access to @controller to grab the current request and extract the host/protocol/port from it — this is how Rails can generate full URLs without being told explicitly what host/protocol/port it’s running on, and even work properly via proxies, multiple hostnames, etc.

Above, you’re trying to render a view without a view rendering object. Under the scenes, Fortitude notices this, and ends up using a new bare Object instead, since it has nothing else to fall back on. If you could get a hold of a view rendering object, you could easily pass one in:

Views::Test.new.to_html(Fortitude::RenderingContext.new(delegate_object: my_view_rendering_object))

…and it would all “just work”. This is just like doing in ERb:

puts ERB.new("Helper is: <%= distance_of_time_in_words_to_now(Time.now - 1_000_000) %>").result(my_view_rendering_object.send(:binding))

If you really need to render lots of widgets outside the context of a Rails controller/action, it’s probably possible to gin up a fake view rendering object that does a “good enough” job, and pass that in…but it’s never going to be perfect; for example, what would it use for that host/port on the incoming request, since there isn’t an incoming request? To me, the gotchas are too big to make Fortitude hack together one of those itself, since it’d always be a pretty gross hack of some kind.

This is very clearly something I need to put into the Fortitude FAQ — I believe jdickey ran into the exact same issues. It’s a subtle and tricky set of issues, and not at all obvious to nearly anybody since it’s almost always covered up by Rails’ internal magic.

I hope this makes sense, and outlines why you’re running into that issue. One question — what’s the reason for trying to render using a bare #to_html? That method’s really intended for use by folks not running via Rails at all…

ageweke commented 8 years ago

Re: caching and @virtual_path. I’ve been thinking about this one quite a bit.

The problem of stale views is going to be there, unfortunately, no matter what you do. Even with ERb, helper methods and various unusual ways of invoking partials (and probably other stuff, too) is going to go undetected by that (IMHO very janky) DependencyTracker, and you’re going to have cache-invalidation headaches already. With Fortitude they’re worse, because it can’t track child dependencies at all.

Ironically, I’m actually reluctant to worry about @virtual_path too much exactly because setting it only takes you from “pretty damn broken caching” to “only slightly less broken caching”. To me, a cache that I understand completely only busts when I manually change the key is actually much better than a cache that “mostly invalidates correctly, but a small fraction of the time doesn’t”. The former puts worrying about cache invalidation front and center into your hands, while the latter pretends to be magic but actually bites you in the butt in some horribly hard-to-reproduce ways (like when a critical change gets made inside a helper method, and so the tracker doesn’t pick it up).

However, when you build your gem, this can be completely under your control. All you need to do is add a cache method on some widget class or module you mix in, and it can look something like this:

def cache(*args, &block)
  raise "No @virtual_path!" unless shared_variables[:virtual_path]
  super(*args, &block)
end

…and there you go. shared_variables is a Fortitude method that gives you direct access to what you’d reference as @foo in ERb by indexing it as a Hash instead.

Last, crazy crazy crazy idea:

I realized that there’s a way of building dependency tracking for caching in Fortitude that’s actually probably at least as good as ERb, if not better, and is a lot less gross. Basically, in order for any Fortitude widget to get rendered, it needs a RenderingContext. (Methods like to_html can take one, and create an empty one if you don’t pass one in.) That object gets #start_widget!(widget) and #end_widget!(widget) called on it whenever a new widget is rendered, all the way down through partials or anything else.

So, it’d be easy enough to create a subclass of RenderingContext that simply records all of the classes of all the widgets invoked during the rendering of a view, turns that back into filenames through obvious string manipulation, and then uses that as the dependency set. Seems to me that’s a lot cleaner than using regular expressions on the source text. However, it’s still far from perfect: if, for example, a widget sometimes calls a partial Foo and sometimes doesn’t, then, depending on which invocation you’re tracing, you may or may not pick up that dependency. (In the end, I don’t remember enough of my CS Theory to prove that this is basically the halting problem and therefore impossible in the general case…but I’m pretty sure it is. ;)

ajb commented 8 years ago

Thanks for the patient explanation. (Again?)

This certainly makes sense. I'm really embarrassed that after working with Rails for years, there is still magic like this that trips me up time and time again.

Here's a v1 of the fortitude-caching gem: https://github.com/dobtco/fortitude-caching.

I was already working on this when you posted your last comment, so give me a little while to read it and digest :)

ageweke commented 8 years ago

Whoops, I got confused…I’d explained that bit of madness to other people before in different threads, but not you. My apologies! Take it as evidence that it trips up everyone. I certainly didn’t have any freaking clue that that’s how that stuff worked before I wrote Fortitude, and I’ve been doing Rails since 2008, so…yeah. Some of the guts of Rails is pretty crazy.