Closed fxn closed 5 years ago
Also, we have to think that bootsnap
does not own require
, there might be other decorations upwards. Zeitwerk for example also decorates require
. I believe it does so after bootsnap
, but other code may have done it before.
Another idea that could be perhaps related, is that perhaps bootsnap
could ignore calls with absolute paths. Zeitwerk internally works only with absolute paths.
Not exactly a proposal, you know better the principles behind bootsnap
, just an idea that came to mind while looking into this.
Well, to be exact, Zeitwerk sets autoloads using only absolute paths, and preloads/eager loads using absolute paths. Projects managed by Zeitwerk do not even need to be in $LOAD_PATH
, it just works if they are not by design.
But if require
gets a relative path it still tries to play well with it in case client code of gems using Zeitwerk issues some.
That is undocumented, it is an attempt to do as much as possible to work seamlessly.
cc @burke
Just to throw another option, perhaps the integration of Zeitwerk in Rails could also invalidate the bootsnap
cache.
Zeitwerk has reloading implemented, but it has to be triggered by client code. In the case of Rails, the framework is responsible for knowing when to reload, and when that happens it knows that it has to call Rails.autoloader.reload
and do so in a thread-safe way. Similarly, that integration point could tell bootsnap
to invalidate its cache. That could make sense, because files have changed, some may not even exist and a require
call should err instead of returning false
.
Probably, that invalidation should affect also the stable cache, because since engines are in autoload paths, they are reloaded too even if they are gems. I would like to change this in the future, but today that is the way it is.
As someone not familiar with bootsnap
, I still believe that delegating to require
with an absolute path to preserve semantics (ie, a file has been deleted, require
has been decorated, etc.) is an option that sounds good.
Maybe even both things would be the complete solution, because if the file system has changed and it turns out a/foo.rb
has been removed and at the same time b/foo.rb
has been created and a
and b
are in the load paths, foo
should resolve to a different absolute path.
In summary, a Rails programmer that changes the file system expects the changes to be available right away.
Only trying to give ideas and explain the uses cases. You guys are in a better position to decide which is the approach that squares better with bootsnap
design and trade-offs.
Hm, this is likely going to be tricky.
First point, we do some rather gross monkeypatching of the ActiveSupport autoloader (https://github.com/Shopify/bootsnap/blob/master/lib/bootsnap/load_path_cache/core_ext/active_support.rb). I'm not sure how much of this becomes irrelevant with, or conflicts with, Zeitwerk.
I'm not, in principle, opposed to making some minor performance compromises here to accommodate Zeitwerk.
It's really too bad that $LOADED_FEATURES
is modified from C-land, so we can't simply hook all the cases where it's mutated like we're able to with $LOAD_PATH
; that said, we should be able to hook whichever method Zeitwerk is using to mutate it, and probably respond accordingly to that manipulation.
Looking at https://github.com/Shopify/bootsnap/blob/2390a1a51cc8922ded8ae7aab3f8cf39cdde5847/lib/bootsnap/load_path_cache/loaded_features_index.rb, I'm noticing that we only currently track e.g. foo
out of /a/b/c/foo.rb
, so it would be hard to respond correctly to a $LOADED_FEATURES.delete
. We could plausibly correct this by doing something like:
@lfi[short] = full_feature.hash
@lfi[strip_extension(short)] = full_feature.hash
...then when deleting a feature, we could scan the index for suffixes of that path whose entry is the hash of that full path, pruning them as we go.
This would be a bit expensive, so we'd want to weigh how often this operation would happen versus the additional cost of maintaining a reverse index of the full feature path.
Hey @burke!
The integration is still WIP, but it could be the case that the Active Support monkey patches in bootsnap
do not interfere. The way Zeitwerk is being integrated is basically by calling take_over
from this module at some point in the initialization process. (Warning: I am force pushing to that branch.)
As you see, it is a quite lightweight decoration to preserve a subset of the interface that I have considered worth maintaining by now. The key observation that simplifies this integration is that Kernel#autoload
is called before const_missing
. Therefore, for a correct project structure, the autoloading code of AS::Dependencies won't ever run.
It could be the case that I unhook also AS::Dependeces in take_over
, just to play safe. And it could be the case that instead of bootsnap/setup
we generate a setup that disables bootsnap
's monkey patch (from memory I believe that happens if you disable auto_load_paths
). All is speculative right now.
But bootsnap
worked as is in a simple application, the only think that didn't work was reloading, for the reasons we are studying here. At least by now, maybe other things may be waiting to be discovered, but by now that is what I've seen.
Zeitwerk only calls $LOADED_FEATURES.delete
on reload. A reload is triggered by Rails when code changes, or when you call reload!
in the console, for example. You know. Anything that invokes AS::Dependencies.clear
. Maybe that helps leaning on a particular approach?
Maybe Zeitwerk could broadcast when reloading happens and it thus mutates $LOADED_FEATURES or similar like that? Instead of preying on the mutation.
Well, Zeitwerk itself is agnostic in that sense, it just offers reload
in case client code is a serivce that monitors the file system and wants to refresh the code.
But the Rails integration could call once a method that invalidates bootsnap
cache, for example:
def clear
Dependencies.unload_interlock do
Rails.autoloader.reload
Bootsnap.rescan
end
end
That is what I meant above by saying
Similarly, that integration point could tell
bootsnap
to invalidate its cache.
I have to say that as far as require
is concerned bootsnap
is not necessary for projects managed by Zeitwerk, because the library directly issues require
calls with absolute paths scanning once. But if Rails ships both of them, at least they should be able to coexist.
Regarding the last paragraph, if bootsnap
had a way to say "do not cache anywthing volatile, only stable", that could also be a possibility for Rails integration maybe.
Let me summarize the different options so far and what I believe are the cons of each one:
Provide API to rescan on file system changes: Costly and with little benefit. Zeitwerk carefully lazy scans the app tree precisely thinking about the cost of a tree traversal in Shopify's code base (discussed with @rafaelfranca). A rescan from bootsnap
would defeat that optimization. Additionally, it would do so with little benefit, since code managed by Zeitwerk does not issue require
calls with relative paths and bootsnap
cache for relative paths won't be leveraged.
Let require
call the parent implementation with the cached absolute path instead of returning a hard-coded false
for loaded files: The cache may be stale.
Provide API to ignore so-called volatile paths: Since bootsnap
would only scan a subset of $LOAD_PATH
, it could resolve to a wrong filename if the relative path should resolve to something outside the stable paths due to a (maybe deliberate) name coincidence.
Ignore require
calls with absolute paths totally, do not cache, always pass them up: The user may exceptionally call require
or require_dependency
for whatever reason with a relative path, and we are at square one.
~1.4.0.pre2~ 1.4.0.pre3 is pushed to rubygems, should work for you I think.
I believe this is working now, let's close.
Right now, the decorated
require
trusts its cache:So it does not preserve
require
semantics:I've seen some comments in the code base where it is clear this is a known trade-off rather than an overlook.
I am raising this issue because Zeitwerk depends on
require
semantics in the sense thatrequire
checks whatever$LOADED_FEATURES
has at the moment. It needs that behavior because that is the trick it uses to have new autoloads load the same file again on reload, Zeitwerk removes the files from$LOADED_FEATURES
.I don't know
bootsnap
enough to think about how we could make both things play well together, but something that comes to mind is that if the cache has the file, then just callrequire
with the corresponding absolute path. That is a trade-off, there is an extra$LOADED_FEATURES
lookup, but it is complaint.What do you think?