rails / rails

Ruby on Rails
https://rubyonrails.org
MIT License
55.55k stars 21.52k forks source link

Autoloaded classes raise NameError when accessed in parent's on_load hook during load #51327

Open bensheldon opened 5 months ago

bensheldon commented 5 months ago

This is not a new issue with Rails autoloading, but a challenge I feel when working with Engines (either developing engines as a gem maintainer or consuming engines as an application developer)

For example, an Engine may have a setup like this:

module GoodJob
  class ApplicationJob < ActiveJob::Base
    # some configuration
    ActiveSupport.run_load_hooks(:good_job_application_job, self)
  end

  class UtilityOneJob < ApplicationJob; end
  class UtilityTwoJob < ApplicationJob; end
  class UtilityThreeJob < ApplicationJob; end
  # ... and many more
end

I give an example of GoodJob, but this is a very analogous situation to other Engines with autoloaded classes/subclasses (models, controllers, etc.); for example Devise, which has a DeviseController (with load_hook) that is then subclassed with a desire to extend.

I want to be able to allow application developers to extend the subclasses, and I want to defer autoloaded constants as much as possible (ideally not touching Action Controller or Active Model at all until during initialization), and this seems like it should be reasonable:

# config/initializers/good_job.rb

ActiveSupport.on_load(:good_job_application_job) do
  # Customize a subclass
  GoodJob::UtilityJobOne.queue_name = "something_custom"
end

This usually works ok, except in the case when the UtilityJobOne being loaded causes ApplicationJob to be loaded for the very first time, in which case it raises a uninitialized constant UtilityJobOne (NameError). This is because of the autoloading chain:

GoodJob::UtilityJobOne first must load its parent class GoodJob::ApplicationJob for the first time, which triggers the load_hook which itself references UtilityOneJob which has not yet completed loading and thus a NameError 💥

There is what I consider a workaround for this which is to use inherited (which is triggered _after the subclass constant is loaded), but I find it somewhat complex and wonder if there is a better way, or whether this is something that should be turned into a hook pattern:

# config/initializers/good_job.rb

module ConfigureUtilityJob
  def inherited(subclass)
    super
    if subclass.to_s == "UtilityJobOne"
      subclass.queue_name = "something_custom"
    end
  end
end

ActiveSupport.on_load(:good_job_application_job) do
  GoodJob::ApplicationJob.extend ConfigureUtilityJob
end

I realize that another solution as a engine maintainer is to add load hooks to every subclass, but that seems onerous, and as an application developer I'd have to convince other engine maintainers to do so (which is totally ok, so long as there isn't a better way I'm missing or a reasonable central interface for modifying autoloaded subclasses).

rails-bot[bot] commented 2 months ago

This issue has been automatically marked as stale because it has not been commented on for at least three months. The resources of the Rails team are limited, and so we are asking for your help. If you can still reproduce this error on the 7-2-stable branch or on main, please reply with all of the information you have about it in order to keep the issue open. Thank you for all your contributions.

skipkayhil commented 2 months ago
module GoodJob
  class ApplicationJob < ActiveJob::Base
    # some configuration
  end
end
ActiveSupport.run_load_hooks(:good_job_application_job, GoodJob::ApplicationJob)

Does the problem go away if you move the load hook outside of the class definition so the class can finish loading first? I've noticed that Rails does some load_hooks inside and some outside class definition and maybe this is a reason that they should all be outside.

bensheldon commented 2 months ago

@skipkayhil thank you! Yeah, I think moving outside the class definition would be optimal. I think that PR could be made relatively simply (and maybe warned against doing it inside the class).

I asked that in the Discord a while back and seemed unsure:

I'm curious if there's an intention behind whether a ActiveSupport.run_load_hooks is invoked before a class definition is ended, or after. It's inconsistent: ActiveJob::Base invokes its load hook before the class definition end: https://github.com/rails/rails/blob/029d31ca31ab72df7bb79372f4ff057231fd0196/activejob/lib/active_job/base.rb#L77 ActiveRecord::Base invokes its load hook after the class definition end: https://github.com/rails/rails/blob/649e99bf9267b0cbc55decf2603519cfc9f9d4fa/activerecord/lib/active_record/base.rb#L337-L337 Invoking the load hook before the class definition end can also cause loading problems. For example, if I want to do something in an ActiveSupport.on_load(:active_job) that touches a subclass of ActiveJob::Base, it will lock up (because the subclass's definition tries to load ActiveJob::Base but which itself hasn't finished loading). It would be fairly easy to consistently move all of the ActiveSupport.run_load_hooks to after the class definition end, but I thought I'd ask first 🙏
skipkayhil commented 2 months ago

Nice, I'm glad we had the same idea!

I tried to reduce this to a minimal reproduction:

# a.rb
class A
  # B
end

# B

# b.rb
autoload :A, "./a.rb"

class B < A
end

However, when running ruby b.rb with B uncommented in either location I get the NameError. I think I'm understanding the problem better, but now I'm not sure if moving the load_hook will fix the issue. In either case, the load_hook fires while the < A is being resolved, so B never has a chance to be defined.

I realize that another solution as a engine maintainer is to add load hooks to every subclass, but that seems onerous, and as an application developer I'd have to convince other engine maintainers to do so (which is totally ok, so long as there isn't a better way I'm missing or a reasonable central interface for modifying autoloaded subclasses).

I'm not sure how blessed this is, but as an engine managed by Zeitwerk you could use loader.on_load: https://github.com/fxn/zeitwerk?tab=readme-ov-file#the-on_load-callback

Rails.autoloaders.main.on_load("GoodJob::UtilityJobOne") do |job, _abspath|
  job.queue_name = "something_custom"
end

That would potentially give you the granularity of configuring individual classes without the maintenance of defining load_hooks for each class?

bensheldon commented 2 months ago

I just opened up #52280 to move the ActiveSupport.run_load_hooks. It seems conceptually like the right thing to do, even if the reproduction is weird (I need to play with that more myself!)

Also, I remembered the trouble with Zeitwerk's loader.on_load: it doesn't get called if the constant has already been loaded.

In particular, if the target was already loaded when the callback is defined, the block won't run.

Which makes it brittle because disordered applications may load constants at weird times, and I'm not sure if my gem/engine can enforce the requirement that modification happen before Zeitwerk setup.

bensheldon commented 2 months ago

Aha, looking at my test failures, I'm realizing that there seem to be two distinct uses/purposes for Active Support load hooks:

That second usage surprised me when going through in my PR because it there area few places where mixin modules use the hook to extend the mixin itself:

https://github.com/rails/rails/blob/f2caad083cb09423dc4ec7474bd189e0227b2417/activerecord/lib/active_record/test_fixtures.rb#L40-L40