active-hash / active_hash

A readonly ActiveRecord-esque base class that lets you use a hash, a Yaml file or a custom file as the datasource
MIT License
1.19k stars 178 forks source link

Accessing belongs_to association without a key creates a useless query #302

Open Catsuko opened 5 months ago

Catsuko commented 5 months ago

Environment

Problem

When accessing a belongs_to association, if the related id is missing then a query for id IS NULL is made. This is different from the standard ActiveRecord behaviour and I think in 99% of cases a redundant query. Seems to be caused by the internal use of find_by_id with a blank argument: https://github.com/active-hash/active_hash/blob/master/lib/associations/associations.rb#L177

You can reproduce and observe with this script:

require "bundler/inline"

gemfile(true) do
  source "https://rubygems.org"

  git_source(:github) { |repo| "https://github.com/#{repo}.git" }

  gem "rails", "7.1.1"
  gem "sqlite3"
  gem "active_hash", "3.2.1"
end

require "active_hash"
require "active_record"
require "logger"

ActiveRecord::Base.establish_connection(adapter: "sqlite3", database: "db/development.sqlite3")
ActiveRecord::Schema.define do
  create_table :authors, force: true
  create_table :magazines, force: true do |t|
    t.belongs_to :author, null: true
  end
end

class Author < ActiveRecord::Base
end

class Book < ActiveHash::Base
  include ActiveHash::Associations

  fields :author_id

  belongs_to :author
end

class Magazine < ActiveRecord::Base
  belongs_to :author, optional: true
end

book = Book.new(author_id: nil)
magazine = Magazine.create!(author_id: nil)

ActiveRecord::Base.logger = Logger.new(STDOUT)

puts "-" * 50
puts "Accessing ActiveHash association with NULL key"
book.author
puts "-" * 50
puts "Accessing ActiveRecord association with NULL key"
magazine.author
puts "-" * 50

Potential Solution

Don't bother querying if the key is NULL, if there is some weird edge case where you would want this then perhaps the behaviour could be enabled with a config option or optional argument when defining the association.

kbrock commented 3 months ago

So the request is to throw an error if the model does not have a fields :author_id or it does not respond_to?(:author_id)?

Wonder if this will cause an issue if a model defines associations before fields. (I almost think it would be a reasonable "breaking" change)

Catsuko commented 2 months ago

@kbrock

So the request is to throw an error if the model does not have a fields :author_id or it does not respond_to?(:author_id)?

No, it is that when the association foreign key is nil then just do a noop instead of calling find_by_X and making a useless db query. ActiveRecord associations will not create a database query in this case however active hash associations will: https://github.com/rails/rails/blob/main/activerecord/lib/active_record/associations/association.rb#L290-L304

Hope that makes sense 🙏

kbrock commented 2 months ago

Ugh. Your test script is pretty clear here. Sorry. (see the last query below)

I do like how rails avoids this query. I am a little concerned that people will want to query for a nil id. Yes, Rails only supports models with a unique present id, but since when have people followed strict Rails?

irb(main):004:0> Vm.find_by(:id => nil)
  Vm Load (2.2ms)  SELECT "vms".* FROM "vms" WHERE "vms"."id" IS NULL LIMIT $1  [["LIMIT", 1]]
=> nil
irb(main):005:0> VmOrTemplate.find(nil)
.../activerecord-6.1.7.7/lib/active_record/relation/finder_methods.rb:456:
in `find_with_ids': Couldn't find Vm without an ID (ActiveRecord::RecordNotFound)
irb(main):014:0> vm = VmOrTemplate.first ; nil
  VmOrTemplate Load (0.5ms)  SELECT "vms".* FROM "vms" ORDER BY "vms"."id" ASC LIMIT $1  [["LIMIT", 1]]
=> nil
irb(main):015:0> vm.ext_management_system ; nil
  ExtManagementSystem Load (1.1ms)  SELECT "ext_management_systems".*
  FROM "ext_management_systems" WHERE "ext_management_systems"."id" = $1 LIMIT $2 [["id", 3], ["LIMIT", 1]]
  ExtManagementSystem Inst Including Associations (0.2ms - 1rows)
=> nil
irb(main):016:0> vm = Vm.where(:ext_management_system => nil).first ; nil
  Vm Load (0.8ms)  SELECT "vms".* FROM "vms" WHERE "vms"."ems_id" IS NULL
  ORDER BY "vms"."id" ASC LIMIT $1  [["LIMIT", 1]]
=> nil
irb(main):017:0> vm.ext_management_system
=> nil
Catsuko commented 2 months ago

I do like how rails avoids this query. I am a little concerned that people will want to query for a nil id. Yes, Rails only supports models with a unique present id, but since when have people followed strict Rails?

Sadly I agree 😭 My next thought was an opt-in option on the association method to enable this behaviour. Seems clunky and it would be hard naming it but then at least it doesn't break existing usage.