activerecord-hackery / ransack

Object-based searching.
https://activerecord-hackery.github.io/ransack/
MIT License
5.66k stars 799 forks source link

Duplicate (exponential) LEFT OUTER JOINS #1250

Open stevenjonescgm opened 3 years ago

stevenjonescgm commented 3 years ago

Ransack 2.4.2 (and 2.3.2, likely back to adding Rails 5.0 support b/c left_outer_joins changes) has an issue with failing to reuse pre-existing LEFT OUTER JOINS.

I'd hope that Postgres would optimize out the unused redundant joins, but at least in PG 12.7 it causes an exponential or possible infinite recursion consumption of time and Temp Space (until pg_timeout or temp space is exhausted).

I ran into this issue when tokenizing a search string into multiple words then running the terms through Ransack multiple times so I could generate a "match all terms, across multiple columns" like

base = Table.all
"one two".split.each do |term|
  base = base.ransack(rows_type_cont: term, row_title_cont: term, m: 'or').result
end
# "one two" => WHERE (rows.type = 'one' OR rows.title = 'one') AND (rows.type = 'two' OR rows.type = 'two')

I happened to notice the Ransack option for reusing shared context, so I tried that, but this ends up only including the final term. Not sure why this didn't work as expected and whether this was close to working or simply avoided the bug by practically resetting the context until the final term.

base = Table.all
context = Ransack::Context.for(base)
"one two".split.each do |term|
  base = base.ransack({ rows_type_cont: term, row_title_cont: term, m: 'or' }, { context: context }).result
end

I thought maybe the Ransack::Search.base was being replaced instead of modified, but didn't find a working solution.

In the end, I found an alternative solution: using build_grouping instead of context, like

base = Table.all.ransack # implicit AND
"one two".split.each do |term|
  base = base.build_grouping({ rows_type_cont: term, row_title_cont: term, m: 'or' })
end
base.result

Test Gist

Since I found a work-around, I stopped at a more integration-than-unit case:

# ala adding to spec/ransack/adapters/active_record/context_spec.rb:102

        describe '#join_dependency or likely #build_joins' do
          let(:expected_inner_join) {<<~SQL.gsub("\n", '') }
              SELECT "people".* FROM "people"
               INNER JOIN "articles" ON "articles"."person_id" = "people"."id" AND ('default_scope' = 'default_scope')
               WHERE "articles"."title" = 'Bar' AND "articles"."title" = 'Foo'
               ORDER BY "people"."id" DESC
          SQL
          let(:expected_left_outer_join) {<<~SQL.gsub("\n", '')}
              SELECT "people".* FROM "people"
               LEFT OUTER JOIN "articles" ON "articles"."person_id" = "people"."id" AND ('default_scope' = 'default_scope')
               WHERE "articles"."title" = 'Bar' AND "articles"."title" = 'Foo'
               ORDER BY "people"."id" DESC
          SQL

          it 'should reuse existing inner joins' do
            # For unknown reasons, Search.new uses Context.for_class instead of Conext.for_object as .ransack() does
            # This loses the .joins and .where (etc)
            base_persons = Person.joins(:articles).where(articles: {title: 'Bar'})
            sql = Search.new(base_persons, { articles_title_eq: 'Foo' }).result.to_sql
            expect(sql).to match(expected_inner_join), -> { sql_mismatch_message(expected_inner_join, sql) }
          end

          it 'should reuse existing inner joins' do
            base_persons = Person.joins(:articles).where(articles: {title: 'Bar'})
            sql = base_persons.ransack({ articles_title_eq: 'Foo' }).result.to_sql
            expect(sql).to match(expected_inner_join), -> { sql_mismatch_message(expected_inner_join, sql) }
          end

          it 'should reuse existing left_outer_joins' do
            base_persons = Person.left_outer_joins(:articles).where(articles: {title: 'Bar'})
            sql = base_persons.ransack({ articles_title_eq: 'Foo' }).result.to_sql
            expect(sql).to match(expected_left_outer_join), -> { sql_mismatch_message(expected_left_outer_join, sql) }
          end

          it 'should reuse existing joins from ransack' do
            result = Person.ransack({ articles_title_eq: 'Bar' }).result
            result = result.ransack({ articles_title_eq: 'Foo' }).result
            sql = result.to_sql
            expect(sql).to match(expected_left_outer_join), -> { sql_mismatch_message(expected_left_outer_join, sql) }
          end
        end

        def sql_mismatch_message(expected, actual)
          <<~ERROR
              expected:
                #{expected.gsub(/(FROM|INNER|LEFT|WHERE|ORDER BY)/, "\n  \\1")}
              actual:
                #{actual.gsub(/(FROM|INNER|LEFT|WHERE|ORDER BY)/, "\n  \\1")}
          ERROR
        end
?1 [21-09-14 16:52:32] % rspec ./spec/ransack/adapters/active_record/context_spec.rb
====================================================================================
Running Ransack specs with SQLite, Active Record 6.0.4.1, Arel 10.0.0 and Ruby 2.6.6
====================================================================================
..........F.FF..

Failures:

  1) Ransack::Adapters::ActiveRecord::Context#join_dependency or likely #build_joins should reuse existing inner joins
     Failure/Error: expect(sql).to match(expected_inner_join), -> { sql_mismatch_message(expected_inner_join, sql) }

       expected:
         SELECT "people".*
         FROM "people"
         INNER JOIN "articles" ON "articles"."person_id" = "people"."id" AND ('default_scope' = 'default_scope')
         WHERE "articles"."title" = 'Bar' AND "articles"."title" = 'Foo'
         ORDER BY "people"."id" DESC
       actual:
         SELECT "people".*
         FROM "people"
         LEFT OUTER JOIN "articles" ON "articles"."person_id" = "people"."id" AND ('default_scope' = 'default_scope')
         WHERE "articles"."title" = 'Foo'
         ORDER BY "people"."id" DESC
     # ./spec/ransack/adapters/active_record/context_spec.rb:121:in `block (3 levels) in <module:ActiveRecord>'

  2) Ransack::Adapters::ActiveRecord::Context#join_dependency or likely #build_joins should reuse existing left_outer_joins
     Failure/Error: expect(sql).to match(expected_left_outer_join), -> { sql_mismatch_message(expected_left_outer_join, sql) }

       expected:
         SELECT "people".*
         FROM "people"
         LEFT OUTER JOIN "articles" ON "articles"."person_id" = "people"."id" AND ('default_scope' = 'default_scope')
         WHERE "articles"."title" = 'Bar' AND "articles"."title" = 'Foo'
         ORDER BY "people"."id" DESC
       actual:
         SELECT "people".*
         FROM "people"
         LEFT OUTER JOIN "articles" ON "articles"."person_id" = "people"."id" AND ('default_scope' = 'default_scope')
         LEFT OUTER JOIN "articles" "articles_people" ON "articles_people"."person_id" = "people"."id" AND ('default_scope' = 'default_scope')
         WHERE "articles"."title" = 'Bar' AND "articles"."title" = 'Foo'
         ORDER BY "people"."id" DESC
     # ./spec/ransack/adapters/active_record/context_spec.rb:133:in `block (3 levels) in <module:ActiveRecord>'

  3) Ransack::Adapters::ActiveRecord::Context#join_dependency or likely #build_joins should reuse existing joins from ransack
     Failure/Error: expect(sql).to match(expected_left_outer_join), -> { sql_mismatch_message(expected_left_outer_join, sql) }

       expected:
         SELECT "people".*
         FROM "people"
         LEFT OUTER JOIN "articles" ON "articles"."person_id" = "people"."id" AND ('default_scope' = 'default_scope')
         WHERE "articles"."title" = 'Bar' AND "articles"."title" = 'Foo'
         ORDER BY "people"."id" DESC
       actual:
         SELECT "people".*
         FROM "people"
         LEFT OUTER JOIN "articles" ON "articles"."person_id" = "people"."id" AND ('default_scope' = 'default_scope')
         LEFT OUTER JOIN "articles" "articles_people" ON "articles_people"."person_id" = "people"."id" AND ('default_scope' = 'default_scope')
         WHERE "articles"."title" = 'Bar' AND "articles"."title" = 'Foo'
         ORDER BY "people"."id" DESC
     # ./spec/ransack/adapters/active_record/context_spec.rb:140:in `block (3 levels) in <module:ActiveRecord>'

Finished in 1.31 seconds (files took 3.58 seconds to load)
16 examples, 3 failures

Failed examples:

rspec ./spec/ransack/adapters/active_record/context_spec.rb:116 # Ransack::Adapters::ActiveRecord::Context#join_dependency or likely #build_joins should reuse existing inner joins
rspec ./spec/ransack/adapters/active_record/context_spec.rb:130 # Ransack::Adapters::ActiveRecord::Context#join_dependency or likely #build_joins should reuse existing left_outer_joins
rspec ./spec/ransack/adapters/active_record/context_spec.rb:136 # Ransack::Adapters::ActiveRecord::Context#join_dependency or likely #build_joins should reuse existing joins from ransack

Coverage report generated for RSpec to /Users/steven/Dev/ransack/coverage. 162 / 198 LOC (81.82%) covered.
SimpleCov failed with exit 1

Debugging Leads

I noticed that lib/ransack/adapters/active_record/context.rb:259 build_joins is trying to duplicate the ActiveRecord::QueryMethods#build_joins pattern without actually calling those methods. Unfortunately, Rails has modified this internal/private method most every minor version https://github.com/rails/rails/blob/4-2-stable/activerecord/lib/active_record/relation/query_methods.rb#L1016 https://github.com/rails/rails/blob/5-0-stable/activerecord/lib/active_record/relation/query_methods.rb#L1042 https://github.com/rails/rails/blob/5-1-stable/activerecord/lib/active_record/relation/query_methods.rb#L987 https://github.com/rails/rails/blob/5-2-stable/activerecord/lib/active_record/relation/query_methods.rb#L990 https://github.com/rails/rails/blob/6-0-stable/activerecord/lib/active_record/relation/query_methods.rb#L1127 https://github.com/rails/rails/blob/6-1-stable/activerecord/lib/active_record/relation/query_methods.rb#L1260 https://github.com/rails/rails/blob/main/activerecord/lib/active_record/relation/query_methods.rb#L1443 (as of yet unchanged for 6-2, just more lines above it)

For the Ransack implementation, the previous left_outer_joins ends up in

stashed_association_joins = buckets[:stashed_join]

which, afaict, goes unused.

P.S. If I had the time and inclination, I'd try to contribute refactors to Rails 6.2 to make these methods more discrete/compositional before trying to copy or module-include to Ransack and modify only the couple parts needed.

I regret to open a new issue when Ransack has several open issues for JOINS, but those are mostly specific to intentional joins (inheritance and aliases etc).

I expect there is a root-cause for many of them, but the relevant code is beyond my time available for this issue.

hcyildirim commented 1 year ago

have you found any solution to this? @stevenjonescgm