Query on slave returning wrong results in test

agis commented 7 years ago

Hey,

After switching some queries to be sent to the slaves, I've noticed a specific test failing due to the on_slave scope returning wrong results:

[2] pry(#<RSpec::ExampleGroups::ShopsController::GETIndex>)> Shop.where("created_on > ?", 5.days.ago)
[master]  Shop Load (1.2ms)  SELECT `shops`.* FROM `shops` WHERE (created_on > '2017-01-14 11:54:33.639784')
 # => [#<Shop:0x007fa103931290>]
# returns the `Shop` record as expected, test succeeds

[3] pry(#<RSpec::ExampleGroups::ShopsController::GETIndex>)> Shop.on_slave.where("created_on > ?", 5.days.ago)
[slave]  Shop Load (0.8ms)  SELECT `shops`.* FROM `shops` WHERE (created_on > '2017-01-14 11:56:41.160893')
# => [] 
# test fails

Oddly enough, Slavery.on_slave still sends the queries to the master:

[4] pry(#<RSpec::ExampleGroups::ShopsController::GETIndex>)> Slavery.on_slave { Shop.where("created_on > ?", 5.days.ago) }
[master]  Shop Load (0.9ms)  SELECT `shops`.* FROM `shops` WHERE (created_on > '2017-01-14 11:58:43.070048')
# => [#<Shop:0x007fa103931290>]

The failing test is roughly the following:

  describe "foo" do
    let!(:shop) { FactoryGirl.create(:shop) }

    it "foooooo" do
      shop.update_attribute :created_on, 3.days.ago
      shop.update_attribute :updated_on, 3.days.ago

      get :index, { :since => 5.days.ago }
      expect(assigns(:shops)).to eq([shop])
    end
  end

I've also tried reducing the query to Shop.count/Shop.on_slave.count, same issue: the result is 1 vs. 0.

Any ideas?

Thanks

agis commented 7 years ago

A simpler demonstration, inside the same test:

[7] pry(#<RSpec::ExampleGroups::ShopsController::GETIndex>)> Shop.count
[master]   (0.6ms)  SELECT COUNT(*) FROM `shops`
=> 1
[8] pry(#<RSpec::ExampleGroups::ShopsController::GETIndex>)> Slavery.on_slave { Shop.count }
[slave]   (0.6ms)  SELECT COUNT(*) FROM `shops`
=> 0
[9] pry(#<RSpec::ExampleGroups::ShopsController::GETIndex>)> Shop.on_slave.count
[slave]   (0.6ms)  SELECT COUNT(*) FROM `shops`
=> 0

agis commented 7 years ago

It seems that Slavery doesn't execute the query on master if it's inside a transactional test, therefore cannot see the changes performed inside the transaction which started on master.

Is this intented? Shouldn't we detect that we're inside a test's transaction (ie transactional fixtures is on) and run all queries on master?

If that sounds good I could prepare a PR.

agis commented 7 years ago

@kenn Any thoughts on this?

kenn commented 7 years ago

Slavery cannot be used inside a transaction - if you try, you'll see an exception. Only exception allowed is transactional fixtures for tests.

Do you actually use separate database for test and test_slave? Otherwise it's weird to see different results anyway. If you point to the same database with read-only access you should see the same result no matter what.

And if you use separate databases, there will be delay synching from master to slave, which makes the test indeterministic depending on the timing.

Either way, I think it's an issue specific to your test, not of Slavery.

agis commented 7 years ago

I use a single database in test so test and test_slave are identical. The access isn't read-only but I don't think this is the cause of this failure.

I think the failure is expected: if a transaction is started in connection A, then another connection B can't see the changes of the connection A until the transaction is commited (and it's never commited in a test).

I've created a sample app demonstrating the issue: https://github.com/agis-/slavery-gh-15; commits are separated logically so it might help to look at them. The test case can be found at https://github.com/agis-/slavery-gh-15/blob/master/test/models/user_test.rb.

In other words, when transactional fixtures are enabled, the following sounds expected to me:

test "foo" do
  assert_equal 0, User.count
  assert_equal 0, User.on_slave.count

  # this happens in a transaction in `master` connection (let's call it connection A)
  # but is *not* yet commited, therefore only connection `A` knows about this change
  User.create!

 # this succeeds since it runs in A
  assert_equal 1, User.count

 # this fails since it runs in another connection (let's call it `B`) which has no knowledge of
 # changes in A (since that transaction is not commited yet)
  assert_equal 1, User.on_slave.count
end # transaction is rolled back

kenn commented 7 years ago

Are you using sqlite3 for the test? It's not suppored for nested transactions. Can you try either MySQL or PostgreSQL?

Also I'm not sure why you think User.create! is not committed.

In any case, I already mentioned that Slavery does not support transaction, so you can either force commit it or just use fixture.

agis commented 7 years ago

The issue persists with MySQL (commit).

User.create! does not create the record because this is how transactional fixtures work. The transaction is not commited but is rolled-back at the end of the test. log/test.log demonstrates this behavior, which is expected when using transactional fixtures:

[master]   (0.1ms)  SET FOREIGN_KEY_CHECKS = 1
[master]   (0.1ms)  BEGIN
------------------
UserTest: test_foo
------------------
[master]   (0.2ms)  SELECT COUNT(*) FROM `users`
[slave]   (0.3ms)  SELECT COUNT(*) FROM `users`
[master]   (0.1ms)  SAVEPOINT active_record_1
[master]  SQL (0.2ms)  INSERT INTO `users` (`created_at`, `updated_at`) VALUES ('2017-01-24 10:34:25', '2017-01-24 10:34:25')
[master]   (0.1ms)  RELEASE SAVEPOINT active_record_1
[master]   (0.2ms)  SELECT COUNT(*) FROM `users`
[slave]   (0.2ms)  SELECT COUNT(*) FROM `users`
[master]   (0.6ms)  ROLLBACK

We're using FactoryGirls instead of Rails' fixtures, but I don't think that's an issue. The issue is this:

If transactional fixtures is enabled, any slave queries inside a test don't see the database writes performed on the master.

The above statement is verified by the provided test case in the sample app.

kenn commented 7 years ago

Ah ok, thanks for clarification. Then yeah I agree that the issue is obscure and we need a better way to handle this by default.

Off the top of my head, we could automatically set Slavery.disabled = true when we detect use_transactional_fixtures = true (or in any test?) but I'm not sure if that's a good idea.

What's your suggestion?

agis commented 7 years ago

I was thinking something like this, which is minimally intrusive I think:

diff --git a/lib/slavery/base.rb b/lib/slavery/base.rb
index f061bb5..eaa702f 100644
--- a/lib/slavery/base.rb
+++ b/lib/slavery/base.rb
@@ -13,7 +13,7 @@ module Slavery
     def decide_with(target)
       raise Slavery::Error.new('on_slave cannot be used inside transaction block!') if inside_transaction?

-      if Slavery.disabled
+      if Slavery.disabled || inside_transactional_fixture?
         :master
       else
         target
@@ -25,6 +25,13 @@ module Slavery
       open_transactions > Slavery::Transaction.base_depth
     end

+    def inside_transactional_fixture?
+      testcase = ActiveSupport::TestCase
+      defined?(testcase) &&
+        testcase.respond_to?(:use_transactional_fixtures) &&
+        testcase.try(:use_transactional_fixtures)
+    end
+

Thoughts?

kenn commented 7 years ago

The rationale for global disable in tests is that, we don't usually care if a read is from master or slave. It should just work.

But in a specific test case where you want to test the behavior of Slavery, you can turn it back on (Slavery.disabled = false) - but the same issue is still there if you write to the database in the test. We can mention in README that we should not write to the database though.

kenn commented 7 years ago

Hmm, the change makes it impossible to test the behavior because inside_transactional_fixture? is baked in. I would change the default value of Slavery.disabled instead, so you can easily change where you need to do so.

agis commented 7 years ago

Hm yeah, changing the default seems like the way to go. At this point I'm thinking if just a README section instructing users to disable slavery manually while running tests is a better solution.

Anyway, I'm submitting a PR.

kenn commented 7 years ago

On second thought, I'm hesitant to disable entirely in all tests.

It just gives a false illusion that Slavery doesn't break anything.
If it's just Slavery.disable = true, you can do it in test_helper.
I think missing uncommitted data can be a good test to find that slave read can lag behind master. You shouldn't use on_slave where immediate consistency is required.

So instead, I would suggest something like this in test_helper:

if ActiveRecord::Base.connection.supports_transaction_isolation?
  ActiveRecord::Base.connection.execute('SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED')
  ...
end

or

User.transaction(isolation: :read_uncommitted) do
  ...
end

in a specific test where you actually want phantom read. It's probably a documentation issue more or less.

What do you think?

agis commented 7 years ago

Totally agreed.

Personally I'd go with a Slavery.disabled = true as I think it's just fine. I can't think of a use-case where you'd like to use slave reads in a test environment.

Do you think we should mention the transaction isolation stuff in the README? I think it might be too much for the common case (where ppl are using a single database for testing).

I think something like this is enough:

Slavery does not work with Rails' transactional fixtures, since the slave connection is not aware of the changes performed in the master connection. You are suggested to disable Slavery in the test environment by putting the following in test/test_helper.rb or spec/spec_helper.rb:
Slavery.disabled = true

kenn commented 7 years ago

It works just fine with transactional fixtures (I do it all the time), but fails when and only when you write to the database inside the same test. So:

When use_transactional_fixtures is set to true, it's NOT recommended to write to the database besides fixtures, since the slave connection is not aware of the changes being performed in the master connection due to transaction isolation. In that case, you are suggested to disable...

is better?

agis commented 7 years ago

@kenn Updated: https://github.com/kenn/slavery/pull/16. Feel free to edit it as you wish.

kenn / standby

Query on slave returning wrong results in test #15