soundcloud / lhm

Online MySQL schema migrations
BSD 3-Clause "New" or "Revised" License
1.83k stars 190 forks source link

Adding failing test and fix for potential data destruction #129

Closed camilo closed 8 years ago

camilo commented 9 years ago

Problem

This is actually a big deal, it seems it is possible to destroy data for a table with a single row when the single's row id is not 1.

It seems connected to this patch https://github.com/soundcloud/lhm/pull/39/files#diff-5f043f4ad9ef9057e658146216f47dbfR28.

Solution

We'll follow up with a patch ASAP, for now I want a sanity check, Am I not seeing something that would prevent this piece of code:

 def execute
      return unless @start && @limit
      @next_to_insert = @start
      while @next_to_insert < @limit || (@next_to_insert == 1 && @start == 1)
        stride = @throttler.stride
        affected_rows = @connection.update(copy(bottom, top(stride)))

        if @throttler && affected_rows > 0
          @throttler.run
        end

from skipping a single row if that rows id were say 1001 ?

@jasonhl @sroysen @singerwang @oliverf @arthurnn @erikogan @pellegrino @durran

camilo commented 9 years ago

test run with this:

00% complete
....Everything is clean. Nothing to do.
100% complete
100% complete
100% complete
100% complete
100% complete
100% complete
100% complete
100% complete
.........

Finished in 29.573579s, 1.8936 runs/s, 3.2461 assertions/s.

  1) Failure:
Lhm::Chunker::copying#test_0001_should copy 1  rows from origin to destination even if the id of the single row does not start at 1 [/home/vagrant/src/lhm/spec/integration/chunker_spec.rb:26]:
Expected: 1
  Actual: 0
camilo commented 9 years ago

Naive fix added.

ghost commented 9 years ago

Actually the patch (https://github.com/soundcloud/lhm/pull/39/files#diff-5f043f4ad9ef9057e658146216f47dbfR28) was not the core issue..

So we have two cases with 1 row:

Before the patch, the logic was

until @next_to_insert >= @limit

Which means it will fail both CASE1 and CASE2. So basically any table with 1 row will fail.

The patch fixes the logic for CASE1, but still fails CASE2. So basically now it works for any table with 1 row as long as the row id is 1

while @next_to_insert < @limit || (@next_to_insert == 1 && @start == 1)

I think a cleaner solution is

until @next_to_insert > @limit
arthurnn commented 9 years ago

i guess this is related to https://github.com/soundcloud/lhm/pull/127

camilo commented 9 years ago

@arthurnn it is, can we get either merged ?

camilo commented 8 years ago

ping @arthurnn

arthurnn commented 8 years ago

thanks bro ! :)

arthurnn commented 8 years ago

sorry for the delay reviewing this.

tdeo commented 8 years ago

This isn't exactly the same fix as what I had in mind there. I have the impression that if the number of lines is a multiple of stride + 1 this will still fail lines = k*stride + 1, and this fix only takes care of the case k = 0.

Should I try to rebase my PR on the latest master to see if the test I added passes?

arthurnn commented 8 years ago

@tdeo go ahead, with the rebase and let me know if there is a problem to be fixed.