jamis / bulk_insert

Efficient bulk inserts with ActiveRecord
MIT License
819 stars 84 forks source link

bulk_insert does not return a worker object #66

Open fredwillmore opened 4 years ago

fredwillmore commented 4 years ago

I am using version 1.8.1, ruby 2.3.8, rails 4.2.5 I am trying to use non-block mode, passing an array of attribute hashes. My model is named Gap:

worker = Gap.bulk_insert(return_primary_keys: true)

however, bulk_insert returns nil. (block mode also returns nil). Looking at the code it looks like it only returns a worker if bulk_insert is only called without values or a block:

    def bulk_insert(*columns, values: nil, set_size:500, ignore: false, update_duplicates: false, return_primary_keys: false)
      columns = default_bulk_columns if columns.empty?
      worker = BulkInsert::Worker.new(connection, table_name, primary_key, columns, set_size, ignore, update_duplicates, return_primary_keys)

      if values.present?
        transaction do
          worker.add_all(values)
          worker.save!
        end
        nil
      elsif block_given?
        transaction do
          yield worker
          worker.save!
        end
        nil
      else
        worker
      end
    end

How would I access the worker object in order to look at result_sets? thanks, fw

mberlanda commented 3 years ago

hi @fredwillmore , it looks like the behaviour you are describing is covered by unit tests https://github.com/jamis/bulk_insert/blob/master/test/bulk_insert_test.rb . Can you confirm you can reproduce it consistently?

travismiller commented 3 years ago

I was surprised by this as well.

The test sets an outer result variable from inside the block. So this is the way to access the worker and result_sets when using values or a block with the existing code.

https://github.com/jamis/bulk_insert/blob/ab5db0873098701904ac2fcdcab86e417d342895/test/bulk_insert_test.rb#L9-L13

However, the README indicates that a worker would be returned when using a block.

https://github.com/jamis/bulk_insert/blob/ab5db0873098701904ac2fcdcab86e417d342895/README.md#L176-L192

As mentioned in the original issue comment, the worker is only returned if no values are present and no block is given.

https://github.com/jamis/bulk_insert/blob/ab5db0873098701904ac2fcdcab86e417d342895/lib/bulk_insert.rb#L11-L25

I think it should be appropriate to always return the worker. Maybe something like this?

def bulk_insert #…
  # …

  if values.present?
    transaction do
      worker.add_all(values)
      worker.save!
    end
  elsif block_given?
    transaction do
      yield worker
      worker.save!
    end
  else

  worker
end
mberlanda commented 3 years ago

@jamis do you remember why you choose to return nil when implementing this feature here? https://github.com/jamis/bulk_insert/commit/95a0c43a09e745dc7f79bd766336890ca86a19bb

jamis commented 3 years ago

I don't remember exactly. :/ However, looking at the diff, I suspect it was an attempt to make the return value consistent, so that it always returns either a worker object, or nil. I vaguely remember being worried about the consequences of someone trusting the method to return self, and introducing a bug when they got a worker object back instead.