rom-rb / rom

Data mapping and persistence toolkit for Ruby
https://rom-rb.org
MIT License
2.07k stars 161 forks source link

Batch insert with one INSERT query #680

Closed paddor closed 2 years ago

paddor commented 2 years ago

Using the changeset API, I see that many INSERT queries are executed one by one. It should be one query with multiple row values.

solnic commented 2 years ago

How is that a bug?

paddor commented 2 years ago

I would have expected better from a data mapper library. Or is there a way to do it in ROM?

solnic commented 2 years ago

OK but this is a potential optimization, the feature isn't broken. Can you come up with a reproduction script?

paddor commented 2 years ago

Here you go:

require 'logger'
require 'securerandom'
require 'rom'
require 'rom-sql'

class UsersRelation < ROM::Relation[:sql]
  schema :users, infer: true

  def self.create_table(gateway)
    gateway.create_table(:users) do
      primary_key :id
      column      :name, String, null: false
    end
  end
end

class Repo < ROM::Repository[:users]
  def insert_fake_records(n)
    users.transaction do
      records = Array.new(n) do
        {
          name: SecureRandom.uuid,
        }
      end

      # FIXME: This causes many INSERTs instead of one.
      # FIXME: result: :one doesn't seem to do do anything different than :many. Also, :none is missing.
      users.command(:create, result: :one).call records
    end
  end
end

logger = Logger.new STDERR
config = ROM::Configuration.new :sql, 'sqlite::memory' do |config|
  config.default.use_logger logger
end

UsersRelation.create_table config.gateways[:default]
config.register_relation UsersRelation # NOTE: If done before table creation, this can lead to a race condition.

container = ::ROM.container config
repo      = Repo.new container

repo.insert_fake_records 10

Currently the output looks like this:

I, [2022-05-19T11:54:44.523035 #554317]  INFO -- : (0.000163s) CREATE TABLE `users` (`id` integer NOT NULL PRIMARY KEY AUTOINCREMENT, `name` varchar(255) NOT NULL)
I, [2022-05-19T11:54:44.526123 #554317]  INFO -- : (0.000151s) SELECT sqlite_version()
I, [2022-05-19T11:54:44.526516 #554317]  INFO -- : (0.000163s) PRAGMA table_xinfo('users')
I, [2022-05-19T11:54:44.527421 #554317]  INFO -- : (0.000071s) PRAGMA index_list('users')
I, [2022-05-19T11:54:44.527665 #554317]  INFO -- : (0.000072s) PRAGMA foreign_key_list('users')
I, [2022-05-19T11:54:44.529088 #554317]  INFO -- : (0.000153s) SELECT NULL AS 'nil' FROM `users` LIMIT 1
I, [2022-05-19T11:54:44.532041 #554317]  INFO -- : (0.000049s) BEGIN
I, [2022-05-19T11:54:44.538489 #554317]  INFO -- : (0.000559s) INSERT INTO `users` (`name`) VALUES ('59c68a6f-7360-4401-a681-9d51f25251a0')
I, [2022-05-19T11:54:44.538791 #554317]  INFO -- : (0.000045s) INSERT INTO `users` (`name`) VALUES ('0f39f231-04fd-4456-ae94-5768686b5eba')
I, [2022-05-19T11:54:44.539023 #554317]  INFO -- : (0.000038s) INSERT INTO `users` (`name`) VALUES ('82f0f19f-1f08-477c-b313-5ea5f85081bb')
I, [2022-05-19T11:54:44.539276 #554317]  INFO -- : (0.000063s) INSERT INTO `users` (`name`) VALUES ('e60808b0-bce2-49bf-95c5-f94bcd7d8b9b')
I, [2022-05-19T11:54:44.539614 #554317]  INFO -- : (0.000042s) INSERT INTO `users` (`name`) VALUES ('11407bdb-5906-4e0e-ae6d-7010ddc9b8b7')
I, [2022-05-19T11:54:44.539892 #554317]  INFO -- : (0.000040s) INSERT INTO `users` (`name`) VALUES ('8215e251-e985-4f50-b64f-25b169744bae')
I, [2022-05-19T11:54:44.540282 #554317]  INFO -- : (0.000078s) INSERT INTO `users` (`name`) VALUES ('85a0bf57-d213-4c69-90ab-23f78f262644')
I, [2022-05-19T11:54:44.540555 #554317]  INFO -- : (0.000043s) INSERT INTO `users` (`name`) VALUES ('15ad4d04-f8bb-47c8-8304-0ee9fd416ff3')
I, [2022-05-19T11:54:44.540881 #554317]  INFO -- : (0.000041s) INSERT INTO `users` (`name`) VALUES ('979c07cb-4188-44c6-afe1-2a33df2cb4d6')
I, [2022-05-19T11:54:44.541286 #554317]  INFO -- : (0.000069s) INSERT INTO `users` (`name`) VALUES ('552cb135-5b74-425d-9ad2-4cc8cd916ebc')
I, [2022-05-19T11:54:44.543076 #554317]  INFO -- : (0.000411s) SELECT `users`.`id`, `users`.`name` FROM `users` WHERE (`id` IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)) ORDER BY `users`.`id`
I, [2022-05-19T11:54:44.543482 #554317]  INFO -- : (0.000042s) COMMIT

I'd like to only see one INSERT (and also a way to avoid the trailing SELECT).

paddor commented 2 years ago

I noticed UsersRelation#multi_insert actuall does what I need:

class Repo < ROM::Repository[:users]
  def fast_insert_fake_records(n)
    records = Array.new(n) do
      {
        name: SecureRandom.uuid,
      }
    end

    users.multi_insert records
  end
end
I, [2022-05-19T12:33:43.386272 #565843]  INFO -- : (0.000040s) BEGIN
I, [2022-05-19T12:33:43.386474 #565843]  INFO -- : (0.000082s) INSERT INTO `users` (`name`) VALUES ('5d9d30e0-e110-4489-ae04-48471dcc4f9a'), ('e995cf0b-832f-4781-a420-c4e00dc70f36'), ('071382f3-4b5d-4d8b-a60c-93fcc6d78ccb'), ('ade35aae-2d7a-4cb8-8bdf-717968682e17'), ('a341e2d7-84c6-4444-a085-206bfc7c0d6f'), ('99eac8bb-1185-4ea0-8b3a-00d78b559bae'), ('6786229d-2de5-43a4-b174-886240992977'), ('72339d0b-9a30-4727-92ae-531b40a0c6da'), ('ace9fa8e-f273-4c27-b437-7addc28c519d'), ('14efed71-a554-4f8a-8b4e-c30b8ce1cb08')
I, [2022-05-19T12:33:43.387374 #565843]  INFO -- : (0.000171s) COMMIT
solnic commented 2 years ago

Looks like behavior in Sqlite is different when you do #multi_insert(tuples, return: :primary_key) vs #multi_insert(tuples). In case of the former, multiple inserts are executed. We can optimize it by introducing an sqlite-specific create command. I'm gonna schedule it for 4.0.0 and it can be backported if somebody finds time to do so.

solnic commented 2 years ago

Closing in favor of https://github.com/rom-rb/rom-sql/issues/411