rails-on-services / apartment

Database multi-tenancy for Rack (and Rails) applications
378 stars 156 forks source link

Multiple SELECT DATABASE() being triggered in MySQL #228

Open stephanohainz opened 1 year ago

stephanohainz commented 1 year ago

We are seeing a lot of SELECT database() being shown at the MySQL logs.

Steps to reproduce

Configuration

Using the "Switching Tenants per request" configuration proposed on documentation.

Run a HTTP request to a specific tenant. In our case, we are just doing a generic HTTP request to retrieve some data and doing some basic validations in the model.

# config/initializers/apartment.rb

require 'apartment/elevators/subdomain'

Apartment.configure do |config|
  config.tenant_names = [ 
    'customer1',
    'customer2'
  ]
end
# config/database.yml

development:
  adapter: mysql2
  database: customer1
  encoding: utf8
  host: db
  username: root
  password: 
  pool: 5
  variables:
    sql_mode: NO_ENGINE_SUBSTITUTION

Expected behavior

Make a single tenant change per request.

Actual behavior

I came to realize that there has been a very large consumption of the query database "SELECT database()" in a multi-tenant environment, where an HTTP request is made at a generic endpoint and this request triggers this large number of selects.

MySQL Logs:

mysqld, Version: 5.7.30 (MySQL Community Server (GPL)). started with:
Tcp port: 3306  Unix socket: /var/run/mysqld/mysqld.sock
Time                 Id Command    Argument

2023-07-03T21:34:28.671560Z    25 Query SELECT database()
2023-07-03T21:34:28.671825Z    25 Query SELECT database()
2023-07-03T21:34:28.672025Z    25 Query use `customer1`
2023-07-03T21:34:28.672476Z    25 Query SELECT database()
2023-07-03T21:34:28.672656Z    25 Query SELECT database()
2023-07-03T21:34:28.672834Z    25 Query use `customer1`
2023-07-03T21:34:28.710948Z    25 Query SHOW FULL FIELDS FROM `maint_orders`
2023-07-03T21:34:28.720923Z    25 Query SELECT column_name
2023-07-03T21:34:28.746213Z    25 Query SHOW FULL FIELDS FROM `users`
2023-07-03T21:34:28.749689Z    25 Query SELECT column_name
FROM information_schema.key_column_usage
WHERE constraint_name = 'PRIMARY'
  AND table_schema = database()
  AND table_name = 'users'
ORDER BY ordinal_position
2023-07-03T21:34:28.750536Z    25 Query SELECT  `users`.* FROM `users` WHERE `users`.`deleted_at` IS NULL AND `users`.`id` = 4 LIMIT 1
2023-07-03T21:34:28.780610Z    25 Query SHOW FULL FIELDS FROM `companies`
2023-07-03T21:34:28.782284Z    25 Query SELECT column_name
FROM information_schema.key_column_usage
WHERE constraint_name = 'PRIMARY'
  AND table_schema = database()
  AND table_name = 'companies'
ORDER BY ordinal_position
2023-07-03T21:34:28.782935Z    25 Query SELECT  `companies`.* FROM `companies` WHERE `companies`.`deleted_at` IS NULL AND `companies`.`id` = 2 LIMIT 1
2023-07-03T21:34:28.801702Z    25 Query SELECT column_name
FROM information_schema.key_column_usage
WHERE constraint_name = 'PRIMARY'
  AND table_schema = database()
  AND table_name = 'authentication_methods'
ORDER BY ordinal_position
2023-07-03T21:34:28.802280Z    25 Query SHOW FULL FIELDS FROM `authentication_methods`
2023-07-03T21:34:28.848771Z    25 Query SHOW FULL FIELDS FROM `user_groups`
2023-07-03T21:34:28.850215Z    25 Query SELECT column_name
FROM information_schema.key_column_usage
WHERE constraint_name = 'PRIMARY'
  AND table_schema = database()
  AND table_name = 'user_groups'
ORDER BY ordinal_position
2023-07-03T21:34:28.850966Z    25 Query SELECT  `user_groups`.* FROM `user_groups` WHERE `user_groups`.`deleted_at` IS NULL AND `user_groups`.`id` = 2 LIMIT 1
2023-07-03T21:34:28.864742Z    25 Query SHOW FULL FIELDS FROM `actions`
2023-07-03T21:34:28.867247Z    25 Query SHOW FULL FIELDS FROM `actions_user_groups`
2023-07-03T21:34:28.868195Z    25 Query SELECT column_name
FROM information_schema.key_column_usage
WHERE constraint_name = 'PRIMARY'
  AND table_schema = database()
  AND table_name = 'actions'
ORDER BY ordinal_position
2023-07-03T21:34:28.870079Z    25 Query SELECT `actions`.`action`, `actions`.`subject_class` FROM `actions` INNER JOIN `actions_user_groups` ON `actions`.`id` = `actions_user_groups`.`action_id` WHERE `actions`.`deleted_at` IS NULL AND `actions_user_groups`.`user_group_id` = 2
2023-07-03T21:34:29.764176Z    25 Query SHOW FULL FIELDS FROM `maint_order_statuses`
2023-07-03T21:34:29.816786Z    25 Query SELECT table_name FROM information_schema.tables WHERE table_schema = database() AND table_name = 'maint_subjects'
2023-07-03T21:34:29.929402Z    25 Query SELECT database()
2023-07-03T21:34:29.929731Z    25 Query SELECT database()
2023-07-03T21:34:29.930064Z    25 Query SELECT database()
2023-07-03T21:34:29.930273Z    25 Query SELECT database()
2023-07-03T21:34:29.931197Z    25 Query SELECT database()
2023-07-03T21:34:29.931387Z    25 Query SELECT database()
2023-07-03T21:34:29.931590Z    25 Query SELECT database()
2023-07-03T21:34:29.931766Z    25 Query SELECT database()
2023-07-03T21:34:29.932100Z    25 Query SELECT database()
2023-07-03T21:34:29.932276Z    25 Query SELECT database()
2023-07-03T21:34:30.784164Z    26 Query use `customer2`
2023-07-03T21:34:33.687876Z    26 Query SELECT table_name FROM information_schema.tables WHERE table_schema = database()
2023-07-03T21:34:34.029500Z    25 Query SELECT table_name FROM information_schema.tables WHERE table_schema = database() AND table_name = 'spotlights'
2023-07-03T21:34:34.362792Z    26 Query SELECT table_name FROM information_schema.tables WHERE table_schema = database() AND table_name = 'schema_migrations' AND table_type = 'BASE TABLE'
2023-07-03T21:34:34.365328Z    26 Query SHOW FULL FIELDS FROM `schema_migrations`
2023-07-03T21:34:34.366613Z    26 Query SELECT `schema_migrations`.`version` FROM `schema_migrations` ORDER BY `schema_migrations`.`version` ASC
2023-07-03T21:34:34.414296Z    26 Query SELECT database()
2023-07-03T21:34:34.414642Z    26 Query SELECT database()
2023-07-03T21:34:34.414972Z    26 Query use `customer1`
2023-07-03T21:34:34.415671Z    26 Query SELECT database()
2023-07-03T21:34:34.415924Z    26 Query SELECT database()
2023-07-03T21:34:34.416378Z    26 Query use `customer1`
2023-07-03T21:34:35.429102Z    25 Query SHOW FULL FIELDS FROM `areas`
2023-07-03T21:34:35.432520Z    25 Query SHOW FULL FIELDS FROM `areas_users`
2023-07-03T21:34:35.433391Z    25 Query SELECT column_name
FROM information_schema.key_column_usage
WHERE constraint_name = 'PRIMARY'
  AND table_schema = database()
  AND table_name = 'areas'
ORDER BY ordinal_position
2023-07-03T21:34:35.434126Z    25 Query SELECT `areas`.* FROM `areas` INNER JOIN `areas_users` ON `areas`.`id` = `areas_users`.`area_id` WHERE `areas`.`deleted_at` IS NULL AND `areas_users`.`user_id` = 4 ORDER BY `areas`.`code` ASC
2023-07-03T21:34:35.434789Z    25 Query SELECT database()
2023-07-03T21:34:35.434956Z    25 Query SELECT database()
2023-07-03T21:34:35.436045Z    25 Query SELECT database()
2023-07-03T21:34:35.436479Z    25 Query SELECT database()
2023-07-03T21:34:35.436906Z    25 Query SELECT database()
2023-07-03T21:34:35.437260Z    25 Query SELECT database()
2023-07-03T21:34:35.437833Z    25 Query SELECT database()
2023-07-03T21:34:35.438182Z    25 Query SELECT database()
2023-07-03T21:34:35.438592Z    25 Query SELECT database()
2023-07-03T21:34:35.438945Z    25 Query SELECT database()
2023-07-03T21:34:35.439513Z    25 Query SELECT database()
2023-07-03T21:34:35.439865Z    25 Query SELECT database()
2023-07-03T21:34:35.440400Z    25 Query SELECT database()
2023-07-03T21:34:35.440751Z    25 Query SELECT database()
2023-07-03T21:34:35.441162Z    25 Query SELECT database()
2023-07-03T21:34:35.441515Z    25 Query SELECT database()
2023-07-03T21:34:35.441937Z    25 Query SELECT database()
2023-07-03T21:34:35.442294Z    25 Query SELECT database()
2023-07-03T21:34:35.442708Z    25 Query SELECT database()
2023-07-03T21:34:35.443090Z    25 Query SELECT database()
2023-07-03T21:34:35.443470Z    25 Query SELECT database()
2023-07-03T21:34:35.443884Z    25 Query SELECT database()
2023-07-03T21:34:35.445201Z    25 Query SHOW FULL FIELDS FROM `maint_orders`
2023-07-03T21:34:35.450541Z    25 Query SELECT database()
2023-07-03T21:34:35.450822Z    25 Query SELECT database()
2023-07-03T21:34:35.451164Z    25 Query SELECT database()
2023-07-03T21:34:35.451369Z    25 Query SELECT database()
2023-07-03T21:34:35.451598Z    25 Query SELECT database()
2023-07-03T21:34:35.451803Z    25 Query SELECT database()
2023-07-03T21:34:35.452003Z    25 Query SELECT database()
2023-07-03T21:34:35.452213Z    25 Query SELECT database()
2023-07-03T21:34:35.452470Z    25 Query SELECT database()
2023-07-03T21:34:35.452672Z    25 Query SELECT database()
2023-07-03T21:34:35.452881Z    25 Query SELECT database()
2023-07-03T21:34:35.453086Z    25 Query SELECT database()
2023-07-03T21:34:35.453384Z    25 Query SELECT database()
2023-07-03T21:34:35.453581Z    25 Query SELECT database()
2023-07-03T21:34:35.453852Z    25 Query SELECT database()
2023-07-03T21:34:35.454071Z    25 Query SELECT database()
2023-07-03T21:34:35.454307Z    25 Query SELECT database()
2023-07-03T21:34:35.454528Z    25 Query SELECT database()
2023-07-03T21:34:35.454757Z    25 Query SELECT database()
2023-07-03T21:34:35.454995Z    25 Query SELECT database()
2023-07-03T21:34:35.455216Z    25 Query SELECT database()
2023-07-03T21:34:35.455428Z    25 Query SELECT database()
2023-07-03T21:34:35.457090Z    25 Query SHOW FULL FIELDS FROM `maint_order_statuses`
2023-07-03T21:34:35.459275Z    25 Query SELECT `maint_order_statuses`.`id`, `maint_order_statuses`.`created_at`, `maint_order_statuses`.`updated_at`, `maint_order_statuses`.`deleted_at`, `maint_order_statuses`.`company_id`, `maint_order_statuses`.`maint_order_status_id`, `maint_order_statuses`.`sequence` FROM `maint_order_statuses` WHERE `maint_order_statuses`.`deleted_at` IS NULL AND `maint_order_statuses`.`company_id` IN (2, 1) ORDER BY maint_order_statuses.sequence IS NULL, maint_order_statuses.sequence ASC
2023-07-03T21:34:35.460178Z    25 Query SELECT table_name FROM information_schema.tables WHERE table_schema = database() AND table_name = 'maint_order_statuses'
2023-07-03T21:34:35.460524Z    25 Query SELECT column_name
FROM information_schema.key_column_usage
WHERE constraint_name = 'PRIMARY'
  AND table_schema = database()
  AND table_name = 'maint_order_statuses'
ORDER BY ordinal_position
2023-07-03T21:34:35.470792Z    25 Query SELECT `maint_order_statuses`.`id`, `maint_order_statuses`.`created_at`, `maint_order_statuses`.`updated_at`, `maint_order_statuses`.`deleted_at`, `maint_order_statuses`.`company_id`, `maint_order_statuses`.`maint_order_status_id`, `maint_order_statuses`.`sequence` FROM `maint_order_statuses` WHERE `maint_order_statuses`.`deleted_at` IS NULL AND `maint_order_statuses`.`id` IN (2, 1, 4) ORDER BY maint_order_statuses.sequence IS NULL, maint_order_statuses.sequence ASC

Possible fix

Analyzing and comparing the PostgreSQL adapter with mysql2, it is seen that the PostgreSQL adapter contains a way to set the current tenant, in which it is not necessary multiple selects database()

def current
  @current || default_tenant
end

When adapting the same function in the mysql2 adapter, is noticed a reduction in SELECT database() calls:

2023-07-03T21:31:01.074186Z    20 Query use `customer1`
2023-07-03T21:31:01.074861Z    20 Query use `customer1`
2023-07-03T21:31:01.127371Z    20 Query SHOW FULL FIELDS FROM `maint_orders`
2023-07-03T21:31:01.138069Z    20 Query SELECT column_name
2023-07-03T21:31:01.162087Z    20 Query SELECT column_name
FROM information_schema.key_column_usage
WHERE constraint_name = 'PRIMARY'
  AND table_schema = database()
  AND table_name = 'users'
ORDER BY ordinal_position
2023-07-03T21:31:01.195179Z    20 Query SHOW FULL FIELDS FROM `companies`
2023-07-03T21:31:01.197296Z    20 Query SELECT column_name
FROM information_schema.key_column_usage
WHERE constraint_name = 'PRIMARY'
  AND table_schema = database()
  AND table_name = 'companies'
ORDER BY ordinal_position
2023-07-03T21:31:01.198690Z    20 Query SELECT  `companies`.* FROM `companies` WHERE `companies`.`deleted_at` IS NULL AND `companies`.`id` = 2 LIMIT 1
2023-07-03T21:31:01.229738Z    20 Query SELECT column_name
FROM information_schema.key_column_usage
WHERE constraint_name = 'PRIMARY'
  AND table_schema = database()
  AND table_name = 'authentication_methods'
ORDER BY ordinal_position
2023-07-03T21:31:01.230344Z    20 Query SHOW FULL FIELDS FROM `authentication_methods`
2023-07-03T21:31:01.283018Z    20 Query SHOW FULL FIELDS FROM `user_groups`
2023-07-03T21:31:01.284569Z    20 Query SELECT column_name
FROM information_schema.key_column_usage
WHERE constraint_name = 'PRIMARY'
  AND table_schema = database()
  AND table_name = 'user_groups'
ORDER BY ordinal_position
2023-07-03T21:31:01.292061Z    20 Query SELECT  `user_groups`.* FROM `user_groups` WHERE `user_groups`.`deleted_at` IS NULL AND `user_groups`.`id` = 2 LIMIT 1
2023-07-03T21:31:01.314407Z    20 Query SHOW FULL FIELDS FROM `actions`
2023-07-03T21:31:01.316945Z    20 Query SHOW FULL FIELDS FROM `actions_user_groups`
2023-07-03T21:31:01.317991Z    20 Query SELECT column_name
FROM information_schema.key_column_usage
WHERE constraint_name = 'PRIMARY'
  AND table_schema = database()
  AND table_name = 'actions'
ORDER BY ordinal_position
2023-07-03T21:31:01.320831Z    20 Query SELECT `actions`.`action`, `actions`.`subject_class` FROM `actions` INNER JOIN `actions_user_groups` ON `actions`.`id` = `actions_user_groups`.`action_id` WHERE `actions`.`deleted_at` IS NULL AND `actions_user_groups`.`user_group_id` = 2
2023-07-03T21:31:02.234669Z    20 Query SHOW FULL FIELDS FROM `maint_order_statuses`
2023-07-03T21:31:02.293272Z    20 Query SELECT table_name FROM information_schema.tables WHERE table_schema = database() AND table_name = 'maint_subjects'
2023-07-03T21:31:03.264488Z    21 Query use `customer2`
2023-07-03T21:31:06.504242Z    21 Query SELECT table_name FROM information_schema.tables WHERE table_schema = database()
2023-07-03T21:31:06.773033Z    20 Query SELECT table_name FROM information_schema.tables WHERE table_schema = database() AND table_name = 'spotlights'
2023-07-03T21:31:07.266063Z    21 Query SELECT table_name FROM information_schema.tables WHERE table_schema = database() AND table_name = 'schema_migrations' AND table_type = 'BASE TABLE'
2023-07-03T21:31:07.270677Z    21 Query SHOW FULL FIELDS FROM `schema_migrations`
2023-07-03T21:31:07.272494Z    21 Query SELECT `schema_migrations`.`version` FROM `schema_migrations` ORDER BY `schema_migrations`.`version` ASC
2023-07-03T21:31:07.320419Z    21 Query use `customer1`
2023-07-03T21:31:07.321533Z    21 Query use `customer1`
2023-07-03T21:31:08.177241Z    20 Query SHOW FULL FIELDS FROM `areas`
2023-07-03T21:31:08.180690Z    20 Query SHOW FULL FIELDS FROM `areas_users`
2023-07-03T21:31:08.181616Z    20 Query SELECT column_name
FROM information_schema.key_column_usage
WHERE constraint_name = 'PRIMARY'
  AND table_schema = database()
  AND table_name = 'areas'
ORDER BY ordinal_position
2023-07-03T21:31:08.182512Z    20 Query SELECT `areas`.* FROM `areas` INNER JOIN `areas_users` ON `areas`.`id` = `areas_users`.`area_id` WHERE `areas`.`deleted_at` IS NULL AND `areas_users`.`user_id` = 4 ORDER BY `areas`.`code` ASC
2023-07-03T21:31:08.186894Z    20 Query SHOW FULL FIELDS FROM `maint_orders`
2023-07-03T21:31:08.194373Z    20 Query SHOW FULL FIELDS FROM `maint_order_statuses`
2023-07-03T21:31:08.196363Z    20 Query SELECT `maint_order_statuses`.`id`, `maint_order_statuses`.`created_at`, `maint_order_statuses`.`updated_at`, `maint_order_statuses`.`deleted_at`, `maint_order_statuses`.`company_id`, `maint_order_statuses`.`maint_order_status_id`, `maint_order_statuses`.`sequence` FROM `maint_order_statuses` WHERE `maint_order_statuses`.`deleted_at` IS NULL AND `maint_order_statuses`.`company_id` IN (2, 1) ORDER BY maint_order_statuses.sequence IS NULL, maint_order_statuses.sequence ASC
2023-07-03T21:31:08.197146Z    20 Query SELECT table_name FROM information_schema.tables WHERE table_schema = database() AND table_name = 'maint_order_statuses'
2023-07-03T21:31:08.197448Z    20 Query SELECT column_name
FROM information_schema.key_column_usage
WHERE constraint_name = 'PRIMARY'
  AND table_schema = database()
  AND table_name = 'maint_order_statuses'
ORDER BY ordinal_position
2023-07-03T21:31:08.206447Z    20 Query SELECT `maint_order_statuses`.`id`, `maint_order_statuses`.`created_at`, `maint_order_statuses`.`updated_at`, `maint_order_statuses`.`deleted_at`, `maint_order_statuses`.`company_id`, `maint_order_statuses`.`maint_order_status_id`, `maint_order_statuses`.`sequence` FROM `maint_order_statuses` WHERE `maint_order_statuses`.`deleted_at` IS NULL AND `maint_order_statuses`.`id` IN (2, 1, 4) ORDER BY maint_order_statuses.sequence IS NULL, maint_order_statuses.sequence ASC

System configuration

Ruby version

adam-hampton-sp commented 1 year ago

I am seeing very similar behavior in our APM traces. One PUMA request thread, connect via ROS Apartment, query for models and resulting in dozens to hundreds of SELECT DATABASE() calls. Our configuration:

It amounts to adding dozens to hundreds of quick DB round trips to our app. Each statement on its own might only take milliseconds, but lots of them add up.

This heavily redacted APM trace shows it happening. All those purple slices at the bottom of the flame graph are the DB dips. This is one PUMA transaction through some Rails controllers. The whole operation took ~63 seconds (hence why we were looking at it). All those tiny DB dips are diminishing performance.

apm-redacted
adam-hampton-sp commented 1 year ago

Update here for other parties reading along. We have found root cause inside our application and are taking steps to fix it. The root cause is:

The Apartment gem is not doing any kind of thread-local, puma-worker-local, or sidekiq-worker local caching or memo-ization of this value. Every time .current is invoked it's a network traversal and round trip to the DB engine.

Our application code assumed that was a cheap, local in-RAM call. Ooops on our part, and a bad assumption.

The solution we are deploying is application specific but amounts to replacing every call to Apartment::Tenant.current() with a call to our own code that memoizes the current tenant in a thread local variable. We've replaced literally dozens of .current calls with calls to our new utility class.

If you're having this issue then look at your app. How often are you invoking .current? Do you need to do it that often? Can you memo-ize the returned value? Our solution is proprietary and intricate, but boils down to something like this:

module CustomApartmentUtils
  class CurrentTenant
     def current
       Thread.current[:apartment_current_tenant] ||= Apartment::Tenant.current
     end
  end
end

Every time you switch! your tenant you need to populate the thread local variable Thread.current[:apartment_current_tenant] with your tenant details. And then replace all calls in your app with CustomApartmentUtils::CurrentTenant::current.

Good luck!