brendon / acts_as_list

An ActiveRecord plugin for managing lists.
http://brendon.github.io/acts_as_list/
MIT License
2.05k stars 356 forks source link

Enhancement: Execute show table queries through schema cache class #396

Closed gurukannappa closed 5 months ago

gurukannappa commented 3 years ago

Background:

Executing connection.data_source_exists? or connection.table_exists? will fire SHOW TABLES LIKE 'table_name' query. When this is invoked n times, then n number of show table queries will be fired.

Enhancement:

Starting ActiveRecord version >= 3.2 schema cache class is used as a gateway for executing all show table queries. Example: connection.schema_cache.table_exists? will fire show query table once and will cache the results in its objects. Further invocations of show table queries are referred from the objects rather than firing every time to the database.

Benefits

For a larger production system with several databases, this would reduce several show table queries getting fired to databases and reduces the load on the system.

PS: My First OpenSource pull request please let me know for any feedbacks. :)

brendon commented 3 years ago

Hi @gurukannappa, thanks for this :)

I notice that this call is cached to @sequential_updates and should only run once on first boot. Are you seeing it run many times? I suppose it'll run once per instance of your app, but that will happen regardless of a schema case.

I can't see anything wrong with your approach (it seems to test out well) but I'm interested to see if you can find out why you're getting it called so many times. Oh, come to think of it, I suspect it's called once per acts_as_list call. But that'll happen anyway unless you have many seperate lists on the one table?

Hopefully we have tests around the sequential_updates functionality already to test that your change still bears the same results. Are you able to check this out for me? :)

gurukannappa commented 3 years ago

Hi @brendon

Thank you very much for your time in looking into this :)

As a reference for discussion i will use a sample table with which i have experimented.

class ChennaiCity < ActiveRecord::Base
  self.table_name = "chennai_cities"
  acts_as_list
end

Table Schema

  show create table chennai_cities \G;
  *************************** 1. row ***************************
         Table: chennai_cities
  Create Table: CREATE TABLE `chennai_cities` (
    `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
    `name` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
    `created_at` datetime NOT NULL,
    `updated_at` datetime NOT NULL,
    `position` int(11) DEFAULT NULL,
    PRIMARY KEY (`id`)
  ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
  1 row in set (0.00 sec)

Discussions

@sequential_updates -> Regardless of schema cache, sequential_updates? function is getting called on every object of ChennaiCity model when remove_from_list is invoked.

Reason:

module ActiveRecord::Acts::List::SequentialUpdatesMethodDefiner #:nodoc:
  def self.call(caller_class, column, sequential_updates_option)
    caller_class.class_eval do
      define_method :sequential_updates? do
        if !defined?(@sequential_updates)
          if sequential_updates_option.nil?

In the above snippet we are defining the method sequential_updates? into the caller_class which is a child class(ChennaiCity) of ActiveRecord::Base so every object of ChennaiCity will have a separate copy of @sequential_updates variable leading to n number of calls depending on the number of objects created.

Let's take a sample dataset.

id name created_at updated_at position
71 Tambaram 2021-10-17 01:24:42 2021-10-17 01:24:42 1
72 Central 2021-10-17 01:24:42 2021-10-17 01:24:42 2
73 Guindy 2021-10-17 01:24:42 2021-10-17 01:24:42 3
74 Porur 2021-10-17 01:24:42 2021-10-17 01:24:42 4
75 Perungudi 2021-10-17 01:24:42 2021-10-17 01:24:42 5
76 Velachery 2021-10-17 01:24:42 2021-10-17 01:24:42 6
77 Medavakkam 2021-10-17 01:24:42 2021-10-17 01:24:42 7
78 Ramapuram 2021-10-17 01:24:42 2021-10-17 01:24:42 8
79 Sholinganallur 2021-10-17 01:24:42 2021-10-17 01:24:42 9
80 Vandalur 2021-10-17 01:24:42 2021-10-17 01:24:42 10
ChennaiCity.all.each do |c|
  c.remove_from_list
end
****************
Remove from List called
Define Method Invoked for -> sequential_updates?
Invoked on Object: #<ChennaiCity id: 71, name: "Tambaram", created_at: "2021-10-17 01:40:28", updated_at: "2021-10-17 01:40:28", position: 1>
!defined?(@sequential_updates) -> nil
Calling caller_class.connection.table_exists?
Initialising @sequential_updates: instance-variable
****************
Remove from List called
Define Method Invoked for -> sequential_updates?
Invoked on Object: #<ChennaiCity id: 72, name: "Central", created_at: "2021-10-17 01:40:28", updated_at: "2021-10-17 01:40:28", position: 2>
!defined?(@sequential_updates) -> nil
Calling caller_class.connection.table_exists?
Initialising @sequential_updates: instance-variable
****************
Remove from List called
Define Method Invoked for -> sequential_updates?
Invoked on Object: #<ChennaiCity id: 73, name: "Guindy", created_at: "2021-10-17 01:40:28", updated_at: "2021-10-17 01:40:28", position: 3>
!defined?(@sequential_updates) -> nil
Calling caller_class.connection.table_exists?
Initialising @sequential_updates: instance-variable
****************
Remove from List called
Define Method Invoked for -> sequential_updates?
Invoked on Object: #<ChennaiCity id: 74, name: "Porur", created_at: "2021-10-17 01:40:28", updated_at: "2021-10-17 01:40:28", position: 4>
!defined?(@sequential_updates) -> nil
Calling caller_class.connection.table_exists?
Initialising @sequential_updates: instance-variable
****************
.
.

Attaching the mysql query logs while removing items from the list which has show table & show keys queries

SELECT     argument  FROM     mysql.general_log order by event_time desc limit 100                                                                                    |
| COMMIT                                                                                                                                                                |
| SELECT COUNT(*) FROM `chennai_cities`  WHERE (1 = 1) AND (`chennai_cities`.`position` = 0)                                                                            |
| UPDATE `chennai_cities` SET `position` = NULL, `updated_at` = '2021-10-17 02:03:51' WHERE `chennai_cities`.`id` = 100                                                 |
| BEGIN                                                                                                                                                                 |
| UPDATE `chennai_cities` SET `position` = (`chennai_cities`.`position` - 1), `updated_at` = '2021-10-17 02:03:51' WHERE (1 = 1) AND (`chennai_cities`.`position` > 10) |
| SHOW KEYS FROM `chennai_cities`                                                                                                                                       |
| SHOW TABLES LIKE 'chennai_cities'                                                                                                                                     |
| COMMIT                                                                                                                                                                |
| SELECT COUNT(*) FROM `chennai_cities`  WHERE (1 = 1) AND (`chennai_cities`.`position` = 0)                                                                            |
| UPDATE `chennai_cities` SET `position` = NULL, `updated_at` = '2021-10-17 02:03:51' WHERE `chennai_cities`.`id` = 99                                                  |
| BEGIN                                                                                                                                                                 |
| UPDATE `chennai_cities` SET `position` = (`chennai_cities`.`position` - 1), `updated_at` = '2021-10-17 02:03:51' WHERE (1 = 1) AND (`chennai_cities`.`position` > 9)  |
| SHOW KEYS FROM `chennai_cities`                                                                                                                                       |
| SHOW TABLES LIKE 'chennai_cities'                                                                                                                                     |
| COMMIT                                                                                                                                                                |
| SELECT COUNT(*) FROM `chennai_cities`  WHERE (1 = 1) AND (`chennai_cities`.`position` = 0)                                                                            |
| UPDATE `chennai_cities` SET `position` = NULL, `updated_at` = '2021-10-17 02:03:51' WHERE `chennai_cities`.`id` = 98                                                  |
| BEGIN                                                                                                                                                                 |
| UPDATE `chennai_cities` SET `position` = (`chennai_cities`.`position` - 1), `updated_at` = '2021-10-17 02:03:51' WHERE (1 = 1) AND (`chennai_cities`.`position` > 8)  |
| SHOW KEYS FROM `chennai_cities`                                                                                                                                       |
| SHOW TABLES LIKE 'chennai_cities'                                                                                                                                     |
| COMMIT                                                                                                                                                                |
| SELECT COUNT(*) FROM `chennai_cities`  WHERE (1 = 1) AND (`chennai_cities`.`position` = 0)                                                                            |
| UPDATE `chennai_cities` SET `position` = NULL, `updated_at` = '2021-10-17 02:03:51' WHERE `chennai_cities`.`id` = 97                                                  |
| BEGIN                                                                                                                                                                 |
| UPDATE `chennai_cities` SET `position` = (`chennai_cities`.`position` - 1), `updated_at` = '2021-10-17 02:03:51' WHERE (1 = 1) AND (`chennai_cities`.`position` > 7)  |
| SHOW KEYS FROM `chennai_cities`                                                                                                                                       |
| SHOW TABLES LIKE 'chennai_cities'                                                                                                                                     |
| COMMIT                                                                                                                                                                |
| SELECT COUNT(*) FROM `chennai_cities`  WHERE (1 = 1) AND (`chennai_cities`.`position` = 0)                                                                            |
| UPDATE `chennai_cities` SET `position` = NULL, `updated_at` = '2021-10-17 02:03:51' WHERE `chennai_cities`.`id` = 96                                                  |
| BEGIN                                                                                                                                                                 |
| UPDATE `chennai_cities` SET `position` = (`chennai_cities`.`position` - 1), `updated_at` = '2021-10-17 02:03:51' WHERE (1 = 1) AND (`chennai_cities`.`position` > 6)  |
| SHOW KEYS FROM `chennai_cities`                                                                                                                                       |
| SHOW TABLES LIKE 'chennai_cities'                                                                                                                                     |
| COMMIT                                                                                                                                                                |
| SELECT COUNT(*) FROM `chennai_cities`  WHERE (1 = 1) AND (`chennai_cities`.`position` = 0)                                                                            |
| UPDATE `chennai_cities` SET `position` = NULL, `updated_at` = '2021-10-17 02:03:51' WHERE `chennai_cities`.`id` = 95                                                  |
| BEGIN                                                                                                                                                                 |
| UPDATE `chennai_cities` SET `position` = (`chennai_cities`.`position` - 1), `updated_at` = '2021-10-17 02:03:51' WHERE (1 = 1) AND (`chennai_cities`.`position` > 5)  |
| SHOW KEYS FROM `chennai_cities`                                                                                                                                       |
| SHOW TABLES LIKE 'chennai_cities'                                                                                                                                     |
| COMMIT                                                                                                                                                                |
| SELECT COUNT(*) FROM `chennai_cities`  WHERE (1 = 1) AND (`chennai_cities`.`position` = 0)                                                                            |
| UPDATE `chennai_cities` SET `position` = NULL, `updated_at` = '2021-10-17 02:03:51' WHERE `chennai_cities`.`id` = 94                                                  |
| BEGIN                                                                                                                                                                 |
| UPDATE `chennai_cities` SET `position` = (`chennai_cities`.`position` - 1), `updated_at` = '2021-10-17 02:03:51' WHERE (1 = 1) AND (`chennai_cities`.`position` > 4)  |
| SHOW KEYS FROM `chennai_cities`                                                                                                                                       |
| SHOW TABLES LIKE 'chennai_cities'                                                                                                                                     |
| COMMIT                                                                                                                                                                |
| SELECT COUNT(*) FROM `chennai_cities`  WHERE (1 = 1) AND (`chennai_cities`.`position` = 0)                                                                            |
| UPDATE `chennai_cities` SET `position` = NULL, `updated_at` = '2021-10-17 02:03:51' WHERE `chennai_cities`.`id` = 93                                                  |
| BEGIN                                                                                                                                                                 |
| UPDATE `chennai_cities` SET `position` = (`chennai_cities`.`position` - 1), `updated_at` = '2021-10-17 02:03:51' WHERE (1 = 1) AND (`chennai_cities`.`position` > 3)  |
| SHOW KEYS FROM `chennai_cities`                                                                                                                                       |
| SHOW TABLES LIKE 'chennai_cities'                                                                                                                                     |
| COMMIT                                                                                                                                                                |
| SELECT COUNT(*) FROM `chennai_cities`  WHERE (1 = 1) AND (`chennai_cities`.`position` = 0)                                                                            |
| UPDATE `chennai_cities` SET `position` = NULL, `updated_at` = '2021-10-17 02:03:51' WHERE `chennai_cities`.`id` = 92                                                  |
| BEGIN                                                                                                                                                                 |
| UPDATE `chennai_cities` SET `position` = (`chennai_cities`.`position` - 1), `updated_at` = '2021-10-17 02:03:51' WHERE (1 = 1) AND (`chennai_cities`.`position` > 2)  |
| SHOW KEYS FROM `chennai_cities`                                                                                                                                       |
| SHOW TABLES LIKE 'chennai_cities'                                                                                                                                     |
| COMMIT                                                                                                                                                                |
| SELECT COUNT(*) FROM `chennai_cities`  WHERE (1 = 1) AND (`chennai_cities`.`position` = 0)                                                                            |
| UPDATE `chennai_cities` SET `position` = NULL, `updated_at` = '2021-10-17 02:03:51' WHERE `chennai_cities`.`id` = 91                                                  |
| BEGIN                                                                                                                                                                 |
| UPDATE `chennai_cities` SET `position` = (`chennai_cities`.`position` - 1), `updated_at` = '2021-10-17 02:03:51' WHERE (1 = 1) AND (`chennai_cities`.`position` > 1)  |
| SHOW KEYS FROM `chennai_cities`                                                                                                                                       |
| SHOW TABLES LIKE 'chennai_cities'                                                                                                                                     |
| SELECT `chennai_cities`.* FROM `chennai_cities`                                                                                                                       |
| SELECT     argument  FROM     mysql.general_log order by event_time desc limit 100                                                                                    |
| SELECT     argument  FROM     mysql.general_log order by event_time desc limit 100  

So there are two things that can be seen from the above experiment.

  1. Making sequential_updates? as a class level object to prevent multiple invocations.
  2. using schema cache to reduce multiple show table query executions.

Background on how much impact for schema cache query:

During deployments several process restarts at the same time resulting in initialisation of rails application which fires up several show table queries. Such parallel show table queries spike up DBs CPU and results in a severe performance degradation of all the queries. lets say 20 instances has 10 passenger process that has around 10 shards (multiple databases) so that comes roughly around 2000 show table queries.

So, even to prevent any show table queries at all in general from rails perspective, i have added a some improvements on top of existing schema cache class to load schema cache data from offline dump which would prevent any show table queries at all during process startup. So calling connection.schema_cache.table_exists? would refer to schema from the offline dump instead of firing even a single time.

i have taken some call graphs of this experiments but not able to attach in here.

Let me know your thoughts @brendon would be happy to collaborate. :)

gurukannappa commented 3 years ago

Attached Call graph for remove_from_list operation https://github.com/gurukannappa/readme/blob/master/output.svg

brendon commented 3 years ago

Thanks @gurukannappa, I had no idea this was happening! It's pretty bad given it affects people who don't even want to use sequential updates.

I think we should fix the issue properly by generating checking for the index only once per app boot, then caching it with a class variable @@ rather than an instance variable. So there is a couple of tasks:

I'm comfortable with this occurring once per instance of the app still vs caching it offline. It's less messy that way :)

Are you happy to look into this? My initial trial attempt is here: https://github.com/brendon/acts_as_list/tree/fix-sequential-updates

brendon commented 5 months ago

Closing this for now as I don't have the time to work on it myself. You might like to try https://github.com/brendon/positioning instead. Conversion from acts_as_list should be fairly straightforward.