Closed gurukannappa closed 5 months ago
Hi @gurukannappa, thanks for this :)
I notice that this call is cached to @sequential_updates
and should only run once on first boot. Are you seeing it run many times? I suppose it'll run once per instance of your app, but that will happen regardless of a schema case.
I can't see anything wrong with your approach (it seems to test out well) but I'm interested to see if you can find out why you're getting it called so many times. Oh, come to think of it, I suspect it's called once per acts_as_list
call. But that'll happen anyway unless you have many seperate lists on the one table?
Hopefully we have tests around the sequential_updates
functionality already to test that your change still bears the same results. Are you able to check this out for me? :)
Hi @brendon
Thank you very much for your time in looking into this :)
As a reference for discussion i will use a sample table with which i have experimented.
class ChennaiCity < ActiveRecord::Base
self.table_name = "chennai_cities"
acts_as_list
end
Table Schema
show create table chennai_cities \G;
*************************** 1. row ***************************
Table: chennai_cities
Create Table: CREATE TABLE `chennai_cities` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL,
`position` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
1 row in set (0.00 sec)
Discussions
@sequential_updates
-> Regardless of schema cache, sequential_updates?
function is getting called on every object of ChennaiCity
model when remove_from_list
is invoked.
Reason:
module ActiveRecord::Acts::List::SequentialUpdatesMethodDefiner #:nodoc:
def self.call(caller_class, column, sequential_updates_option)
caller_class.class_eval do
define_method :sequential_updates? do
if !defined?(@sequential_updates)
if sequential_updates_option.nil?
In the above snippet we are defining the method sequential_updates?
into the caller_class
which is a child class(ChennaiCity) of ActiveRecord::Base so every object of ChennaiCity will have a separate copy of @sequential_updates
variable leading to n number of calls depending on the number of objects created.
Let's take a sample dataset.
id | name | created_at | updated_at | position |
---|---|---|---|---|
71 | Tambaram | 2021-10-17 01:24:42 | 2021-10-17 01:24:42 | 1 |
72 | Central | 2021-10-17 01:24:42 | 2021-10-17 01:24:42 | 2 |
73 | Guindy | 2021-10-17 01:24:42 | 2021-10-17 01:24:42 | 3 |
74 | Porur | 2021-10-17 01:24:42 | 2021-10-17 01:24:42 | 4 |
75 | Perungudi | 2021-10-17 01:24:42 | 2021-10-17 01:24:42 | 5 |
76 | Velachery | 2021-10-17 01:24:42 | 2021-10-17 01:24:42 | 6 |
77 | Medavakkam | 2021-10-17 01:24:42 | 2021-10-17 01:24:42 | 7 |
78 | Ramapuram | 2021-10-17 01:24:42 | 2021-10-17 01:24:42 | 8 |
79 | Sholinganallur | 2021-10-17 01:24:42 | 2021-10-17 01:24:42 | 9 |
80 | Vandalur | 2021-10-17 01:24:42 | 2021-10-17 01:24:42 | 10 |
ChennaiCity.all.each do |c|
c.remove_from_list
end
****************
Remove from List called
Define Method Invoked for -> sequential_updates?
Invoked on Object: #<ChennaiCity id: 71, name: "Tambaram", created_at: "2021-10-17 01:40:28", updated_at: "2021-10-17 01:40:28", position: 1>
!defined?(@sequential_updates) -> nil
Calling caller_class.connection.table_exists?
Initialising @sequential_updates: instance-variable
****************
Remove from List called
Define Method Invoked for -> sequential_updates?
Invoked on Object: #<ChennaiCity id: 72, name: "Central", created_at: "2021-10-17 01:40:28", updated_at: "2021-10-17 01:40:28", position: 2>
!defined?(@sequential_updates) -> nil
Calling caller_class.connection.table_exists?
Initialising @sequential_updates: instance-variable
****************
Remove from List called
Define Method Invoked for -> sequential_updates?
Invoked on Object: #<ChennaiCity id: 73, name: "Guindy", created_at: "2021-10-17 01:40:28", updated_at: "2021-10-17 01:40:28", position: 3>
!defined?(@sequential_updates) -> nil
Calling caller_class.connection.table_exists?
Initialising @sequential_updates: instance-variable
****************
Remove from List called
Define Method Invoked for -> sequential_updates?
Invoked on Object: #<ChennaiCity id: 74, name: "Porur", created_at: "2021-10-17 01:40:28", updated_at: "2021-10-17 01:40:28", position: 4>
!defined?(@sequential_updates) -> nil
Calling caller_class.connection.table_exists?
Initialising @sequential_updates: instance-variable
****************
.
.
Attaching the mysql query logs while removing items from the list which has show table & show keys queries
SELECT argument FROM mysql.general_log order by event_time desc limit 100 |
| COMMIT |
| SELECT COUNT(*) FROM `chennai_cities` WHERE (1 = 1) AND (`chennai_cities`.`position` = 0) |
| UPDATE `chennai_cities` SET `position` = NULL, `updated_at` = '2021-10-17 02:03:51' WHERE `chennai_cities`.`id` = 100 |
| BEGIN |
| UPDATE `chennai_cities` SET `position` = (`chennai_cities`.`position` - 1), `updated_at` = '2021-10-17 02:03:51' WHERE (1 = 1) AND (`chennai_cities`.`position` > 10) |
| SHOW KEYS FROM `chennai_cities` |
| SHOW TABLES LIKE 'chennai_cities' |
| COMMIT |
| SELECT COUNT(*) FROM `chennai_cities` WHERE (1 = 1) AND (`chennai_cities`.`position` = 0) |
| UPDATE `chennai_cities` SET `position` = NULL, `updated_at` = '2021-10-17 02:03:51' WHERE `chennai_cities`.`id` = 99 |
| BEGIN |
| UPDATE `chennai_cities` SET `position` = (`chennai_cities`.`position` - 1), `updated_at` = '2021-10-17 02:03:51' WHERE (1 = 1) AND (`chennai_cities`.`position` > 9) |
| SHOW KEYS FROM `chennai_cities` |
| SHOW TABLES LIKE 'chennai_cities' |
| COMMIT |
| SELECT COUNT(*) FROM `chennai_cities` WHERE (1 = 1) AND (`chennai_cities`.`position` = 0) |
| UPDATE `chennai_cities` SET `position` = NULL, `updated_at` = '2021-10-17 02:03:51' WHERE `chennai_cities`.`id` = 98 |
| BEGIN |
| UPDATE `chennai_cities` SET `position` = (`chennai_cities`.`position` - 1), `updated_at` = '2021-10-17 02:03:51' WHERE (1 = 1) AND (`chennai_cities`.`position` > 8) |
| SHOW KEYS FROM `chennai_cities` |
| SHOW TABLES LIKE 'chennai_cities' |
| COMMIT |
| SELECT COUNT(*) FROM `chennai_cities` WHERE (1 = 1) AND (`chennai_cities`.`position` = 0) |
| UPDATE `chennai_cities` SET `position` = NULL, `updated_at` = '2021-10-17 02:03:51' WHERE `chennai_cities`.`id` = 97 |
| BEGIN |
| UPDATE `chennai_cities` SET `position` = (`chennai_cities`.`position` - 1), `updated_at` = '2021-10-17 02:03:51' WHERE (1 = 1) AND (`chennai_cities`.`position` > 7) |
| SHOW KEYS FROM `chennai_cities` |
| SHOW TABLES LIKE 'chennai_cities' |
| COMMIT |
| SELECT COUNT(*) FROM `chennai_cities` WHERE (1 = 1) AND (`chennai_cities`.`position` = 0) |
| UPDATE `chennai_cities` SET `position` = NULL, `updated_at` = '2021-10-17 02:03:51' WHERE `chennai_cities`.`id` = 96 |
| BEGIN |
| UPDATE `chennai_cities` SET `position` = (`chennai_cities`.`position` - 1), `updated_at` = '2021-10-17 02:03:51' WHERE (1 = 1) AND (`chennai_cities`.`position` > 6) |
| SHOW KEYS FROM `chennai_cities` |
| SHOW TABLES LIKE 'chennai_cities' |
| COMMIT |
| SELECT COUNT(*) FROM `chennai_cities` WHERE (1 = 1) AND (`chennai_cities`.`position` = 0) |
| UPDATE `chennai_cities` SET `position` = NULL, `updated_at` = '2021-10-17 02:03:51' WHERE `chennai_cities`.`id` = 95 |
| BEGIN |
| UPDATE `chennai_cities` SET `position` = (`chennai_cities`.`position` - 1), `updated_at` = '2021-10-17 02:03:51' WHERE (1 = 1) AND (`chennai_cities`.`position` > 5) |
| SHOW KEYS FROM `chennai_cities` |
| SHOW TABLES LIKE 'chennai_cities' |
| COMMIT |
| SELECT COUNT(*) FROM `chennai_cities` WHERE (1 = 1) AND (`chennai_cities`.`position` = 0) |
| UPDATE `chennai_cities` SET `position` = NULL, `updated_at` = '2021-10-17 02:03:51' WHERE `chennai_cities`.`id` = 94 |
| BEGIN |
| UPDATE `chennai_cities` SET `position` = (`chennai_cities`.`position` - 1), `updated_at` = '2021-10-17 02:03:51' WHERE (1 = 1) AND (`chennai_cities`.`position` > 4) |
| SHOW KEYS FROM `chennai_cities` |
| SHOW TABLES LIKE 'chennai_cities' |
| COMMIT |
| SELECT COUNT(*) FROM `chennai_cities` WHERE (1 = 1) AND (`chennai_cities`.`position` = 0) |
| UPDATE `chennai_cities` SET `position` = NULL, `updated_at` = '2021-10-17 02:03:51' WHERE `chennai_cities`.`id` = 93 |
| BEGIN |
| UPDATE `chennai_cities` SET `position` = (`chennai_cities`.`position` - 1), `updated_at` = '2021-10-17 02:03:51' WHERE (1 = 1) AND (`chennai_cities`.`position` > 3) |
| SHOW KEYS FROM `chennai_cities` |
| SHOW TABLES LIKE 'chennai_cities' |
| COMMIT |
| SELECT COUNT(*) FROM `chennai_cities` WHERE (1 = 1) AND (`chennai_cities`.`position` = 0) |
| UPDATE `chennai_cities` SET `position` = NULL, `updated_at` = '2021-10-17 02:03:51' WHERE `chennai_cities`.`id` = 92 |
| BEGIN |
| UPDATE `chennai_cities` SET `position` = (`chennai_cities`.`position` - 1), `updated_at` = '2021-10-17 02:03:51' WHERE (1 = 1) AND (`chennai_cities`.`position` > 2) |
| SHOW KEYS FROM `chennai_cities` |
| SHOW TABLES LIKE 'chennai_cities' |
| COMMIT |
| SELECT COUNT(*) FROM `chennai_cities` WHERE (1 = 1) AND (`chennai_cities`.`position` = 0) |
| UPDATE `chennai_cities` SET `position` = NULL, `updated_at` = '2021-10-17 02:03:51' WHERE `chennai_cities`.`id` = 91 |
| BEGIN |
| UPDATE `chennai_cities` SET `position` = (`chennai_cities`.`position` - 1), `updated_at` = '2021-10-17 02:03:51' WHERE (1 = 1) AND (`chennai_cities`.`position` > 1) |
| SHOW KEYS FROM `chennai_cities` |
| SHOW TABLES LIKE 'chennai_cities' |
| SELECT `chennai_cities`.* FROM `chennai_cities` |
| SELECT argument FROM mysql.general_log order by event_time desc limit 100 |
| SELECT argument FROM mysql.general_log order by event_time desc limit 100
So there are two things that can be seen from the above experiment.
sequential_updates?
as a class level object to prevent multiple invocations. Background on how much impact for schema cache query:
During deployments several process restarts at the same time resulting in initialisation of rails application which fires up several show table queries. Such parallel show table queries spike up DBs CPU and results in a severe performance degradation of all the queries. lets say 20 instances has 10 passenger process that has around 10 shards (multiple databases) so that comes roughly around 2000 show table queries.
So, even to prevent any show table queries at all in general from rails perspective, i have added a some improvements on top of existing schema cache class to load schema cache data from offline dump which would prevent any show table queries at all during process startup. So calling connection.schema_cache.table_exists?
would refer to schema from the offline dump instead of firing even a single time.
i have taken some call graphs of this experiments but not able to attach in here.
Let me know your thoughts @brendon would be happy to collaborate. :)
Attached Call graph for remove_from_list operation https://github.com/gurukannappa/readme/blob/master/output.svg
Thanks @gurukannappa, I had no idea this was happening! It's pretty bad given it affects people who don't even want to use sequential updates.
I think we should fix the issue properly by generating checking for the index only once per app boot, then caching it with a class variable @@
rather than an instance variable. So there is a couple of tasks:
acts_as_list
call.acts_as_list
doesn't support multiple calls to acts_as_list
on one class so this cache key will be fine.active_record_version_is?
which is an instance method currently. This could be added as a class method on the calling class (not ActiveRecord::Base
) so we can access it at the class level. Other usages would need to be adjusted to call self.class.active_record_version_is?
.I'm comfortable with this occurring once per instance of the app still vs caching it offline. It's less messy that way :)
Are you happy to look into this? My initial trial attempt is here: https://github.com/brendon/acts_as_list/tree/fix-sequential-updates
Closing this for now as I don't have the time to work on it myself. You might like to try https://github.com/brendon/positioning instead. Conversion from acts_as_list
should be fairly straightforward.
Background:
Executing
connection.data_source_exists?
orconnection.table_exists?
will fireSHOW TABLES LIKE 'table_name'
query. When this is invoked n times, then n number of show table queries will be fired.Enhancement:
Starting ActiveRecord version >= 3.2 schema cache class is used as a gateway for executing all show table queries. Example:
connection.schema_cache.table_exists?
will fire show query table once and will cache the results in its objects. Further invocations of show table queries are referred from the objects rather than firing every time to the database.Benefits
For a larger production system with several databases, this would reduce several show table queries getting fired to databases and reduces the load on the system.
PS: My First OpenSource pull request please let me know for any feedbacks. :)