chamilo / chamilo-lms

Chamilo is a learning management system focused on ease of use and accessibility
https://chamilo.org
GNU General Public License v3.0
811 stars 481 forks source link

[C2] Migration to 2.0 takes too long because of resource_link.display_order #5929

Open ywarnier opened 1 week ago

ywarnier commented 1 week ago

The migration process from 1.11.x to 2.0 takes very long if there are many attendances (it may be the case for other things as well but we've seen it for attendances).

The issue stems from Gedmo trying to get the max display_order of an item type

FROM resource_link r0_ 
WHERE (
  r0_.c_id = ? AND 
  r0_.session_id = ? AND 
  r0_.usergroup_id IS NULL AND 
  r0_.group_id IS NULL AND
  r0_.user_id IS NULL AND
  r0_.resource_type_group = ?) 
AND (r0_.deleted_at IS NULL) 
LIMIT 1

We've already optimized this (which moved the needle from 7s to 1.4s) with an index, but this is not enough. On migrations where there are 100.000 records of one resource type, just migrating this resource type can take several days.

A solution to this would be to avoid the query altogether : the query just looks for the biggest display_order for a particular subset of resource_link records.

If we are migrating attendances, the resource_type_group is just "attendances" and we could build a structured, hierarchical array ([c_id][session_id][usergroup_id][group_id][user_id]) on the basis of a simple :

SELECT *
FROM resource_link r0_ 
WHERE r0_.resource_type_group = 'attendances'
AND r0_.deleted_at IS NULL

Then we could just go through the results, build the array once with the current display_order values and increase those values as we insert more records into the resource_link table.

The issue at the moment is we have not yet figured out how to supersede Gedmo to ensure it uses our method rather than the inefficient default one.