Galarzaa90 / NabBot

Discord bot for Tibia servers
https://nabbot.xyz
Apache License 2.0

Migration improvements #151

Closed Galarzaa90 closed 5 years ago

Galarzaa90 commented 5 years ago

Migrating a SQLite database was a very long task: migrating the current database (14k characters, 377k level ups, 127k deaths) took up to half an hour.

A brief explanation of how I was doing things before and after:

Before

  1. Iterate through SQLite character records, inserting them one by one into PostgreSQL and saving each newly created id
    • If a name conflict is found, the duplicate entry is ignored, but its id is saved so it can be joined with the original entry
    • A pair of queries is executed inside the loop to fetch the character's deaths and level ups, gathering a list associated with their new id
  2. Deaths are sorted by date, in ascending order
  3. Iterate through every death
    • For every death, query the new database to check if it already exists, and skip it if it does
    • Execute an insert query for every non-conflicting death.
  4. Level ups are sorted by date, in ascending order
  5. Iterate through every level up
    • For every level up, query the new database to check if it already exists, using a 15-second margin
    • Execute an insert query for every non-conflicting level up.

The rest of the migration process handles very little data in comparison, so it probably won't be optimized... much
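The old row-by-row pattern looked roughly like this. This is a minimal, self-contained sketch: in NabBot the target is PostgreSQL via asyncpg, but here both databases are in-memory sqlite3, and all table names, column names, and sample rows are made up:

```python
import sqlite3

# Stand-ins for the two databases: the real source is SQLite and the real
# target is PostgreSQL, but in-memory sqlite3 keeps the sketch runnable.
old = sqlite3.connect(":memory:")
new = sqlite3.connect(":memory:")
old.execute("CREATE TABLE chars (id INTEGER PRIMARY KEY, name TEXT)")
old.execute("CREATE TABLE deaths (char_id INTEGER, date INTEGER, level INTEGER)")
new.execute("CREATE TABLE chars (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
new.execute("CREATE TABLE deaths (char_id INTEGER, date INTEGER, level INTEGER)")
old.executemany("INSERT INTO chars VALUES (?, ?)",
                [(1, "Galarzaa Fidera"), (2, "Tschas"), (3, "Tschas")])
old.executemany("INSERT INTO deaths VALUES (?, ?, ?)",
                [(1, 1000, 100), (2, 2000, 50), (3, 2000, 50)])

# Step 1: insert characters one by one, keeping the new id; on a name
# conflict, reuse the id of the entry that is already there.
id_map = {}
for old_id, name in old.execute("SELECT id, name FROM chars").fetchall():
    row = new.execute("SELECT id FROM chars WHERE name = ?", (name,)).fetchone()
    if row:
        id_map[old_id] = row[0]
        continue
    cur = new.execute("INSERT INTO chars (name) VALUES (?)", (name,))
    id_map[old_id] = cur.lastrowid

# Steps 2-3: sort deaths by date, then run one existence check plus one
# insert *per row* -- two round trips for each of 127k deaths.
for char_id, date, level in old.execute(
        "SELECT char_id, date, level FROM deaths ORDER BY date").fetchall():
    new_id = id_map[char_id]
    if new.execute("SELECT 1 FROM deaths WHERE char_id = ? AND date = ?",
                   (new_id, date)).fetchone():
        continue  # duplicate death, skip it
    new.execute("INSERT INTO deaths VALUES (?, ?, ?)", (new_id, date, level))
```

Over a real network connection to PostgreSQL, each of those per-row round trips costs far more than the insert itself, which is where the half hour went.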

After

  1. Remove duplicate characters from old database, merging their child rows
  2. Remove duplicate deaths and orphaned deaths
  3. Remove duplicate level ups and orphaned level ups
  4. Iterate through the SQLite characters
    • Generate a mapping of character names to old character ids, e.g. "Galarzaa Fidera" -> 1
    • A list of tuples with character data is generated
  5. The list of tuples is passed to the copy_records_to_table function, which does a bulk insert (this is the real performance gain)
  6. The new characters table is fetched entirely
  7. The results are iterated to build a mapping of old character ids to new character ids. For instance, if Tschas had id 243 and now has id 231 in the migrated data, the mapping associates 243 -> 231
  8. Iterate through the SQLite deaths, gathering their data into a list of tuples
    • The id map is used to find which character id the death belongs to in the new data.
    • Since deaths in the new database have a child table called character_death_killers, an incrementing variable assigns each death's id to its killer entries. This has no way of knowing whether death ids actually start at 1 or whether a death failed to save for some reason, which could potentially corrupt all the data.
  9. Use copy_records_to_table to bulk insert deaths
  10. Use copy_records_to_table to bulk insert death killers
  11. Iterate through the SQLite level ups, gathering their data into a list of tuples
    • The id map is used to find which character id the level up belongs to in the new data.
  12. Use copy_records_to_table to bulk insert level ups.
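The id remapping in steps 6-8 can be sketched in isolation. All names and sample rows below are invented for illustration; only the shape of the mappings matters:

```python
# Step 4: name -> old id mapping, built while iterating the SQLite characters.
name_to_old_id = {"Galarzaa Fidera": 1, "Tschas": 243}

# Steps 6-7: after the bulk insert, fetch the whole new characters table
# (hard-coded here as a stand-in) and map each old id to its new id.
new_chars = [(231, "Tschas"), (500, "Galarzaa Fidera")]  # (new_id, name) rows
old_to_new = {name_to_old_id[name]: new_id for new_id, name in new_chars}

# Step 8: gather death tuples, remapping character ids, and hand out death
# ids from an incrementing counter so killer rows can reference them.
# As noted above, this assumes ids start at 1 and every death gets saved.
old_deaths = [(243, "2019-01-01", 100, "a dragon"),
              (1, "2019-02-01", 350, "Lord Paulistinha")]
death_records, killer_records = [], []
for death_id, (old_char_id, date, level, killer) in enumerate(old_deaths, start=1):
    death_records.append((death_id, old_to_new[old_char_id], date, level))
    killer_records.append((death_id, killer))  # row for character_death_killers
```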

These changes reduce the migration time from 5,400 seconds to as little as 20 seconds.
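The bulk insert itself relies on asyncpg's copy_records_to_table, which streams the tuples through PostgreSQL's COPY protocol instead of issuing one INSERT per row. A minimal sketch of the call; the table and column names here are assumptions, not NabBot's actual schema:

```python
async def bulk_insert_characters(conn, rows):
    """Bulk-load character tuples through a single COPY operation.

    `conn` is an asyncpg connection; `rows` is a list of tuples whose
    element order matches `columns`. One COPY replaces thousands of
    per-row round trips.
    """
    await conn.copy_records_to_table(
        "character",                            # assumed table name
        records=rows,
        columns=["name", "level", "vocation"],  # assumed columns
    )
```

In the PR this pattern is applied four times: characters, deaths, death killers, and level ups each go in as one bulk copy.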

codeclimate[bot] commented 5 years ago

Code Climate has analyzed commit 5e8093d4 and detected 25 issues on this pull request.

Here's the issue category breakdown:

Category Count
Complexity 25

Note: there are 2 critical issues.
