Refactoring the synchronization of PyPI. This should fix #4 and will hopefully fix #11.
The refactor uses redis to store synchronization state. The migrations pull the existing data out of the database and into redis to bootstrap the process.
The previous synchronization was optimized for bulk imports. That made sense at the time, when I was trying to get ~18k packages into the system as quickly as possible, but serious faults surfaced once things were running. The new system is optimized for the common case (i.e. changelog processing), and it should also get rid of the PostgreSQL deadlocks.
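As a rough illustration of the changelog-driven approach, the sync loop can keep a single "last processed serial" in redis and only apply newer changelog entries. This is a minimal sketch, not the actual implementation: the key names (`pypi:last_serial`, `pypi:synced`) and the function names are assumptions, and the client `r` is any redis-py-style object with `get`/`set`/`sadd`.

```python
# Hypothetical sketch of redis-backed sync state; key names and helpers
# are illustrative, not taken from this codebase.

def get_last_serial(r):
    """Return the last processed PyPI changelog serial, or 0 if none stored."""
    raw = r.get("pypi:last_serial")
    return int(raw) if raw is not None else 0

def process_changes(r, changes):
    """Apply changelog entries newer than the stored serial, then advance it.

    `changes` is an iterable of (name, version, serial, action) tuples in
    the shape returned by PyPI's changelog XML-RPC API.
    """
    last = get_last_serial(r)
    for name, version, serial, action in changes:
        if serial <= last:
            continue  # already handled on a previous run
        # ... fetch and sync the individual package here ...
        r.sadd("pypi:synced", name)  # track which packages have been seen
        last = serial
    r.set("pypi:last_serial", last)
```

Because each run only touches the packages named in new changelog entries, the common case is a handful of small updates rather than a bulk import, which is also what avoids the long multi-row transactions that were deadlocking PostgreSQL.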