Mines-Paristech-Students / Portail-des-eleves

The Mines Paristech's student life management application
23 stars 6 forks source link

[RFC] Migrating database of portail V1 to new version #349

Open alexianelaude opened 2 years ago

alexianelaude commented 2 years ago

Summary

This document details a migration plan to transfer part of the old portail's database to the new version. In particular, we will explicit which tables should be migrated, and how we intend on proceeding. For some tables, a discussion on whether or not we should migrate is required.

Motivation

The old version of the 'portail' has been running since ~2010, and we do not wish to wipe out all the history of this website when deploying the new version. In particular, it is necessary that all users keep their account, along with the information associated with their profile, so as not to exclude any 'mineur' from this students' website. It would also be useful to migrate the current associations and their members (current and past), which would spare all associations from recreating their page from scratch. Finally, we should consider migrating media, polls, past events, chat messages, etc, to which some users are emotionally attached. This is important if we want all users to turn towards the new version without regretting the past one.

Guide-level explanation

In this section, we will discuss which tables should be migrated, which tables might require a migration and which tables shouldn't be migrated.

To be migrated:

Could be migrated:

Should not be migrated

More anecdotic tables

Reference-level explanation

The method we have chosen to run the migration is to generate SQL dumps of the v1 portail database and run them on the database of the new portail, once deployed. We will not be creating tables/ sequences/ ... but only inserting values inside the v2 database. We will proceed table by table as listed above.

Because model structures between the old and new version differ in some ways, we will first run SQL queries on the v1 tables in order to:

Some tables might require more complex manipulations, which can be run using a Python script. The script could either directly modify the .sql file, or we could export the table in .csv format instead, which might be easier to manipulate in Python. This will especially be needed to populate the current_academic_year and student_type fields of the authentication_user table from v2. Indeed, the value of current_academic_year can be deduced by computing today.YEAR - year_of_entry, but must then be converted into a string based on 5 different cases ('1A', '2A', 'cesurien', '3A', 'diplomé') (NB: we will have to assume all students have done a gap year, because we are missing the information). The student_type can be deduced using the boolean values est_AST and est_isupfere from table V1.

Users should also simultaneously be created in the SSO server database. Because it also uses a sql database, the same method can be applied.

Drawbacks, rationale and alternatives

There is no reason not to migrate users and their profile information.

For the rest, a lot of data from the old portail is outdated, including a lot of media, events, etc but even some associations. The deployment of the new version could be an opportunity to filter out useless data which tends to pollute the website. In this case an automatic migration is perhaps not the best solution, and we could instead ask association members to choose for their own association which data they wish to keep.

Unresolved questions

See "Guide-level-explanation" for details on tables whose migration is open to debate. This RFC can in any case be run in separate phases (each table can be migrated independently from the others, provided there is no foreign key link).