Closed argush3 closed 1 week ago
Previously, for the firms data migration, we ran Prefect workflows locally, pointed at the environment of interest.
For corps, we are looking at standing up the required data migration infrastructure in GCP for the following reasons:
Some work was done as part of this ticket to figure out what is involved in getting the data migration pipelines running in GCP. The work focused mainly on the Prefect infra, as the other infra is probably simpler. Note the assumption that all BE LEAR services will be running in GCP by the time we start migrating corps data in Prod for the MVP launch.
I was able to stand up all the infra required to run a hello-world-style Prefect workflow in GCP. A detailed doc of the steps taken to set up the required infra can be found here.
Below is a summary of the GCP resources required for dev/test/prod.
Prefect Infra
Other Infra
The dev/test/prod environments will each need the temporary infra listed in the infra summary sections (Prefect + other).
The dev and test environments can have the Prefect-specific infra recreated or scaled down on an as-needed basis if cost is an issue. I have tested this, and it is fairly quick to recreate the environment from scratch via scripts. The main resources that may need to keep running are the COLIN extract Cloud SQL db and the practice target LEAR db (dev only).
Prod-related infra can be stood up when we get closer to the MVP launch.
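The per-environment policy described above can be sketched roughly as follows. This is only an illustration of the proposal, not the actual setup scripts: the resource labels (`prefect-infra`, `colin-extract-db`, `practice-target-lear-db`) and the `scale_down_plan` helper are hypothetical names, and the sketch assumes the COLIN extract db exists in all three environments while the practice target LEAR db exists in dev only.

```python
from dataclasses import dataclass

ENVIRONMENTS = ("dev", "test", "prod")


@dataclass(frozen=True)
class Resource:
    name: str           # hypothetical label, not a real GCP resource name
    group: str          # "prefect" or "other", matching the infra summary sections
    envs: tuple         # environments assumed to need this resource
    keep_running: bool  # True if it may need to stay up between migration runs


# Assumed inventory, reduced to the resources named in this ticket.
RESOURCES = (
    Resource("prefect-infra", "prefect", ENVIRONMENTS, keep_running=False),
    Resource("colin-extract-db", "other", ENVIRONMENTS, keep_running=True),
    Resource("practice-target-lear-db", "other", ("dev",), keep_running=True),
)


def scale_down_plan(env):
    """Return (keep, stop) lists of resource names for an idle environment."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env}")
    present = [r for r in RESOURCES if env in r.envs]
    if env == "prod":
        # Scaling down was only proposed for dev and test, so nothing is stopped.
        return [r.name for r in present], []
    keep = [r.name for r in present if r.keep_running]
    stop = [r.name for r in present if not r.keep_running]
    return keep, stop
```

For example, `scale_down_plan("test")` keeps only the COLIN extract db and marks the Prefect infra as safe to tear down and recreate later.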
Moving to done as analysis has been completed.
Whether the proposed temporary infra is acceptable, along with the details of the setup, will be discussed with Thor and Patrick outside of this ticket.
The data migration process will need temporary Prefect-related infrastructure to support both batch data loading and on-demand data loading for corps data.
Besides the infrastructure required for Prefect, we will also need to host an instance of the COLIN extract Postgres db as well as a test target LEAR db. Both of these will be temporary as well.
The work in this ticket is to establish the required infra and the general setup for the different environments.
TODOs