I need to find out where we store all the user data, retrieve it from the old disk, and migrate it to the new server. Since we don't have an "official" server due to Arbutus limitations, I will test this process on my small instances as well.
In addition, in the future, we might want to store this data elsewhere instead of putting everything on one server.
Why:
The current procedure is to boot from a volume that contains both code and data. However, doing so means we always have to start from the old image (not a Docker image but an Arbutus cloud compute image) and therefore the old OS, in our case Ubuntu 18.04, which does not support the vGPU driver. Upgrading the OS in place was a huge pain; I suspect Arbutus does not want us to do it ourselves and has some tricky settings in sources.list. So for this migration we have to launch a blank new instance rather than boot from the old disk. If the data were stored elsewhere, we could always deploy Rodan on a blank new instance, which is both safer and more flexible.
We should manage our user data similarly to our code and back it up periodically in case something happens during instance reboots or upgrades. Storing the data separately makes it a lot easier to manage.
From my perspective, distributed computing and data management are the way to go. Separating the data storage could be the first step, to see whether it works and whether we like it.
Compute Canada suggests that we migrate from the old GPU servers to the new vGPU servers because the old hardware will be retired very soon. However, the vGPU flavors have much smaller vCPU and RAM allocations: our production server used to have 56 vCPUs and 112 GB of RAM, but the largest flavor now available to us has 8 vCPUs and 40 GB. My test VM has only 4 vCPUs and 22 GB of RAM, which works out to roughly 0.5 vCPU and 1 GB of memory per Docker container. That is dangerous: services such as Redis will shut themselves down when they do not have enough CPU or memory.
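One mitigation, assuming we keep deploying via docker-compose, is to set explicit per-container limits and reservations so a critical service like Redis is guaranteed its minimum memory instead of competing for the small slice left on the vGPU flavors. The service name and numbers below are illustrative, not our actual config:

```yaml
# Sketch only: service names and limits are placeholders.
services:
  redis:
    image: redis:6
    # Hard cap plus a reservation, so enough memory is kept for
    # Redis even on the small vGPU flavors.
    mem_limit: 512m
    mem_reservation: 256m
    cpus: 0.5
    # Let Redis evict keys instead of being OOM-killed.
    command: redis-server --maxmemory 256mb --maxmemory-policy allkeys-lru
```

The `--maxmemory` value should stay below `mem_limit` so Redis starts evicting before the kernel OOM-killer steps in.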
TODO:
[ ] Retrieve the old data (medium priority)
[ ] [Optional] Put it on a separate data volume (low priority)
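For the optional separate-disk item, a rough sketch of the OpenStack steps, assuming the standard CLI is available; the volume name, size, instance name, device node, and mount point are all placeholders:

```shell
# Sketch: create a dedicated data volume and attach it to a fresh
# instance, so Rodan can be deployed from a blank image while the
# user data lives on its own disk.
openstack volume create --size 200 rodan-user-data
openstack server add volume my-new-instance rodan-user-data

# On the instance (first attach only): format, then mount.
sudo mkfs.ext4 /dev/vdb              # device name may differ; check with lsblk
sudo mkdir -p /srv/rodan-data
sudo mount /dev/vdb /srv/rodan-data
```

With this layout, rebuilding the instance (or moving to a new flavor) only requires re-attaching the volume, not migrating the data again.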