Open odartsi opened 4 months ago
New problem:
Tasks that have already been created and are in the WAITING/READY status have not yet been run in the dramatiq actor run_task.
Note that load_definition has not been called for those tasks.
Therefore, the parameters of these WAITING tasks risk no longer matching the task definition they were submitted against.
For example, suppose we submit a sample with Heating(time=720) and it is WAITING. If we then update Heating so that it no longer accepts the time argument, the old sample will run into an error.
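A minimal sketch of this failure mode, assuming the parameters are stored as a plain dict at submission time and only passed to the task class when the task actually runs. HeatingV1, HeatingV2, duration_minutes, and stored_parameters are hypothetical stand-ins, not the real alab_one definitions:

```python
# Hypothetical stand-ins for two versions of the same task definition.
class HeatingV1:
    """Definition in place when the sample was submitted."""

    def __init__(self, time: float):
        self.time = time


class HeatingV2:
    """Updated definition that no longer accepts `time` (new name is assumed)."""

    def __init__(self, duration_minutes: float):
        self.duration_minutes = duration_minutes


# Parameters recorded at submission while the task sits in WAITING.
stored_parameters = {"time": 720}

# Works against the definition the task was submitted with.
old_task = HeatingV1(**stored_parameters)

# Fails once the definition has changed underneath the WAITING task.
try:
    HeatingV2(**stored_parameters)
    mismatch_error = None
except TypeError as exc:
    mismatch_error = exc

print(type(mismatch_error).__name__)
```

The error only shows up when the task is instantiated, which for a WAITING task can be long after the definition was changed.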
Proposed solution: This problem is fundamentally about versioning. We will solve it by keeping a local copy of each older version whenever a task definition is updated.
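One way to picture the versioning idea is a registry that keeps every registered version of a task definition and pins the version on the task at submission time. This is only a sketch: TaskRegistry, the integer version numbers, the HeatingOld/HeatingNew classes, and the shape of the submission record are all assumptions for illustration, not the AlabOS design.

```python
from typing import Dict, Tuple, Type


class TaskRegistry:
    """Keeps every registered version of each task class (hypothetical helper)."""

    def __init__(self) -> None:
        self._versions: Dict[Tuple[str, int], Type] = {}
        self._latest: Dict[str, int] = {}

    def register(self, name: str, task_cls: Type) -> int:
        """Store a new version without discarding older ones; return its number."""
        version = self._latest.get(name, 0) + 1
        self._versions[(name, version)] = task_cls
        self._latest[name] = version
        return version

    def latest(self, name: str) -> int:
        """Return the most recently registered version number for `name`."""
        return self._latest[name]

    def resolve(self, name: str, version: int) -> Type:
        """Look up the exact version a task was submitted against."""
        return self._versions[(name, version)]


class HeatingOld:
    def __init__(self, time: float):
        self.time = time


class HeatingNew:
    def __init__(self, duration_minutes: float):
        self.duration_minutes = duration_minutes


registry = TaskRegistry()
registry.register("Heating", HeatingOld)

# At submission, pin the version alongside the parameters.
submitted = {
    "type": "Heating",
    "version": registry.latest("Heating"),
    "parameters": {"time": 720},
}

# The definition is updated while the task is still WAITING.
registry.register("Heating", HeatingNew)

# At run time, the pinned version is resolved, so the old parameters still fit.
task_cls = registry.resolve(submitted["type"], submitted["version"])
task = task_cls(**submitted["parameters"])
```

New submissions would pick up the latest version, while tasks that are already WAITING keep running against the definition they were created with.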
TODOs:
Steps to follow:
[x] 1. Remove https://github.com/CederGroupHub/alabos/blob/main/alab_management/device_manager.py completely. We will create the device instances in each task every time the task occupies the device. (@idocx)
[ ] 2. Implement a reload option for importing the alab_one package. Currently, the alab_one package is imported into the AlabOS process via https://github.com/CederGroupHub/alabos/blob/main/alab_management/utils/module_ops.py#L12. We will need to implement something similar to the importlib.reload function. The new function should have such a signature. (@bernardusrendy)
[x] 3. Implement process restart for AlabOS. This will be done via https://github.com/CederGroupHub/alabos/blob/main/alab_management/scripts/launch_lab.py#L70. Currently, there are four processes running. We will only need to restart them at a regular interval by adding a live_time argument to each manager class. (@odartsi)
Then in the launch_lab function, we will need to start each process again if it exits normally.
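A rough sketch of how the live_time restart in step 3 could look, assuming each manager runs in its own multiprocessing.Process and exits normally once its live_time has elapsed. ManagerSketch, handle_once, run_manager, supervise, and max_starts are hypothetical names for illustration and are not taken from launch_lab.py:

```python
import multiprocessing
import time
from typing import Callable, Optional, Tuple


class ManagerSketch:
    """Hypothetical manager whose run loop returns after `live_time` seconds."""

    def __init__(self, live_time: Optional[float] = None) -> None:
        self.live_time = live_time
        self.iterations = 0

    def handle_once(self) -> None:
        # Placeholder for one pass of the manager's real work.
        self.iterations += 1
        time.sleep(0.01)

    def run(self) -> None:
        started = time.monotonic()
        # live_time=None keeps the run-forever behaviour.
        while self.live_time is None or time.monotonic() - started < self.live_time:
            self.handle_once()
        # Returning here lets the hosting process exit normally (exit code 0).


def run_manager(live_time: float) -> None:
    """Process entry point: build a manager and run it until live_time elapses."""
    ManagerSketch(live_time=live_time).run()


def supervise(target: Callable[..., None], args: Tuple, max_starts: int) -> int:
    """Start `target` again each time it exits normally; stop on a non-zero exit.

    `max_starts` bounds the loop for this sketch; a real launcher would
    presumably keep looping until it is shut down.
    """
    starts = 0
    while starts < max_starts:
        process = multiprocessing.Process(target=target, args=args)
        process.start()
        process.join()
        starts += 1
        if process.exitcode != 0:
            # Abnormal exit: do not restart silently.
            break
    return starts
```

A launcher could call something like supervise(run_manager, (3600.0,), max_starts=24) from under an `if __name__ == "__main__":` guard, one supervisor per manager process. Because every restart begins from a fresh process, each one re-imports the task and device definitions.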