CederGroupHub / alabos

AlabOS: Managing the workflows in the Autonomous lab
https://cedergrouphub.github.io/alabos/
MIT License

Alabos hot restart #75

Open odartsi opened 3 months ago

odartsi commented 3 months ago

Steps to follow:

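import time


class TaskManager:
    def __init__(self, live_time: float | None = None):
        ...
        # Maximum run time in seconds; None means run until stopped.
        self.live_time = live_time

    def run(self):
        start = time.time()
        # Return once live_time has elapsed so the process exits normally.
        while self.live_time is None or (time.time() - start) < self.live_time:
            self._loop()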

Then, in the launch_lab function, we will need to start the process again if it exits normally.
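A minimal sketch of that restart logic, assuming the manager runs in a child process and that "exits normally" means exit code 0. The names supervise, make_process and max_starts are hypothetical and are not part of alabos; the real launch_lab may be structured differently.

```python
from typing import Callable, Optional, Protocol


class ProcessLike(Protocol):
    """The subset of multiprocessing.Process that the supervisor relies on."""

    exitcode: Optional[int]

    def start(self) -> None: ...

    def join(self) -> None: ...


def supervise(
    make_process: Callable[[], ProcessLike],
    max_starts: Optional[int] = None,
) -> int:
    """Start a process and start a fresh one each time it exits with code 0.

    Stops on a non-zero exit code (crash or kill) or after max_starts starts.
    Returns how many times a process was started.
    """
    starts = 0
    while max_starts is None or starts < max_starts:
        proc = make_process()  # hypothetical factory, e.g. builds a Process
        proc.start()
        starts += 1
        proc.join()  # blocks until the child exits
        if proc.exitcode != 0:
            break
    return starts
```

With the standard library this could be called as supervise(lambda: multiprocessing.Process(target=run_task_manager, args=(3600,))), where run_task_manager is a hypothetical module-level function that builds a TaskManager with the given live_time and calls run(). Because every restart creates a new process, code changes on disk are picked up at each start.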

bernardusrendy commented 3 months ago

New problem: Tasks that have already been created and are in the WAITING/READY status have not yet been run by the dramatiq actor run_task.

Note that load_definition has not been called for those tasks.

Therefore, these WAITING tasks risk a mismatch between the parameters they were submitted with and the task definition that is loaded when they finally run.

For example, suppose we submit a sample with Heating(time=720) and it is WAITING. If we then update Heating so that it no longer accepts the time argument, the old sample will run into an error.
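The failure mode can be reproduced with two stand-in classes. This is only an illustrative sketch: HeatingV1, HeatingV2, duration_min and params_match are hypothetical names, and the check assumes the task constructor declares its parameters explicitly (a constructor that takes **kwargs would accept anything).

```python
import inspect


class HeatingV1:
    """Stand-in for the definition that existed when the sample was submitted."""

    def __init__(self, time: float):
        self.time = time


class HeatingV2:
    """Stand-in for the updated definition, which no longer accepts `time`."""

    def __init__(self, duration_min: float):
        self.duration_min = duration_min


def params_match(task_cls: type, params: dict) -> bool:
    """Return True if `params` can be passed to the constructor of `task_cls`."""
    try:
        # For a class, inspect.signature describes the constructor call
        # without `self`; bind() raises TypeError on a mismatch.
        inspect.signature(task_cls).bind(**params)
    except TypeError:
        return False
    return True


stored_params = {"time": 720}  # what the WAITING task saved at submission
```

A check like params_match(HeatingV2, stored_params) could be run over all WAITING/READY tasks right after a restart, so that mismatches are reported before any task is picked up by a worker.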

Proposed solution: This problem is fundamentally about versioning. We will solve it by keeping a local copy of each older version of a task whenever the task is updated.
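One possible shape for this, sketched in memory rather than as copies on disk: each definition is registered under a (name, version) key, a submitted task records the version it was created with, and the worker resolves the class from that stored version. All names here (register_task, submit, instantiate, the record fields) are assumptions for illustration, not the alabos API.

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry: (task name, version) -> task class.
_REGISTRY: Dict[Tuple[str, int], type] = {}


def register_task(name: str, version: int) -> Callable[[type], type]:
    """Class decorator that keeps every version of a task definition."""

    def decorator(cls: type) -> type:
        key = (name, version)
        if key in _REGISTRY:
            raise ValueError(f"{name} v{version} is already registered")
        _REGISTRY[key] = cls
        return cls

    return decorator


def latest_version(name: str) -> int:
    versions = [v for (n, v) in _REGISTRY if n == name]
    if not versions:
        raise KeyError(name)
    return max(versions)


def submit(name: str, params: dict) -> dict:
    """Create a task record pinned to the newest version at submission time."""
    return {"type": name, "version": latest_version(name),
            "parameters": params, "status": "WAITING"}


def instantiate(record: dict):
    """Build the task from the version stored in the record, not the newest."""
    cls = _REGISTRY[(record["type"], record["version"])]
    return cls(**record["parameters"])


@register_task("Heating", 1)
class HeatingV1:
    def __init__(self, time: float):
        self.time = time


@register_task("Heating", 2)
class HeatingV2:
    def __init__(self, duration_min: float):
        self.duration_min = duration_min
```

A record created while only version 1 existed, such as {"type": "Heating", "version": 1, "parameters": {"time": 720}}, still resolves to HeatingV1 after version 2 is added, while new submissions are pinned to version 2.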

bernardusrendy commented 3 months ago

TODOs:

  1. Following this update, alab_one device definition should be updated to not contain any threading.
  2. More to come...