This PR demonstrates that the mock deepssm task has access to a GPU when run from the "gpu" queue. These changes can be reverted once the real deepssm task is implemented, but for now this version can be used to test GPU availability.
To demonstrate a difference between the two queues, the `mock-deepssm` endpoint spawns a task on each queue. To persist the feedback from each task, two TaskProgress objects are used. These TaskProgress objects are not associated with any project, so one migration needs to be applied to allow `project=null` on TaskProgress objects. Since we need a migration anyway, I also added a "message" field to the model (similar to the "error" field but without the connotation of failure). The mock deepssm task saves a string to this field describing the availability of a GPU device.
Once merged, the expected behavior is as follows:
User submits a POST request to `api/mock-deepssm`
Two tasks are spawned: one deepssm task sent to the "gpu" queue and one sent to the default queue ("celery")
User receives a response similar to the following, containing two TaskProgress ids:
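The response body is not reproduced in this description; as a rough illustration, it might look like the following (the field names here are invented, and which TaskProgress id corresponds to which queue may differ in practice):

```python
import json

# Hypothetical response body; actual field names may differ.
response_text = '{"celery_task_progress": 5, "gpu_task_progress": 6}'
response = json.loads(response_text)

# Each value is the id of a TaskProgress object that can be polled.
for name, progress_id in response.items():
    print(f"{name}: api/v1/task-progress/{progress_id}")
```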
The task sent to the default queue will run and save the following message to its TaskProgress object: "DeepSSM task not implemented; testing GPU availability. GPU available = False."
The task sent to the "gpu" queue will stay in the queue until the "manage_workers" task is spawned by Celery beat, whereupon the GPU worker on AWS will be started. The GPU worker will pick up the waiting task and save a success message to its TaskProgress object, similar to the following: "DeepSSM task not implemented; testing GPU availability. GPU available = True. Found device [device_name]."
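The GPU check that produces these two messages can be sketched roughly as follows (a minimal sketch assuming the worker detects CUDA devices via PyTorch; `describe_gpu_availability` is a hypothetical helper name, not the actual task code):

```python
def describe_gpu_availability() -> str:
    """Build the status string the mock task saves to TaskProgress.message."""
    prefix = "DeepSSM task not implemented; testing GPU availability."
    try:
        # Assumption: torch is installed on the GPU worker image.
        import torch
    except ImportError:
        return f"{prefix} GPU available = False."
    if torch.cuda.is_available():
        device_name = torch.cuda.get_device_name(0)
        return f"{prefix} GPU available = True. Found device {device_name}."
    return f"{prefix} GPU available = False."
```

On the default-queue worker this falls through to the "GPU available = False." branch; on the GPU worker it reports the detected device name.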
The user can compare these results by making GET requests to `api/v1/task-progress/5` and `api/v1/task-progress/6`.