Open ccjoel opened 11 months ago
Re https://github.com/jataware/dojo/issues/202 and this:
I explored using GET /job/{uuid}/{job_name} instead of GET /job/fetch for the Data Modeling processing step. What I found is that the job statuses (which I believe come from RQ) are not consistent enough to make this helpful. For example, when the flowcast job returns an error, it reports the status finished, but the results object is populated with { error: '', message... }.
If we want to make a wholesale switch to using this endpoint (which I'm in support of, as I think it will make things more standardized and clearer), we should also put in some effort to make upstream changes to have consistent statuses returned by the various jobs. This may also involve some work on the Flowcast and Elwood jobs to raise errors appropriately and then respond with the correct statuses, as well as making sure these stay consistent throughout the entire flow.
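Until the jobs report consistent statuses, a caller could guard against the "finished but actually errored" case described above. The following is only a minimal Python sketch of that check: the response shape (a dict with `status` and `result` keys) and the helper name `effective_status` are assumptions made for illustration, not the actual Dojo API contract.

```python
from typing import Any, Dict


def effective_status(job: Dict[str, Any]) -> str:
    """Return the status a caller should act on.

    Assumed (hypothetical) response shape: {"status": str, "result": Any}.
    A job that reports "finished" while its result carries an "error" key
    is treated as failed, mirroring the flowcast behaviour described above.
    """
    status = job.get("status", "unknown")
    result = job.get("result")
    if status == "finished" and isinstance(result, dict) and "error" in result:
        return "failed"
    return status
```

This is a stopgap on the consuming side; the upstream fix proposed above (jobs raising errors so that the correct status is reported) would make it unnecessary.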
Summary
During Dataset Registration:
Handling job status and errors has been problematic on the Dataset Transformation step.
Reason:
The UI fetches the status/result of the Dataset Transform jobs using a different endpoint than the rest of the jobs (i.e. compared to file_conversion, run_elwood, etc.), which limits error handling. The Dojo UI uses the /job/fetch/id call to wait for the jobs to finish, instead of the /job/{dataset_id}/{job_name} endpoint used by the rest of the jobs in the flow.
The /job/fetch endpoint is intended to fetch the final result, but does not contain data on the job status (is it stuck in queued? started? finished? error?).
We're not sure if this was an oversight, or if there was a limitation using the previous patterns. We can look into this, with the possibility of increasing stability on this page if we replace the endpoints/handling on this step.
Proposed Solution
Use the new GET /job/uuid/job_name endpoint, added in #201.
We should migrate to the new endpoints instead of the /job/fetch ones, in order to follow the job status alongside the result. These new, side-effect-free GET endpoints in the Dojo API should allow us to better handle errors on the Dataset Transform page.
Changing the endpoint called from the UI is trivial, but changing how we handle the result requires additional code changes: the result won't be the main response body, but nested; the status property will indicate whether the job is queued, started, or finished, as well as contain the error if it has failed.
More Details | Noise
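To make the result-handling change from the Proposed Solution more concrete, here is a rough Python sketch of a status-aware wait loop. It is not the Dojo UI code: the response fields (`status`, `result`, `error`), the helper name `wait_for_job`, and the injected `fetch_job` callable (standing in for the GET request) are all assumptions chosen for illustration.

```python
import time
from typing import Any, Callable, Dict


def wait_for_job(
    fetch_job: Callable[[], Dict[str, Any]],
    interval: float = 1.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Poll until the job finishes, then return the nested result.

    fetch_job is a hypothetical stand-in for calling the status endpoint.
    Assumed response shape: {"status": ..., "result": ..., "error": ...}.
    """
    for _ in range(max_attempts):
        job = fetch_job()
        status = job.get("status")
        if status == "finished":
            # The result is nested in the body, not the body itself.
            return job.get("result")
        if status == "failed":
            raise RuntimeError(f"job failed: {job.get('error')}")
        # Any other status (e.g. queued, started): wait and poll again.
        sleep(interval)
    raise TimeoutError("job did not reach a terminal status in time")
```

Passing the fetch function in keeps the sketch independent of any HTTP client; the point is that queued, started, finished, and failed each get an explicit branch, which the /job/fetch response cannot support.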
On the Dataset Register page, we have multiple long-running tasks, which we queue and then wait on until they finish.
For all the register-related jobs, we call this endpoint to enqueue them:
POST /job/id/task_name.fn_name
Example:
For these same jobs, we await the result using the same URL:
For the dataset transformation page, the latter job-await endpoint is replaced with the following pattern:
Note the /fetch/ in the path, as well as the underscore _ that merges the URI path segments of the dataset uuid and job_name.
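The difference between the two URL shapes can be sketched as a pair of small Python helpers. The function names are hypothetical, and the exact form of the fetch path is inferred from the description above (a /fetch/ segment, with the dataset uuid and job name joined by an underscore) rather than copied from the Dojo codebase.

```python
def job_status_path(dataset_id: str, job_name: str) -> str:
    """Path shape used by the rest of the registration jobs."""
    return f"/job/{dataset_id}/{job_name}"


def job_fetch_path(dataset_id: str, job_name: str) -> str:
    """Path shape assumed for the dataset transformation step:
    uuid and job name merged into one segment with an underscore."""
    return f"/job/fetch/{dataset_id}_{job_name}"
```

With a placeholder id `abc` and job name `task_name.fn_name`, the first helper yields `/job/abc/task_name.fn_name` and the second yields `/job/fetch/abc_task_name.fn_name`.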