Use object orientated hazenlib

laurencejackson commented 2 years ago

Uses external hazenlib instead of internally copied one.

sophie22 commented 2 years ago

@laurencejackson could you please provide a brief description of the 'new way' to interact with Hazen via the tasks.py? Mostly interested in how to pass arguments, such as the list of filepath to the hazen tasks and how to pass task_variables, such as slice width to SNR task?

laurencejackson commented 2 years ago

@sophie22 , the new way is identical to the old way! but unlucky for us the old way didn't pass a slice width either.

The thing you want to look at is the main function in https://github.com/GSTT-CSC/hazen/blob/webapp-core-tasks/hazenlib/__init__.py

In the webapp, when a task is executed the dicom paths are passed to produce_report as image_files. The simple solution for now is to pass an additional optional argument of slice_width and when we call this task we copy how it works in hazenlib using an if condition. So in reports/routes.py we add something like:

slice_width: float = <get from UI or db>

flash(f'Starting process: {task_name}', 'info')

if task_name == 'snr':
    celery_job = produce_report.delay(task_name=task_name, user_id=current_user.id, image_files=image_files, series_id=series.id, slice_width=slice_width)
else:
    celery_job = produce_report.delay(task_name=task_name, user_id=current_user.id, image_files=image_files, series_id=series.id)

current_app.logger.info(f"Task performed successfully!")
flash(f'Completed processing: {task_name}')
current_app.logger.info(celery_job)

and in produce_report we modiy so that:

if task_name == 'snr':
    result = task.run(slice_width)
else:
    result = task.run()

I know this isn't the prettiest solution but while we focus on MVP it should be fine. If you have an alternative neater idea when please try it out!

sophie22 commented 2 years ago

Thank you, this is really helpful! Another question I have is whether the celery_job somehow stores the result of the task? I think when you were debugging it this result dictionary was available in a field. Could you tell me what it was called?

Reason I'm asking is because I want to move the database update part fully from tasks.py to routes.py but for that I need the result somehow.

laurencejackson commented 2 years ago

Correct, the produce_report.delay method returns an object where the result property contains the dictionary you're looking for. But before doing that it might be worth discussing why you want to move the database update out of the celery framework. My understanding is that we want it in there so we can take advantage of the asynchronous processing of multiple jobs that celery provides?

sophie22 commented 2 years ago

I see, so it makes sense to keep it in tasks.py and the corresponding update in the Series table of the field 'has_result' should also be moved to the tasks.py so they are updated together when the task has finished.

The next page in the UI flow is to redirect to the result page that displays the result values from the Hazen task. at the minimum, I need the newly created reports ID from the db to know what to display to the user. Is there any way Celery can communicate that?

laurencejackson commented 2 years ago

For the first bit, yes that makes sense.

For the second bit, currently, we return just the result dictionary as below. You should be able to return additional variables if required. I would expect anything returned here to be available in the result property of the returned AsyncResult object.

If you have any more questions we should continue this discussion in an issue rather than a closed PR! If you need to, open an issue and reference #71.

GSTT-CSC / hazen-web-app

Use object orientated hazenlib #71