scimma / blast

Django web app for the automatic characterization of supernova hosts
MIT License

Revise transient progress reporting #238

Closed · manning-ncsa closed this issue 6 months ago

manning-ncsa commented 6 months ago

The progress bar shown on the /transients page should be based on the fraction of tasks in a transient's workflow that have completed out of the tasks that are still possible to process, rather than on the fraction of all tasks that have the status "processed". In other words, "progress" should refer to the data processing activity, not the amount of information that Blast can in general calculate for a transient.

Look at the example of transient 2024ijh: The "validate local photometry" task results in a "no valid phot" state, which precludes the processing of the "Local host SED inference" task. If all possible tasks have completed and no further data processing will take place for that transient, then the progress should be 100%.
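For concreteness, here is a minimal sketch of the proposed rule; the function, helper arguments, and status strings beyond those quoted above are hypothetical, not the actual Blast models or task names.

```python
# Hypothetical sketch of the proposed progress rule: tasks that can no longer
# run (because an upstream task ended in a blocking state such as
# "no valid phot") are dropped from the denominator, so a transient with no
# further possible processing reports 100%.
BLOCKING_STATES = {"no valid phot", "no GHOST match"}
FINISHED_STATES = {"processed"} | BLOCKING_STATES

def workflow_progress(task_statuses, blocked_tasks):
    """task_statuses: task name -> status string for one transient.
    blocked_tasks: names of tasks that cannot run because a prerequisite
    ended in a blocking state."""
    possible = [name for name in task_statuses if name not in blocked_tasks]
    if not possible:
        return 100
    finished = sum(1 for name in possible if task_statuses[name] in FINISHED_STATES)
    return round(100 * finished / len(possible))

# Example mirroring 2024ijh: "no valid phot" blocks "Local host SED inference",
# so once every other task has finished the progress reads 100%.
statuses = {
    "Validate local photometry": "no valid phot",
    "Local host SED inference": "not processed",
    "Local aperture photometry": "processed",
}
print(workflow_progress(statuses, blocked_tasks={"Local host SED inference"}))  # 100
```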

djones1040 commented 6 months ago

see PR #239

manning-ncsa commented 6 months ago

@djones1040 The get_progress() function you improved is only executed by the _run_process() function when a task is executed, and I see that the "Log transient processing status" periodic system task is disabled (app/host/system_tasks.py:263). Should we reconcile and consolidate these two progress updaters, perhaps by having the periodic task call get_progress() and re-enabling it, or does it make more sense to define a summary progress task and add it as the final task in the transient workflow (app/host/workflow.py:80)? The latter sounds cleaner, since by design the workflow's Celery tasks all execute even when the results of individual processing stages block subsequent dependent stages, so the final progress calculation would always run.
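Roughly, the second option could look like the sketch below; the task and function names other than get_progress() are hypothetical and this is not the actual code in app/host/workflow.py.

```python
from celery import chain, shared_task

@shared_task
def update_transient_progress(transient_name):
    # would look up the Transient and store the result of the real
    # get_progress() so the /transients page reflects it
    ...

def build_transient_workflow(transient_name, stage_signatures):
    """stage_signatures: the existing per-stage Celery task signatures.
    The summary task runs last, after every stage has executed."""
    return chain(*stage_signatures, update_transient_progress.si(transient_name))
```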

djones1040 commented 6 months ago

unfortunately if we add it as a final task, it will only update the progress once, so it'll only give the progress when it's 0 or 100%. The system tasks one does all transients at once, so it was getting crazy slow as the database grew; I think that one should be gone for good.

When you say "perhaps by calling get_progress() from the periodic task and enabling it", I'm not sure I understand -- is this different from what it's currently doing?

manning-ncsa commented 6 months ago

> The system tasks one does all transients at once, so it was getting crazy slow as the database grew; I think that one should be gone for good.

Roger that.

> unfortunately if we add it as a final task, it will only update the progress once, so it'll only give the progress when it's 0 or 100%.

There would still be the call to get_progress() in each task's _run_process() function, so the progress should still update with each task execution.
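The per-task update being referred to is roughly the pattern below; this is a simplification, not the actual _run_process() implementation, and the helper names are illustrative.

```python
def _run_process(self, transient):
    # run this stage's processing and record its status (illustrative helper)
    status = self._process(transient)
    # recompute overall progress after every task, so the bar on /transients
    # advances as each stage finishes
    transient.progress = get_progress(transient)
    transient.save()
    return status
```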

djones1040 commented 6 months ago

oh, I don't think we need it at the end of the workflow if it's already being run after the final task, right?

manning-ncsa commented 6 months ago

Your update looks like it has fixed this issue. I retriggered a set of recent 2024i* transients and monitored their Celery task progress while refreshing the /transients page, and the progress bars advanced properly. Workflows aborted due to "no GHOST match" or "no valid phot" (e.g. 2024inn, 2024ino, 2024inp) are now shown as completed, and they also show that the local aperture photometry stage was not blocked even though host matching failed.

[screenshot: /transients page showing the completed progress bars]

djones1040 commented 6 months ago

thanks for checking!