Clinical-Genomics / trailblazer

Keep track of and manage analyses
MIT License
5 stars 2 forks source link

Bug: tower analysis status update #434

Closed seallard closed 2 months ago

seallard commented 2 months ago

Tower analyses are reported incorrectly as cancelled.

https://github.com/Clinical-Genomics/trailblazer/pull/423 did not reflect that the logic for updating the status of an analysis from its constituent jobs depends on the workflow manager.

Basically, the Tower statuses differs from the slurm ones and requires a separate translation.

There is a hard coded assumption that analyses run via Tower has a separate QC step. So once an analysis run in Tower completes, the normal logic does not apply - instead the status of the analysis is set to QC. Here is the relevant logic

    @property
    def status(self) -> str:
        """Returns the status of an analysis (also called workflow in NF Tower)."""
        status: str = TOWER_WORKFLOW_STATUS.get(
            self.response.workflow.status, TrailblazerStatus.ERROR
        )

        # If the whole workflow (analysis) is completed set it as QC instead of COMPLETE
        if status == TrailblazerStatus.COMPLETED:
            return TrailblazerStatus.QC
        return status

I think this is poor design. It tightly couples the logic of how we define the status of an analysis based on the status of its jobs to the workflow manager used. In the future, are we sure that all analyses via Tower will have a separate QC step like this? Is this how we want to model that?

Alternatives: Given that the output path is known, can a QC job be scheduled as part of the analysis from the get-go?

seallard commented 2 months ago

Quick patch: ensure the status update uses the old Tower logic.