Open jeremy-thomas-roc opened 1 year ago
A possibly related issue: when my Vertex jobs have ended (crashed, or I ended them), Prefect doesn't notice and keeps the flow runs in the Running state. If I then cancel them, they stay in "Canceling" forever. This behavior seems worse in Prefect 3 than it was in Prefect 2, I believe.
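To make the mismatch concrete, here is a rough sketch of the reconciliation I would expect somewhere in the worker. The `JOB_STATE_*` names are Vertex AI job states; the mapping to Prefect state names and the helper `reconcile_state` are my own assumptions for illustration, not Prefect's actual implementation.

```python
# Hypothetical mapping from terminal Vertex AI job states to Prefect state
# names. The right-hand values are an assumption made for this sketch.
TERMINAL_VERTEX_STATES = {
    "JOB_STATE_SUCCEEDED": "Completed",
    "JOB_STATE_FAILED": "Crashed",
    "JOB_STATE_CANCELLED": "Cancelled",
    "JOB_STATE_EXPIRED": "Crashed",
}


def reconcile_state(vertex_state: str, prefect_state: str) -> str:
    """Return the state the flow run should have, given the Vertex job state."""
    if vertex_state not in TERMINAL_VERTEX_STATES:
        # The job is still queued or running, so leave the flow run alone.
        return prefect_state
    if prefect_state == "Cancelling":
        # The job is already gone, so the cancellation can be finalized.
        return "Cancelled"
    return TERMINAL_VERTEX_STATES[vertex_state]
```

With something like this, a run stuck in Running whose job reports `JOB_STATE_FAILED` would move to Crashed, and a run stuck in Cancelling whose job has already ended would move to Cancelled.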
As the title states, when a flow run is manually canceled, for example through the UI, the training job keeps running in Vertex. The user then has to cancel the training job manually in Vertex; otherwise it runs indefinitely.
Expectation / Proposal
Canceling a flow run should also cancel the corresponding Vertex training job.
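A minimal sketch of the behavior I have in mind, assuming the worker knows the full resource name of the job it launched (`projects/.../locations/.../customJobs/...`). The `JobClient` protocol and `cancel_vertex_job` helper are hypothetical names; the only real call assumed here is `cancel_custom_job(name=...)`, which `JobServiceClient` in `google-cloud-aiplatform` provides. How this would hook into Prefect's cancellation path is exactly the part I don't know.

```python
from typing import Any, Protocol


class JobClient(Protocol):
    """Hypothetical minimal interface for this sketch.

    google.cloud.aiplatform_v1.JobServiceClient has a method of this shape.
    """

    def cancel_custom_job(self, *, name: str) -> Any: ...


def cancel_vertex_job(client: JobClient, job_name: str) -> bool:
    """Ask Vertex to cancel a custom job; return True if the request was sent."""
    try:
        client.cancel_custom_job(name=job_name)
    except Exception as exc:  # broad on purpose: this is best-effort cleanup
        print(f"Could not cancel {job_name}: {exc}")
        return False
    return True
```

Passing the client in keeps the helper easy to exercise with a stub, without credentials or a real Vertex project.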
Traceback / Example
Not sure how to provide an example, but all of our jobs run on this infrastructure and it occurs on all of them, so I am confident the error is within the infrastructure block and not elsewhere in our workflow.
I'd be happy to help, but I'm not sure I have the technical expertise to dive into Prefect's cancellation system and workflow. If the bug is confined to this repo, I may be able to figure it out, but I may need a pointer in the right direction.