0xPolygonZero / zk_evm

Apache License 2.0

Implement graceful shutdown of proving tasks in case of an error #584

Open atanmarko opened 1 month ago

atanmarko commented 1 month ago

If we prove a range of blocks and one block fails, there is no need to keep running the other parallel block-proving tasks. Abort execution gracefully.

Nashtare commented 1 month ago

This should also include the abort_signal carried as an AtomicBool in the worker processes. We implemented it on the evm_arithmetization side but never finished the higher-level logic in zero-bin that propagates the shutdown signal upon error.

atanmarko commented 1 month ago

> This should also include the abort_signal carried as an AtomicBool in the worker processes. We implemented it on the evm_arithmetization side but never finished the higher-level logic in zero-bin that propagates the shutdown signal upon error.

We have two separate scenarios here: remote workers and in-memory. An AtomicBool would only work for the in-memory scenario. Aborting remote work jobs requires non-trivial changes to Paladin.
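For the in-memory scenario, the idea can be sketched roughly as follows. This is a minimal illustration using only the Rust standard library, not the zero-bin or evm_arithmetization code: `prove_block`, `prove_range`, `ProveError`, and the `failing` parameter are hypothetical names introduced here, and the proving work is simulated with a short sleep.

```rust
use std::ops::Range;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;
use std::time::Duration;

#[derive(Debug, Clone, PartialEq)]
enum ProveError {
    /// The block's own proof failed (the root cause).
    Failed(u64),
    /// The task stopped early because another block had already failed.
    Aborted(u64),
}

/// Hypothetical stand-in for proving one block. `failing` marks the block
/// whose proof is simulated to fail.
fn prove_block(block: u64, failing: Option<u64>, abort: &AtomicBool) -> Result<u64, ProveError> {
    for _step in 0..5 {
        // Checkpoint: return early if some other task has raised the flag.
        if abort.load(Ordering::Relaxed) {
            return Err(ProveError::Aborted(block));
        }
        if failing == Some(block) {
            return Err(ProveError::Failed(block));
        }
        thread::sleep(Duration::from_millis(1)); // simulated proving work
    }
    Ok(block)
}

/// Proves every block of the range on its own thread. The first real failure
/// raises the shared flag so that the remaining tasks stop at their next checkpoint.
fn prove_range(blocks: Range<u64>, failing: Option<u64>) -> Result<Vec<u64>, ProveError> {
    let abort = AtomicBool::new(false);
    let results: Vec<Result<u64, ProveError>> = thread::scope(|s| {
        let handles: Vec<_> = blocks
            .map(|block| {
                let abort = &abort;
                s.spawn(move || {
                    let res = prove_block(block, failing, abort);
                    if matches!(res, Err(ProveError::Failed(_))) {
                        abort.store(true, Ordering::Relaxed);
                    }
                    res
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("proving thread panicked"))
            .collect()
    });
    // Report the root-cause failure rather than a secondary `Aborted` error.
    if let Some(err) = results.iter().find_map(|r| match r {
        Err(e @ ProveError::Failed(_)) => Some(e.clone()),
        _ => None,
    }) {
        return Err(err);
    }
    results.into_iter().collect()
}
```

In this sketch, `prove_range(0..4, Some(2))` returns `Err(ProveError::Failed(2))`, while the other tasks either finish or stop with `Aborted` at their next checkpoint. The flag lives in the orchestrator's address space, which is why this approach alone does not reach remote workers.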

Nashtare commented 1 month ago

> An AtomicBool would only work for the in-memory scenario

Why wouldn't it work with AMQP? Proof jobs are programmed to be cancelled and to return an error early upon reading a kill signal at specific points of the proving flow. I'd assume Paladin already supports sending notifications to existing workers, in which case that's all we need for this to work? But I'm happy to stand corrected, in which case we should open a related ticket on Paladin and here.

atanmarko commented 1 month ago

> I'd assume Paladin already supports sending notifications to existing workers, in which case that's all we need for this to work?

I looked at this topic briefly a few months ago. As I recall, there is no such mechanism; it would require API changes. When you run a task, you need a new way to dynamically pass the abort signal to the running task.
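To illustrate what "dynamically passing the abort signal" could look like on the worker side, here is a rough sketch under heavy assumptions. It is not Paladin's API: `Control`, `spawn_abort_listener`, and `run_task` are hypothetical names, and a `std::sync::mpsc` channel stands in for whatever transport (for example, an AMQP control queue) would deliver the message to a remote worker. A listener thread turns incoming abort messages into a worker-local AtomicBool that the running task polls.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{self, Receiver};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical control message sent by the orchestrator to a worker.
enum Control {
    Abort,
}

/// Worker side: bridges control messages to a local flag. The listener
/// exits once the sending side of the channel is dropped.
fn spawn_abort_listener(rx: Receiver<Control>) -> (Arc<AtomicBool>, thread::JoinHandle<()>) {
    let flag = Arc::new(AtomicBool::new(false));
    let listener_flag = Arc::clone(&flag);
    let handle = thread::spawn(move || {
        // `recv` returns `Err` when the channel is closed, ending the loop.
        while let Ok(Control::Abort) = rx.recv() {
            listener_flag.store(true, Ordering::Relaxed);
        }
    });
    (flag, handle)
}

/// Hypothetical running task that polls the local flag between steps.
fn run_task(abort: &AtomicBool, steps: usize) -> Result<usize, &'static str> {
    for _ in 0..steps {
        if abort.load(Ordering::Relaxed) {
            return Err("aborted");
        }
        thread::sleep(Duration::from_millis(1)); // simulated work
    }
    Ok(steps)
}
```

The point of the sketch is the shape of the change: the task keeps checking a plain local flag, and only the listener needs to know how the signal arrives. Whether Paladin can host such a listener without API changes is exactly the open question in this thread.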