NearNodeFlash / NearNodeFlash.github.io

View this document https://nearnodeflash.github.io/
Apache License 2.0

Distinguishing fatal errors from transient errors #46

Closed by jameshcorbett 9 months ago

jameshcorbett commented 1 year ago

There needs to be a way for Flux to distinguish fatal errors from transient errors. When a fatal error occurs, Flux should be able to report back to the users immediately that their jobs hit a fatal error. Users will want to know the difference between "something took too long so we gave up" and "we hit a hard error, you [the user] may have done something wrong."
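As a rough illustration of the distinction requested above, here is a minimal sketch. It assumes each reported error carries a severity tag. `Severity`, `WorkflowError`, and `next_action` are hypothetical names made up for this example; they are not part of the NNF or Flux APIs.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    """Hypothetical severity levels; not taken from the NNF or Flux APIs."""
    TRANSIENT = "transient"  # may clear up if the operation is retried
    FATAL = "fatal"          # retrying will not help


@dataclass
class WorkflowError:
    """Hypothetical error record that a workflow might expose to the scheduler."""
    message: str
    severity: Severity


def next_action(error, elapsed_s, timeout_s):
    """Decide what a scheduler could do with the most recent reported error."""
    if error is None:
        return "continue"
    if error.severity is Severity.FATAL:
        # Hard error: report to the user right away instead of waiting for the timeout.
        return "fail: hard error: " + error.message
    if elapsed_s >= timeout_s:
        # Transient error that never cleared: "took too long so we gave up".
        return "fail: timed out: " + error.message
    return "retry"
```

With a tag like this, the two failure messages users see stay distinct: a fatal error fails the job on the first report, while a transient one fails only once the timeout expires.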

matthew-richerson commented 9 months ago

This change was included in nnf-deploy release v0.0.4.