When upgrading a job, there was no option to take a savepoint before upgrading (and using it to restore)
added a flag to fix this case
When a cluster starts it tries to take a savepoint, the savepoint status only updates once it completes, this creates a situation where a new savepoint gets triggered while the previous one is still running, and it keeps happening if your savepoints won't finish quickly (forever)
I solved this with another value that holds the savepoint trigger time, and an increased savepoint timeout, so while a savepoint is still running a new one will not get triggered.
@functicons I'd love to get your feedback on this, we might want to create a stronger solution later on but for now savepoints are impossible to use with this operator if they take some time.
I found 2 problems related to savepoints:
When upgrading a job, there was no option to take a savepoint before upgrading (and using it to restore) added a flag to fix this case
When a cluster starts it tries to take a savepoint, the savepoint status only updates once it completes, this creates a situation where a new savepoint gets triggered while the previous one is still running, and it keeps happening if your savepoints won't finish quickly (forever) I solved this with another value that holds the savepoint trigger time, and an increased savepoint timeout, so while a savepoint is still running a new one will not get triggered.
@functicons I'd love to get your feedback on this, we might want to create a stronger solution later on but for now savepoints are impossible to use with this operator if they take some time.