Open miretskiy opened 2 years ago
Currently, the job scheduler exposes its transaction (txn) to job schedule implementations. The intent was to let a schedule execute atomically together with the update of its schedule record, but exposing txn directly is dangerous. An implementation may use the transaction to perform wide reads, which can then cause the schedule txn to be retried indefinitely, as shown in https://github.com/cockroachdb/cockroach/issues/78465
In general, the job scheduler should avoid locking records, should avoid the `FOR UPDATE` clause, and should not expose low-level primitives (txn) to job implementations. It should assume that job implementations are buggy, and it must ensure that an error in one job implementation does not destabilize the rest of the system (`SHOW JOBS`, `SHOW SCHEDULES`, etc.).
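To make the proposal concrete, here is a minimal sketch of what an isolated executor boundary could look like. It is not the existing CockroachDB API: `ScheduleInfo`, `Executor`, `ExecutorFunc`, `runIsolated`, `RunAll`, and `demoRun` are all hypothetical names made up for illustration. The sketch assumes the scheduler hands each implementation a read-only snapshot instead of a txn, bounds each run with a deadline, and converts panics into per-schedule errors. The atomicity concern (updating the schedule record together with execution) is left out.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// ScheduleInfo is a hypothetical read-only snapshot of a schedule record.
// Executors receive this instead of the scheduler's transaction.
type ScheduleInfo struct {
	ID    int64
	Label string
}

// Executor is a hypothetical executor interface. Note that no txn is passed in.
type Executor interface {
	Execute(ctx context.Context, s ScheduleInfo) error
}

// ExecutorFunc adapts a plain function to the Executor interface.
type ExecutorFunc func(ctx context.Context, s ScheduleInfo) error

// Execute calls the wrapped function.
func (f ExecutorFunc) Execute(ctx context.Context, s ScheduleInfo) error { return f(ctx, s) }

// runIsolated runs one executor under a deadline and turns a panic into an error.
// If the executor ignores ctx, its goroutine outlives the call; the buffered
// channel only guarantees that the late send does not block.
func runIsolated(ctx context.Context, ex Executor, s ScheduleInfo, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	done := make(chan error, 1)
	go func() {
		defer func() {
			if r := recover(); r != nil {
				done <- fmt.Errorf("schedule %d: executor panicked: %v", s.ID, r)
			}
		}()
		done <- ex.Execute(ctx, s)
	}()

	select {
	case err := <-done:
		return err
	case <-ctx.Done():
		return fmt.Errorf("schedule %d: %w", s.ID, ctx.Err())
	}
}

// RunAll executes every schedule independently and records one result per
// schedule, so a failure in one implementation does not stop the others.
func RunAll(ctx context.Context, execs map[int64]Executor, timeout time.Duration) map[int64]error {
	results := make(map[int64]error, len(execs))
	for id, ex := range execs {
		results[id] = runIsolated(ctx, ex, ScheduleInfo{ID: id}, timeout)
	}
	return results
}

// demoRun exercises RunAll with a healthy, a panicking, and a stuck executor.
func demoRun() map[int64]error {
	execs := map[int64]Executor{
		1: ExecutorFunc(func(ctx context.Context, s ScheduleInfo) error { return nil }),
		2: ExecutorFunc(func(ctx context.Context, s ScheduleInfo) error { panic("buggy executor") }),
		3: ExecutorFunc(func(ctx context.Context, s ScheduleInfo) error {
			<-ctx.Done() // simulates work that never finishes on its own
			return ctx.Err()
		}),
	}
	return RunAll(context.Background(), execs, 50*time.Millisecond)
}
```

In `demoRun`, schedule 1 succeeds while schedules 2 and 3 each report their own error, and neither failure prevents the others from being processed. A real design would still need a way to commit the schedule record update atomically with whatever the executor produces, for example by having the executor return a description of the work and letting the scheduler apply it in its own narrow transaction.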
Jira issue: CRDB-14144