changefeedccl: feeds with many ranges can't recover from lag

As observed in the drt-scale test, changefeeds that watch many ranges are unable to catch up from lag and fall further and further behind. This is largely due to the catchup scan semaphore, which limits concurrent catchup scans to protect crdb from their cost.

Some of the feed's ranges will catch up, but most will be blocked on that semaphore. Rangefeed restarts due to transient issues or a slow consumer are likely, and since we don't emit any checkpoints during catchup, the progress made so far is lost on restart.

This makes changefeeds incredibly unstable: once they restart due to a transient error, they cannot recover.

Related issues/PRs:

per-consumer catchup: https://github.com/cockroachdb/cockroach/pull/133789 - this change makes feeds catch up one at a time, which may improve matters somewhat
use pebble snapshots instead of iterators for catchup scans: https://github.com/cockroachdb/cockroach/issues/133851 - this will make catchup scans cheaper so we can do more of them
other miscellaneous scan perf issues tracker: https://github.com/cockroachdb/cockroach/issues/133815

Jira issue: CRDB-44440

changefeedccl: feeds with many ranges can't recover from lag #135294