Open erikgrinaker opened 2 years ago
@rafiss this is another opportunity to potentially leverage TTL code.
Looks like we already have 30 days of retention, which should be plenty:
The problem is that the debug.zip tends to time out when collecting system.rangelog.txt, and the rangelog.json file is limited to 1000 events:
I think the simplest solution might be to extend the timeout for system.rangelog.txt such that we can dump the entire thing. rangelog.json will be more work in that we'd have to add pagination and such. If size becomes an issue, exporting the last 7 days (rather than 30 days) would likely be sufficient.
In support escalations, I usually find that the rangelog contained in the debug.zip has already rotated out the interesting bits. We should reconsider the default retention policy here, since this can be very useful for debugging.
Jira issue: CRDB-20495
Epic CRDB-32134