cockroachdb / cockroach


stats: previously empty tables don't report job progress for stats collection #50012

Open nvanbenschoten opened 4 years ago

nvanbenschoten commented 4 years ago

The sampleAggregator does not report a fraction completed if the number of rows it expects to find is 0, based on its previous set of table statistics.

See

https://github.com/cockroachdb/cockroach/blob/565ffce1fa582164ac99331691f3d3c80b15c918/pkg/sql/execinfrapb/processors_table_stats.proto#L141-L144

and

https://github.com/cockroachdb/cockroach/blob/565ffce1fa582164ac99331691f3d3c80b15c918/pkg/sql/rowexec/sample_aggregator.go#L216-L224

I just ran into this and was very confused, and I expect users will be too. It was confusing because I was running stats immediately after a large IMPORT INTO, which seems like a common workflow. As a result, every table except the largest already had a set of non-zero table statistics; the largest table, which took the longest to import, did not. So stats creation completed quickly on every table except the last one, which took the longest because it was the largest, and it never reported any progress. For a while, I thought the job was stuck. I spent some time digging through stack traces looking for where it was stuck before I realized what was going on.

Here's how the jobs page looked about an hour in, after all but the last two tables had completed.

[Screenshot: jobs page, 2020-06-09 at 10:09:20 AM]

Eventually, the stats completed, but without ever giving me progress, which I guess is expected based on the code.

We should be able to do something better here. Can we mark the progress as indeterminate instead of leaving it at zero, or display something like "unknown remaining"? If I saw this as a customer, I would have assumed it was a bug and filed a support issue.

cc. @rytaft @awoods187

Jira issue: CRDB-4163

rytaft commented 4 years ago

I agree that this isn't ideal. The jobs infrastructure currently doesn't support reporting an unknown fraction progressed. @spaskob do you think this is something we could support? Do you know how other types of jobs handle this when the fraction progressed is unknown?

spaskob commented 4 years ago

I actually have not worked on the fraction progressed reporting. It may be better to direct this question to the bulk-io folks, who probably have similar problems with reliably reporting progress on import and backup jobs. Please talk to @dt and/or @pbardea.

github-actions[bot] commented 1 year ago

We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!