dolthub / dolt

Dolt – Git for Data
Apache License 2.0

dolt does not reload all statistics on server restart #8345

Open timsehn opened 3 weeks ago

timsehn commented 3 weeks ago

Before server restart:

$ dolt sql
# Welcome to the DoltSQL shell.
# Statements must be terminated with ';'.
# "exit" or "quit" (or Ctrl-D) to exit. "\help" for help.
media_wiki/main*> select count(*) from dolt_statistics;
+----------+
| count(*) |
+----------+
| 24967    |
+----------+
1 row in set (0.01 sec)

After server restart:

media_wiki/main> select count(*) from dolt_statistics;
+----------+
| count(*) |
+----------+
| 1559     |
+----------+
1 row in set (0.00 sec)

I had to kill the server because it seemed to hang, but I had statistics off.

timsehn commented 3 weeks ago

Then once I restart stats collection I get this:

media_wiki/main> call dolt_stats_restart();
+----------------------------------------------+
| message                                      |
+----------------------------------------------+
| restarted stats collection: refs/statistics/ |
+----------------------------------------------+
1 row in set (0.00 sec)

media_wiki/main> select count(*) from dolt_statistics;
+----------+
| count(*) |
+----------+
| 0        |
+----------+
1 row in set (0.00 sec)

max-hoffman commented 3 weeks ago

A race between a concurrent ANALYZE and the background thread update could explain dropped statistics, but there is a lot going on here that makes it difficult to understand. Two things would help: (1) errors in the debug logs on startup, and (2) a zip of the statistics database that fails to load fully. The count going to zero after restarting stats collection doesn't make sense to me yet. A thread can only lock one table at a time, so it should be hard for that race to clear the whole database.
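
To make the reasoning about that race concrete, here is a toy Python model of a lost update scoped to a single table. Everything in it is an assumption made for illustration: the table names, the bucket labels, and the functions `initial_stats`, `total_buckets`, and `racy_update` are hypothetical and are not Dolt's statistics code. It only shows why a race confined to one table at a time would be expected to drop one writer's update for that table, not empty the whole store.

```python
# Toy model only. Table names, bucket labels, and all functions here are
# hypothetical illustrations, not Dolt's actual statistics implementation.


def initial_stats():
    """Return a small in-memory stand-in for per-table statistics buckets."""
    return {
        "page": ["b1", "b2", "b3"],
        "revision": ["b1", "b2"],
        "user": ["b1"],
    }


def total_buckets(store):
    """Count buckets across all tables, loosely like a row count over stats."""
    return sum(len(buckets) for buckets in store.values())


def racy_update(store, table):
    """Replay one bad interleaving of two writers on a single table.

    Both writers read the same snapshot of `table`, then each writes back
    its own modified copy. The second write overwrites the first, so the
    first writer's new bucket is lost. Other tables are never touched.
    """
    analyze_view = list(store[table])      # writer 1 (ANALYZE) reads
    background_view = list(store[table])   # writer 2 (background) reads

    store[table] = analyze_view + ["from_analyze"]        # writer 1 writes
    store[table] = background_view + ["from_background"]  # writer 2 overwrites
    return store


if __name__ == "__main__":
    stats = initial_stats()
    print("buckets before:", total_buckets(stats))
    racy_update(stats, "page")
    print("buckets after:", total_buckets(stats))
    print("page buckets:", stats["page"])
```

In this sketch the loss is bounded: one bucket for one table goes missing, and the other tables keep their buckets. A drop to zero rows would need some other mechanism than this kind of per-table collision.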

max-hoffman commented 3 weeks ago

Another thing -- to keep stats failures from preventing server startup, we log context warnings on error. If stats do not load, SHOW WARNINGS might have clues.