Granola-Team / mina-indexer

The Mina Indexer is a re-imagined version of the software collectively called the "Mina archive node."
Apache License 2.0
16 stars 10 forks source link

Problem: unnecessary log recovery takes a long time #1244

Closed robinbb closed 2 days ago

robinbb commented 4 days ago

When many database .log files exist at the time of the shutdown, then the subsequent invocation of the indexer must recover the logs, resulting in a long startup time. The database LOG file fills with thousands of entries like so:

2024/07/03-14:44:04.550357 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #903 mode 2
2024/07/03-14:44:05.841128 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #905 mode 2
2024/07/03-14:44:06.430258 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #910 mode 2
2024/07/03-14:44:07.087813 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #912 mode 2
2024/07/03-14:44:08.302963 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #914 mode 2
2024/07/03-14:44:08.485113 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #916 mode 2
2024/07/03-14:44:08.526312 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #918 mode 2
2024/07/03-14:44:09.888130 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #920 mode 2
2024/07/03-14:44:10.545513 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #925 mode 2
2024/07/03-14:44:11.086972 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #927 mode 2
2024/07/03-14:44:11.098326 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #928 mode 2
2024/07/03-14:44:12.169030 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #931 mode 2
2024/07/03-14:44:12.711491 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #933 mode 2
2024/07/03-14:44:12.721208 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #934 mode 2
2024/07/03-14:44:12.806902 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #937 mode 2
2024/07/03-14:44:14.646108 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #942 mode 2
2024/07/03-14:44:14.940052 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #944 mode 2
2024/07/03-14:44:15.594004 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #950 mode 2
2024/07/03-14:44:15.839295 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #952 mode 2
2024/07/03-14:44:17.129986 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #956 mode 2
2024/07/03-14:44:17.140717 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #957 mode 2
2024/07/03-14:44:18.504714 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #960 mode 2
2024/07/03-14:44:19.026483 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #962 mode 2
2024/07/03-14:44:19.338331 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #964 mode 2
2024/07/03-14:44:20.406461 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #966 mode 2
2024/07/03-14:44:21.297247 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #968 mode 2
2024/07/03-14:44:21.384424 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #971 mode 2
2024/07/03-14:44:21.485698 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #974 mode 2
2024/07/03-14:44:23.730757 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #977 mode 2
2024/07/03-14:44:24.079230 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #981 mode 2
2024/07/03-14:44:24.274005 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #985 mode 2
2024/07/03-14:44:24.419353 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #988 mode 2
2024/07/03-14:44:25.184003 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #991 mode 2
2024/07/03-14:44:25.700564 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #993 mode 2
2024/07/03-14:44:25.876668 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #995 mode 2
2024/07/03-14:44:26.451839 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #997 mode 2
2024/07/03-14:44:26.453731 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #998 mode 2
2024/07/03-14:44:27.051074 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #1001 mode 2
2024/07/03-14:44:27.799900 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #1003 mode 2
2024/07/03-14:44:29.439441 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #1005 mode 2
2024/07/03-14:44:29.636125 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #1007 mode 2
2024/07/03-14:44:29.831803 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #1011 mode 2
2024/07/03-14:44:29.956207 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #1013 mode 2
2024/07/03-14:44:32.045305 1735278 [db/db_impl/db_impl_open.cc:1041] Recovering log #1017 mode 2
robinbb commented 4 days ago

Probably related to lack of database flush. See #1243

robinbb commented 4 days ago
2024-07-03T14:40:51.431041+00:00 - INFO Using default mainnet genesis ledger
2024-07-03T14:40:51.433502+00:00 - DEBUG Creating a new IndexerStore in /mnt/mina-indexer-prod/db/0.7.0-340000
2024-07-03T15:24:46.008149+00:00 - DEBUG INDEXER_VERSION file exists. Checking for compatability
2024-07-03T15:24:46.010012+00:00 - INFO Starting the web server listening on 0.0.0.0:8080
2024-07-03T15:24:46.010107+00:00 - INFO Starting mina-indexer server

The above shows the duration of the log recovery between lines 3 and 4 - it is 44 minutes.

robinbb commented 2 days ago

This is solved sufficiently well that it should be de-prioritized. (The snapshot-after-ingestion fix helped.)

robinbb commented 2 days ago

Actually, I think it better to actually close this issue, and simply re-open if the problem re-emerges.