filecoin-project / lotus

Reference implementation of the Filecoin protocol, written in Go
https://lotus.filecoin.io/
Other
2.82k stars 1.25k forks source link

Make `synchronous` SQLite setting configurable #12082

Open masih opened 3 months ago

masih commented 3 months ago

The current SQLite configuration in Lotus explicitly sets the synchronous setting to NORMAL.

https://github.com/filecoin-project/lotus/blob/6bc04c8060d59099f5d4eadf7b1ad716f8799b33/chain/events/filter/index.go#L25

https://github.com/filecoin-project/lotus/blob/395cdd721e7ead60e4b7853b3023f50daeed697f/chain/ethhashlookup/eth_transaction_hash_lookup.go#L18

https://github.com/filecoin-project/lotus/blob/b6a77dfafcf0110e95840fca15a775ed663836d8/chain/index/msgindex.go#L42

While this setting provides a balanced trade-off between data safety and performance, it does not cater to all use cases. Specifically, high-traffic APIs, e.g. ethhashlookup, might require the ability to adjust this setting to optimise performance under heavy load conditions.

Make synchronous setting configurable through the application settings, allowing it to be easily adjusted based on specific requirements. The default setting should remain NORMAL to ensure data integrity for the majority of lightweight use cases.

In a case where high throughput is required (i.e. synchronous is set to off) WAL checkpointing can be used to mitigate the risk of potential data loss at the face of ungraceful shutdown (along with some strategic flushing which would need some code changes).

For now:

BigLep commented 2 months ago

2024-06-18 triage note: @rvagg is this getting handled as part of https://github.com/filecoin-project/lotus/pull/12098 ?

rvagg commented 2 months ago

no, tbh I'm not strongly convinced that this is all that useful

https://sqlite.org/pragma.html#pragma_synchronous

we're using WAL for all 3 databases, and in normal operation, we bypass sync operations on the main database, only synchronizing the WAL

behaviour of "off" with WAL isn't clearly defined by the docs but may mean that WAL checkpoints become unsafe, which just may not be worth it for relatively compact WAL files

masih commented 2 months ago

no, tbh I'm not strongly convinced that this is all that useful

https://sqlite.org/pragma.html#pragma_synchronous

we're using WAL for all 3 databases, and in normal operation, we bypass sync operations on the main database, only synchronizing the WAL

behaviour of "off" with WAL isn't clearly defined by the docs but may mean that WAL checkpoints become unsafe, which just may not be worth it for relatively compact WAL files

That makes sense for default behaviour. Does make sense for every client out there?

My sense would be to parameterise it.

rvagg commented 2 months ago

Maybe, it could become caveat emptor in the soup of config variables that need documenting and explaining. But my point is that I don't see this as a big enough value-add to bump up as a priority task, maybe a P3 until someone comes asking it—or better yet, comes and contributes this..

masih commented 2 months ago

Considering the small size of this task my recommendation would be to run a quick benchmark of eth API throughput with sync off, then prioritise.

Maybe something @snissn can measure?