Closed mangalaman93 closed 5 years ago
Hi Aman, I'd like to take this up. I've been thinking of the implementation.
Flushing after a fixed number of enqueues is straightforward and can be done on the enqueue code path itself, by calling flush over all arenas.
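A minimal sketch of that count-based strategy. The `arena` and `countFlushQueue` types and the `flushEvery` field are hypothetical stand-ins for illustration, not bigqueue's actual internals, and the real flush would wrap the msync call:

```go
package main

// arena is a hypothetical stand-in for one memory-mapped region.
type arena struct {
	flushes int // counts simulated flushes
}

// flush would call msync in a real implementation; here it only counts.
func (a *arena) flush() error {
	a.flushes++
	return nil
}

// countFlushQueue (hypothetical) flushes every arena once flushEvery
// enqueues have accumulated since the previous flush.
type countFlushQueue struct {
	arenas     []*arena
	flushEvery int // 0 disables count-based flushing
	pending    int // enqueues since the last flush
}

// Enqueue counts the operation and flushes on the same code path.
// Writing msg into the arenas is elided.
func (q *countFlushQueue) Enqueue(msg []byte) error {
	q.pending++
	if q.flushEvery > 0 && q.pending >= q.flushEvery {
		return q.flushAll()
	}
	return nil
}

// flushAll flushes each arena in turn and resets the counter on success.
func (q *countFlushQueue) flushAll() error {
	for _, a := range q.arenas {
		if err := a.flush(); err != nil {
			return err
		}
	}
	q.pending = 0
	return nil
}
```

Since everything happens inside Enqueue, no extra goroutine or locking is needed for this strategy.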
However, the other flush strategy would involve a concurrent time.Ticker, so I think the concern is whether the sys_msync call is thread safe with respect to the single concurrent writer.
The man page says nothing along those lines. Do you know if it is thread safe?
If it is, we could even start a goroutine per arena and flush them all concurrently.
Thanks
Sounds good. For now, given that bigqueue is not thread safe, it would be okay to use the same Enqueue (maybe Dequeue as well) call to check for the timer completion event.
We might want to keep a dirty flag for each arena to identify which ones to flush. Flushing them concurrently might be fine too, though I wonder whether the cost of creating goroutines would be more than the cost of calling the flush syscall.
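A rough sketch of the per-arena dirty flag. The `dirtyArena` type and `flushDirty` helper are hypothetical names, and the `syscalls` counter only simulates where msync would be called:

```go
package main

// dirtyArena is a hypothetical arena that remembers whether it has been
// written to since its last flush.
type dirtyArena struct {
	dirty    bool
	syscalls int // counts simulated msync calls
}

// write marks the arena dirty; copying the payload is elided.
func (a *dirtyArena) write() {
	a.dirty = true
}

// flush skips the (simulated) system call entirely when the arena is clean.
func (a *dirtyArena) flush() error {
	if !a.dirty {
		return nil
	}
	a.syscalls++ // a real arena would call msync here
	a.dirty = false
	return nil
}

// flushDirty walks all arenas sequentially; only dirty ones do any work.
func flushDirty(arenas []*dirtyArena) error {
	for _, a := range arenas {
		if err := a.flush(); err != nil {
			return err
		}
	}
	return nil
}
```

With this in place, a second flush with no writes in between costs a loop over the arenas and no system calls.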
Nice idea. Although msync would only sync the changes, having an in-app dirty flag would even avoid the system call.
You're right, if the Go scheduler creates an OS thread for each of those blocking flushes then we might want to pool them. I'll write a benchmark and see, but let's keep this issue simple by doing everything on the enqueue/dequeue code path.
I'll get a PR ready.
Another Reference: https://github.com/dgraph-io/badger/issues/526
We should expose the flush function as part of the bigqueue interface. Additionally, we should not rely on the OS's periodic syncing. Instead, we should flush periodically ourselves, either on a timer or after a certain amount of data has changed, with configuration parameters to choose the period.