onemoredata / bagger

Massive log storage in PostgreSQL
BSD 2-Clause "Simplified" License

Consider writing our own etcd driver #93

Open einhverfr opened 11 months ago

einhverfr commented 11 months ago

Net::Etcd supports v3 of the interface, but does so in a way which makes it impossible to send data from inside an AnyEvent callback. The Etcd module does not support watches. The problems on both sides are baked into their architecture, and I don't see either side fixing them.

As a result, there are no drivers for etcd which work well in an AnyEvent setup. One is probably not too hard to write based on existing modules out there.

Right now there are a few things I am not sure about in the current implementation, and a fully asynchronous etcd driver would make this easier to manage.

einhverfr commented 11 months ago

I am having some second thoughts here which I think might necessitate a different longer-term solution.

What I am worried about is that if everything is asynchronous, then out-of-order execution becomes at least theoretically possible. If one etcd message fails to go through for some reason, then rather than retrying, we might end up processing subsequent messages, and this might lead to lost data.

While in theory it is possible that a newer version of a row could get overwritten by an older one (also something we need to avoid), it is much more likely that a guard for a newer row could be destroyed before an older one fails, causing the LSN to skip forward over a missing row.
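
To make the failure mode concrete, here is a minimal sketch in Python (not the Perl/AnyEvent code in bagger). It assumes, for illustration only, that releasing a guard is what advances the confirmed LSN; the `Tracker` and `Guard` classes and the LSN values are hypothetical.

```python
class Tracker:
    """Hypothetical holder for the LSN we would report as safely stored."""

    def __init__(self):
        self.confirmed_lsn = 0


class Guard:
    """Hypothetical guard: releasing it advances the confirmed LSN."""

    def __init__(self, tracker, lsn):
        self.tracker = tracker
        self.lsn = lsn

    def release(self):
        # Assumption: the LSN only ever moves forward.
        self.tracker.confirmed_lsn = max(self.tracker.confirmed_lsn, self.lsn)


def simulate_out_of_order():
    tracker = Tracker()
    older = Guard(tracker, lsn=100)
    newer = Guard(tracker, lsn=200)

    # The newer write completes first, so its guard is released first.
    newer.release()

    # The older write then fails, so its guard is never released.
    older_failed = True

    # The confirmed LSN is already past the row that was never stored.
    return tracker.confirmed_lsn, older_failed
```

In this sketch `simulate_out_of_order()` returns a confirmed LSN of 200 even though the row at 100 was never written, which is the skip described above.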

What I would prefer to do instead is to have a loop-and-queue system where both sides are handled via callbacks and the guard is passed back and forth (possibly as part of an object). This would allow a certain number of retries, and a repeated failure could be handled, possibly by stopping replication and not advancing the LSN pointer.
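
A rough sketch of the loop-and-queue idea, written in Python for brevity and not in Perl/AnyEvent. Everything here is an assumption for illustration: the `OrderedSender` and `Item` names, the `send` callable standing in for an etcd write, and the retry limit. It is also synchronous; a real version would be driven by event-loop callbacks.

```python
from collections import deque


class Item:
    """One queued message, carrying the LSN its guard would protect."""

    def __init__(self, lsn, payload):
        self.lsn = lsn
        self.payload = payload
        self.attempts = 0


class OrderedSender:
    """Sends queued items strictly in order, retrying the head on failure."""

    def __init__(self, send, max_retries=3):
        self.send = send  # hypothetical callable(payload) -> bool
        self.max_retries = max_retries
        self.queue = deque()
        self.confirmed_lsn = 0
        self.stopped = False

    def enqueue(self, lsn, payload):
        self.queue.append(Item(lsn, payload))

    def run(self):
        while self.queue and not self.stopped:
            item = self.queue[0]  # never look past the head of the queue
            item.attempts += 1
            if self.send(item.payload):
                self.queue.popleft()
                self.confirmed_lsn = item.lsn  # advance only after success
            elif item.attempts > self.max_retries:
                self.stopped = True  # stop replication; LSN stays put
        return self.confirmed_lsn


def demo(fail_counts):
    """fail_counts maps a payload to how many times its send fails."""
    remaining = dict(fail_counts)

    def flaky_send(payload):
        if remaining.get(payload, 0) > 0:
            remaining[payload] -= 1
            return False
        return True

    sender = OrderedSender(flaky_send, max_retries=3)
    for lsn, payload in [(100, "a"), (200, "b"), (300, "c")]:
        sender.enqueue(lsn, payload)
    sender.run()
    return sender.confirmed_lsn, sender.stopped, len(sender.queue)
```

Because only the head of the queue is ever sent, a later message cannot complete ahead of an earlier one, and the LSN cannot pass a row that has not been stored. The cost is throughput, since nothing is sent concurrently.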