dao-xyz / peerbit

P2P database framework with encryption, sharding and search
https://peerbit.org
Apache License 2.0

Reevaluate whether Lamport clock is necessary and how to implement physical time #11

Closed by marcus-pousette 1 year ago

marcus-pousette commented 1 year ago

Since the order of entries in the log can be determined by the "nexts", the purpose of the Lamport clock is ambiguous. Evaluate whether it is necessary to include it in the log at all.

Credit to @tabcat and Opal for sharing this idea.

The main issue with removing the clock from Peerbit right now is how to order unrelated documents in a document store when they are not connected to each other in any meaningful way (such as through a shared clock). How do we make sure that new entries are submitted with a truthful timestamp?

tabcat commented 1 year ago

Lamport clocks don't add much value IMO. As you said, given the "nexts" they don't provide any more information. They can also be gamed quite a bit, which, funnily enough, shouldn't actually affect ordering much, since ordering by nexts always comes first. Whether the clocks are used correctly or have been poisoned should make no difference as long as the entries are not completely disjoint, and even then the logical clock can be emulated with the merkle one.

I could see some situations where the logical clock may be more efficient to depend on, but I don't think the trade-off is worth it. It complicates things and opens up room for shenanigans without offering much, and its ordering is less reliable than the merkle clock's.

> The main issue with removing the clock from Peerbit right now is how to order unrelated documents in a document store when they are not connected to each other in any meaningful way (such as through a shared clock). How do we make sure that new entries are submitted with a truthful timestamp?

It sounds like in this situation the database is sharded and updates do not link to updates in other shards (makes sense), but it does use a shared clock. Shards in Peerbit sound dynamic, so deterministic sharding/partitioning of databases doesn't work, which is probably why the ordering is needed cross-shard.

It might be worth investigating randomness beacons like drand, either for referencing directly from shard entries, or for creating a separate beacon log with ordered drand randomness and referencing that from shards. I haven't looked at drand too much, and maybe it's not needed, but it might provide the ordered randomness needed.

Using a shared logical clock for now doesn't seem that bad either.

marcus-pousette commented 1 year ago

I agree with you. Good reasoning.

> It sounds like in this situation the database is sharded and updates do not link to updates in other shards (makes sense), but it does use a shared clock. Shards in Peerbit sound dynamic, so deterministic sharding/partitioning of databases doesn't work, which is probably why the ordering is needed cross-shard.

Yes, exactly. In a sharded database, the clock is even more ambiguous, since the clock is defined (roughly) by the length of the logs, and in a sharded setup a database is represented by multiple logs. If you have multiple shards locally you could sync them yourself, but that does not help with synchronization across peers. (I had an "exchange clock" routine implemented before so you could have synchronization across peers, but it did not play out nicely because it introduces new problems, like what happens if not all peers are connected, or what happens if someone provides the wrong time. So I removed it some time ago.)

> It might be worth investigating randomness beacons like drand for...

Smart! Will check it out.

My gut feeling as of now is to remove it from the log entry and instead have it at the database level. For example, if you have a social media Post database, you would perhaps have a timestamp on each post. Then it would be the access controller's job to approve or reject new posts depending on whether they have reasonable timestamps (based on previously observed posts and the current time), as in the sketch below. This would at least let you order things in time somewhat "okay" for real-world applications. In practice, whether or not a Lamport clock exists, you still need a real-world clock on posts anyway, because a Lamport clock cannot easily be translated into a normal date; hence this kind of timestamp checking has to be implemented no matter what.
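
To make the idea concrete, here is a minimal sketch of such a check; the names and the drift tolerance are hypothetical illustrations, not Peerbit's actual API:

    // Hypothetical access-control check: reject posts whose claimed
    // wall-clock timestamp is implausible relative to the local clock
    // and the newest previously accepted post.
    const MAX_CLOCK_DRIFT_MS = 30_000; // assumed tolerance, tune per app

    interface Post {
        content: string;
        timestamp: number; // claimed creation time, ms since epoch
    }

    function isTimestampReasonable(
        post: Post,
        lastAcceptedTimestamp: number
    ): boolean {
        const now = Date.now();
        // Reject posts claiming to come from the future (beyond drift tolerance)
        if (post.timestamp > now + MAX_CLOCK_DRIFT_MS) return false;
        // Reject posts predating already observed posts by more than the margin
        if (post.timestamp < lastAcceptedTimestamp - MAX_CLOCK_DRIFT_MS) return false;
        return true;
    }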

(On a related note, I previously removed "refs" from the log entry and instead shared "refs" at the "exchange heads" pubsub level, so you are not paying storage for those kinds of links, and the door stays open for more dynamic behaviour and changes. The reduced complexity from that change makes me keen to take on the clock problem as well.)

tabcat commented 1 year ago

> it would be the access controller's job to approve or reject new posts depending on whether they have reasonable timestamps (based on previously observed posts and the current time).

I don't usually like the idea of an update being valid or invalid based on its predecessors, because it is not easy to verify its correctness. It's a similar problem to verifying that the logical clock has been set correctly, which requires traversing all the way to the root, which isn't possible if entries have been lost.

I wonder what the best option is for handling an incorrectly set timestamp and its successors. You can either add the successors until you find the entry with the invalid timestamp and then remove them, or only add entries after verifying the correctness of their clocks.

marcus-pousette commented 1 year ago

I agree with you. You have valid points and have changed my mind. Since yesterday I have been deep-diving into the clock issue and found that a lot of distributed DBs actually use an HLC (hybrid logical clock), which keeps track of both physical and logical time in the same clock instance. The implementation seems to be fairly easy. Here is a good article about the subject.
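
For reference, a minimal HLC sketch (following the design described by Kulkarni et al.); this is purely illustrative and not necessarily the implementation Peerbit ended up with:

    // Minimal hybrid logical clock: a wall-clock component plus a logical
    // counter that breaks ties when physical clocks collide or run behind.
    class HLC {
        wallTime = 0; // max physical time observed, ms since epoch
        logical = 0;  // tie-breaker counter

        // Called for local events and before sending a message.
        now(): { wallTime: number; logical: number } {
            const pt = Date.now();
            if (pt > this.wallTime) {
                this.wallTime = pt;
                this.logical = 0;
            } else {
                this.logical += 1; // physical clock hasn't advanced; bump counter
            }
            return { wallTime: this.wallTime, logical: this.logical };
        }

        // Called when receiving a remote timestamp; merges it into local state.
        update(remote: { wallTime: number; logical: number }): void {
            const pt = Date.now();
            const maxWall = Math.max(pt, this.wallTime, remote.wallTime);
            if (maxWall === this.wallTime && maxWall === remote.wallTime) {
                this.logical = Math.max(this.logical, remote.logical) + 1;
            } else if (maxWall === this.wallTime) {
                this.logical += 1;
            } else if (maxWall === remote.wallTime) {
                this.logical = remote.logical + 1;
            } else {
                this.logical = 0; // local physical time moved past both clocks
            }
            this.wallTime = maxWall;
        }
    }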

I am not sure you have to traverse to the root to verify the clock for every new entry; since the log is append-only, you only have to verify an entry once. So if you have previously verified three entries linked in a chain, 1 <- 2 <- 3, and then entry 4 comes in with a next pointing to entry 3, you only have to verify the clock of entry 4 in relation to 3 (and not 3 against 2 again, since that has already been verified before)?
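
A sketch of that incremental check (the types and names are hypothetical, not Peerbit's API): each new entry is validated only against its already-verified nexts, so earlier links are never re-checked.

    interface Entry {
        hash: string;
        nexts: string[]; // hashes of predecessor entries
        clock: { wallTime: number; logical: number };
    }

    function clockIsAfter(a: Entry["clock"], b: Entry["clock"]): boolean {
        return (
            a.wallTime > b.wallTime ||
            (a.wallTime === b.wallTime && a.logical > b.logical)
        );
    }

    // `verified` holds entries whose clocks have already been accepted.
    function verifyNewEntry(entry: Entry, verified: Map<string, Entry>): boolean {
        return entry.nexts.every((hash) => {
            const next = verified.get(hash);
            // If a predecessor is unknown, we cannot validate yet.
            return next !== undefined && clockIsAfter(entry.clock, next.clock);
        });
    }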

However, there is an additional layer of complexity when you cannot trust peers (which is not present in traditional distributed DBs). And this is mainly a problem for the first entry (I think) of any log, since it is not possible to verify it against previous entries.

Last night I played around with the idea of a simple Peerbit clock program hosted by a cluster of nodes (who are somewhat blindly trusted), much like NTP. The only thing they do is sign clock/time messages as a service, so that entries without any nexts can get a somewhat okay physical time in the HLC. Another solution could perhaps be to have an ACL for the first entry that compares against local time and allows some error margin.
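
As a rough sketch of the clock-service idea (everything here is hypothetical, nothing from Peerbit's codebase): trusted nodes sign (entry hash, observed time) pairs, and a root entry is accepted if its claimed time falls within a margin of such a signed attestation.

    interface TimeAttestation {
        entryHash: string;
        observedTime: number;  // ms since epoch, as seen by the signer
        signer: string;        // public key of the clock-service node
        signature: Uint8Array; // signature over (entryHash, observedTime)
    }

    const TRUSTED_SIGNERS = new Set<string>([/* cluster public keys */]);
    const ERROR_MARGIN_MS = 10_000; // assumed tolerance

    function verifyRootTimestamp(
        claimedTime: number,
        attestation: TimeAttestation,
        verifySignature: (a: TimeAttestation) => boolean
    ): boolean {
        if (!TRUSTED_SIGNERS.has(attestation.signer)) return false;
        if (!verifySignature(attestation)) return false;
        return Math.abs(claimedTime - attestation.observedTime) <= ERROR_MARGIN_MS;
    }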

marcus-pousette commented 1 year ago

Another thing I did not think about before is that the Lamport clock can actually make it easier to understand read/query/search responses when you aggregate results from multiple nodes (Lamport clocks do provide value when you look at both writes and reads, not only writes).

marcus-pousette commented 1 year ago

Physical time is now implemented with a hybrid logical clock on commits, and you can use a ClockService to verify that a root (or any) commit has a timestamp that is accurate within an error threshold.

For the document store, you can now search for documents that have been created or edited within a particular time range:

    it("modified between", async () => {
                let response: Results<Document> = undefined as any;

                const allDocs = writeStore.docs.store.oplog.values;
                await stores[1].docs.index.query(
                    new DocumentQueryRequest({
                        queries: [
                            new ModifiedAtQuery({
                                modified: [
                                    new U64Compare({
                                        compare: Compare.GreaterOrEqual,
                                        value: allDocs[1].metadata.clock
                                            .timestamp.wallTime,
                                    }),
                                    new U64Compare({
                                        compare: Compare.Less,
                                        value: allDocs[2].metadata.clock
                                            .timestamp.wallTime,
                                    }),
                                ],
                            }),
                        ],
                    }),
                    (r: Results<Document>) => {
                        response = r;
                    },
                    { waitForAmount: 1 }
                );
                expect(
                    response.results.map((x) => x.context.head)
                ).toContainAllValues([allDocs[1].hash]);
            });
