orbitdb-archive / orbit

A distributed, serverless, peer-to-peer chat application on IPFS
MIT License
1.64k stars 116 forks source link

Timestamp issues #21

Open haadcode opened 8 years ago

haadcode commented 8 years ago

Sometimes users' timestamps differ from each other causing messages to "pop in" to the message history instead of shown at the bottom (latest).

The timestamp can differ only a few seconds or it can differ a few minutes based on users' settings.

The timestamp is generated in Post (ipfs-post) using new Date().getTime().

Before fixing this, I'd like to understand why the discrepancy of a few second to a few minutes. How is that possible? What solutions are there to make all the timestamps the same?

claus commented 8 years ago

This is an interesting problem. In traditional centralized applications, timestamping happens on the server, ensuring that timestamps are consistent. In decentralized applications, there is no central authority that handles those things, timestamping happens on the client, so a timestamp is set to whatever the client's clock is set to, which could be anything.

Maybe request timestamps from connected peers and apply some statistics foo to eliminate outliers? Or something like that?

fiatjaf commented 7 years ago

Easy? This is the hardest problem of the world.

fiatjaf commented 7 years ago

I know zero about orbit architecture, but since this issue needs some action, let me at least make a silly suggestion: why not add the IPFS address of the "current last message" to the message we're about to post? This way the order problem is solved -- no more messages "popping" -- and, although their timestamp could be all messed up, it is possible to be absolutely sure that some message was sent after some other.

deavmi commented 7 years ago

Eliminating outliers might be hard. One peer could still set the timestamp way off. So idk. Try the best solution to what we can come up with here.

claus commented 7 years ago

Time, Clocks, and the Ordering of Events in a Distributed System http://amturing.acm.org/p558-lamport.pdf

Just leaving this here, he lost me somewhere on page 3 :) But maybe it helps? Also, i have to agree with @fiatjaf on the "Easy" label..

haadcode commented 7 years ago

Thanks everyone for the comments. Let me clarify what's going on here:

I opened this issue originally due to the fact that the chat messages in the UI popped in in "weird" order, ie. not according to their ts field in the message. While this is not desired behaviour in the UI, it's actually exactly what should happen given the underlying data structures. ipfs-log is the lowest level data structure and it's what gives us a partially ordered log of database operations that in turn are used by orbit-db. It works essentially how @fiatjaf describes above (link to previous messages). The down side is, the operations (and in turn messages) are not absolutely ordered, causing the "weird" popping of messages.

You're all right to say that this is (one of) the hardest problems in distributed systems and solutions for this vary but one of the canonical solutions is to use Lamport timestamps (or derivatives, like vector clocks). Due to the nature of Merkle Trees, ipfs-log actually replaces Lamport timestamps with Merkle Links which, at the end of the day, gives us the same result: a partial order.

I ended up fixing this on the UI side by just ordering the messages by the ts field in the messages which is a simple fix and comes with the down side that it's dependant on the (local) time of each peer. So if my clock is two seconds behind yours and you send a message right before mine, mine will still appear before yours. This is the "weird" behaviour I refer to and that still exists. Under the hood the message order for both of us is the same (but may not be the actual, chronological order).

As far as I know, there's no solution to have a total order without either a central authority (ie. a server) or a consensus algorithm. We don't want to do the first and for the second there's no satisfying solution (yet, but there's been discussion to implement a consensus algorithm one way or the other as part of IPFS stack, eg. the Stellar protocol).

So, while it's working as expected at the moment, it's not an ideal solution and there's future work to be done to get it right(tm). Then again, perhaps what we have atm is enough and we can live with it.

Will remove the "Easy" label to reflect the extent of the problem. Hope this answers your questions and clarifies the issue. If you have more information or experience on this topic, I would be happy to hear your thoughts and ideas and perhaps you can participate in the discussion and implementation of consensus algorithm in the aforementioned issue.

rsynnest commented 7 years ago

A consensus algorithm would be an interesting solution, that is essentially what NTP uses to approximate time. NTP can also be configured for peer-to-peer syncing, so if the goal is to create a consensus on "what is the current date/time?" I think NTP is a good starting point.

In NTP, each additional "Stratum" (degree of separation from True UTC) adds some inaccuracy, and an IPFS implementation could be considered a software stratum, so I think there will be some loss of accuracy if everything is peer-to-peer. But it might be negligible, and I imagine something similar to NTP could be implemented using existing libp2p tools.

A simple example would be something like https://github.com/enmasseio/timesync, which uses a predefined list of peers and synchronizes time across nodes via WebRTC or WebSockets (The author lays out his algorithm of choice at the bottom of the README). But instead of using a predefined list, you could dynamically query the n closest IPFS peers for their timestamp and sync up, or if you could have all clients in a chat room be "time-peers".

I'm not sure if this should all be built into Orbit or if it would make more sense as its own separate library or libp2p module (Distributed NTP?). If it's implemented well enough it could be a great tool for other IPFS applications to use (ie: logging, time sensitive actions, timestamp metadata on files when they get added to the swarm, etc.).

I might be overthinking this since all you need in this case is a total order for messages, but I figure time consensus is worth discussing and would solve this problem and likely others.

Here are some NTP algorithm details for reference: https://www.eecis.udel.edu/~mills/ntp/html/warp.html#arch http://www.wikiwand.com/en/Network_Time_Protocol

@deavmi Outliers can be excluded in many ways. The timesync package I reference above for example excludes anything above 1 standard deviation from the median (not the mean, which would be affected more strongly by outliers).

whyrusleeping commented 7 years ago

Messages shouldnt need an absolute timestamp, all we really need is a relative ordering and an approximate time (which we're okay with having be slightly inaccurate). The most important thing here is that messages show up in the correct order, for example, my laptops clock is frequently messed up due to me killing the battery, bad ntp responses, and other such funthings. As a result, when i chat with someone, all my messages appear at the bottom, while whoever i'm chatting with appears somewhere a few minutes back in the log.

Using vector clocks, if i send a message with lamport time 2, and then someone responds, their message will have lamport time of at least 3, meaning my client will be able to accurately place that message after my message in the UI. Having this ordering is far more important to the UX than having the 'wall clock' time being accurate.