nats-io / nats-server

High-Performance server for NATS.io, the cloud and edge native messaging system.
https://nats.io
Apache License 2.0

PublishAck in JetStream should include timestamp #3890

Open kakserpom opened 1 year ago

kakserpom commented 1 year ago

Feature Request

PublishAck should include timestampNano to go along with a sequence number.

Use Case:

I am using JetStream as a sophisticated commit log and I rely on timestamps of JetStream messages. Currently I have to call getMessage() to fetch a newly published message in order to retrieve its timestamp, which is not ideal.

Thank you 🙏🏻

derekcollison commented 1 year ago

What do you use the timestamp for?

kakserpom commented 1 year ago

@derekcollison In my stateful ordered consumer I save the timestamp in the processed_at field of an order (processed_at = the time when an order was committed for the first time).

A bit of context: I have a stream of incoming orders which is consumed by a stateful consumer. Each order gets ingested and can then mutate any number of times. The consumer reads a batch of orders, processes them and publishes a single commit message. The consumer of commit messages fills processed_at (if it's zero) with message.timestampNanos before sending it to an OLAP database (an INSERT-only replacing storage, versioned by timestampNanos).

Whenever an order changes after the initial processing, a new order snapshot gets published as part of another commit message. processed_at then has to be set to the time of the initial commit, otherwise it would get updated every time. I've deliberately made the stateful order consumer send the entire snapshot, because then the commit log consumer can just send data to OLAP rapidly without reading anything. I hope this makes sense.
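The "fill processed_at only if it's zero" rule described above can be sketched roughly as follows. This is a minimal illustration, not the poster's actual code: `OrderSnapshot`, `fill_processed_at`, and the field layout are hypothetical names, and the timestamps are made-up nanosecond values standing in for message.timestampNanos.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class OrderSnapshot:
    """Hypothetical order snapshot carried inside a commit message."""
    order_id: str
    status: str
    processed_at: int = 0  # nanoseconds; 0 means "not committed yet"


def fill_processed_at(snapshot: OrderSnapshot, timestamp_nanos: int) -> OrderSnapshot:
    """Set processed_at from the commit message timestamp only if it is still zero."""
    if snapshot.processed_at == 0:
        return replace(snapshot, processed_at=timestamp_nanos)
    return snapshot


# First commit: processed_at is zero, so it takes the message timestamp.
first = fill_processed_at(OrderSnapshot("o-1", "new"), 1_700_000_000_000_000_000)

# A later snapshot of the same order carries the original value forward,
# so the newer message timestamp does not overwrite it.
later = fill_processed_at(
    replace(first, status="shipped"), 1_700_000_005_000_000_000
)
```

Under these assumptions, the later snapshot keeps the timestamp of the initial commit even though it arrives in a newer commit message.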

derekcollison commented 1 year ago

Could you just use the sequence number assigned to the first message for an order, and use that instead? Same logic as above but with stream sequence number for the first message of an order.

bruth commented 1 year ago

@kakserpom As @derekcollison suggested, the sequence number would be a more standard indicator of the version of a state change. However, is there a need to use the message timestamp as a rough correlation for other activities that rely on that state?

john-bagatta commented 1 month ago

Our use case would benefit from this as well - we use JetStream as a first-tier persistence layer for client-created data, and our clients use the timestamp both as stored informational data and as a query parameter to catch up on data they missed while offline. Sequence ID would work for the query, of course, but since the data model stores the timestamp regardless, we'd strongly prefer to keep the data lean and not persist a second parameter just for that.

Right now as a workaround, publishing clients subscribe to the published subjects and ack the data to the client only when receiving the message in the subscription (at which point it does have the timestamp), but we could disable that (and set noEcho) if we could pull the time off the pub ack.

jnmoyne commented 1 month ago

One issue with timestamps that you don't have with sequence numbers is that timestamps are not guaranteed to increase monotonically, whereas sequence numbers are.

Timestamps "going back in time" could happen if the stream leadership moves from one server to another, the two servers' clocks are not closely synchronized, and the new leader's clock is a bit behind that of the old one.
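As a rough illustration of that scenario, the sketch below scans messages in sequence order and reports where a timestamp is lower than the one before it. `StoredMsg` and `find_timestamp_regressions` are hypothetical names, and the values are made up to mimic a leader change to a server with a slightly slower clock; this is not a NATS API.

```python
from typing import List, NamedTuple


class StoredMsg(NamedTuple):
    """Hypothetical (sequence, timestamp) pair for a stored stream message."""
    seq: int
    timestamp_nanos: int


def find_timestamp_regressions(msgs: List[StoredMsg]) -> List[int]:
    """Return sequence numbers whose timestamp is lower than the previous message's."""
    ordered = sorted(msgs, key=lambda m: m.seq)
    return [
        cur.seq
        for prev, cur in zip(ordered, ordered[1:])
        if cur.timestamp_nanos < prev.timestamp_nanos
    ]


# Made-up values: seq 3 is stamped 2 ms earlier than seq 2, as could happen
# if a new leader's clock is slightly behind the old leader's.
msgs = [
    StoredMsg(1, 1_000_000_000),
    StoredMsg(2, 1_005_000_000),
    StoredMsg(3, 1_003_000_000),
    StoredMsg(4, 1_006_000_000),
]
regressions = find_timestamp_regressions(msgs)
```

Sequence numbers stay strictly increasing in this example, while the timestamp at sequence 3 steps backwards.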

kakserpom commented 1 month ago

@jnmoyne That's expected. Not a problem.