nats-io / nats-architecture-and-design


Support for Headers on KeyValue storage #237

Open davidmcote opened 8 months ago

davidmcote commented 8 months ago

What motivated this proposal?

Message headers encoded separately from a binary-blob payload are an extremely handy capability of core NATS and Jetstream.

The lower-level library bindings for Jetstream make headers readily accessible for application use, but the KeyValue abstraction does not expose access to message headers.

Since KV storage is just another stream, there doesn't appear to be a technical limitation to enabling headers. There simply aren't API bindings for it.

What is the proposed change?

  1. In https://github.com/nats-io/nats.java/blob/main/src/main/java/io/nats/client/KeyValue.java#L24, overload create/update/put methods with versions that accept a Headers object.
  2. In https://github.com/nats-io/nats.java/blob/main/src/main/java/io/nats/client/api/KeyValueEntry.java#L28, introduce a Headers data member with accessors. Populate it in the constructor from Message.getHeaders().
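As a rough illustration of the shape of this change, here is a minimal in-memory sketch. It does not use nats.java at all: HeaderBucketSketch and HeaderEntrySketch are hypothetical stand-ins for KeyValue and KeyValueEntry, and a plain Map stands in for the Headers class. The point is only that a header-accepting overload can sit next to the existing call without changing it.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a KV entry that also carries headers. */
final class HeaderEntrySketch {
    final String key;
    final byte[] value;
    final Map<String, List<String>> headers; // empty when none were supplied
    final long revision;

    HeaderEntrySketch(String key, byte[] value,
                      Map<String, List<String>> headers, long revision) {
        this.key = key;
        this.value = value.clone();
        this.headers = Map.copyOf(headers); // immutable snapshot of the caller's map
        this.revision = revision;
    }
}

/** In-memory toy bucket; illustrates the API shape only, not NATS behaviour. */
final class HeaderBucketSketch {
    private final Map<String, HeaderEntrySketch> entries = new HashMap<>();
    private long lastRevision = 0;

    /** Existing-style call: no headers, so current callers are unaffected. */
    long put(String key, byte[] value) {
        return put(key, value, Map.of());
    }

    /** Proposed-style overload: the caller opts in by passing headers. */
    long put(String key, byte[] value, Map<String, List<String>> headers) {
        long revision = ++lastRevision;
        entries.put(key, new HeaderEntrySketch(key, value, headers, revision));
        return revision;
    }

    /** Returns the latest entry for the key, or null if the key is absent. */
    HeaderEntrySketch get(String key) {
        return entries.get(key);
    }
}
```

Callers that never pass headers see an entry whose headers map is empty, which is the backward-compatible behaviour the proposal relies on.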

Who benefits from this change?

Developers are already using Jetstream to store messages with headers. Some would like to do the same with the KeyValue abstraction.

What alternatives have you evaluated?

One may leverage lower-level APIs to store KeyValue entries with headers, as demonstrated below with the natscli. Unfortunately, this pierces the KeyValue abstraction and requires the application developer to munge strings into stream names and subject patterns to access data in the underlying stream.
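The string munging in question can be sketched as follows. KvNames is a hypothetical helper, not part of nats.java; it only reproduces the naming convention visible in the natscli demonstration in this comment (stream "KV_mybucket", subject "$KV.mybucket.biz") and assumes that convention holds in general.

```java
/**
 * Hypothetical helper (not part of nats.java): derives the names an
 * application would need in order to bypass the KeyValue abstraction
 * and talk to the underlying stream directly.
 */
final class KvNames {
    private KvNames() {}

    /** Stream backing a bucket, e.g. "mybucket" -> "KV_mybucket". */
    static String streamName(String bucket) {
        return "KV_" + requireNonEmpty(bucket, "bucket");
    }

    /** Subject for one key, e.g. ("mybucket", "biz") -> "$KV.mybucket.biz". */
    static String subject(String bucket, String key) {
        return "$KV." + requireNonEmpty(bucket, "bucket")
                + "." + requireNonEmpty(key, "key");
    }

    // Simplified check; real bucket and key names have stricter rules.
    private static String requireNonEmpty(String s, String what) {
        if (s == null || s.isEmpty()) {
            throw new IllegalArgumentException(what + " must not be empty");
        }
        return s;
    }
}
```

Every application that wants headers today would carry some version of this helper, which is the coupling to stream internals that the proposal aims to avoid.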

Demonstration with natscli:

Create a bucket and store KVs, one with a custom header:

$ nats kv add mybucket

$ nats kv put mybucket foo bar
bar

$ nats req '$KV.mybucket.biz' -H myheaderkey:myheadervalue baz
10:13:40 Sending request on "$KV.mybucket.biz"
10:13:40 Received with rtt 432.206µs
{"stream":"KV_mybucket", "seq":2}

Inspect contents of KV with KV CLI. Headers omitted.

$ nats kv ls mybucket
foo
biz

$ nats kv get mybucket foo
mybucket > foo created @ 24 Aug 23 14:11 UTC

bar

$ nats kv get mybucket biz
mybucket > biz created @ 24 Aug 23 14:13 UTC

baz

Inspect contents of underlying stream to reveal headers.

$ nats stream view KV_mybucket
[1] Subject: $KV.mybucket.foo Received: 2023-08-24T10:11:59-04:00

bar

[2] Subject: $KV.mybucket.biz Received: 2023-08-24T10:13:40-04:00

  myheaderkey: myheadervalue

baz

$ nats stream get  KV_mybucket -S '$KV.mybucket.foo'
Item: KV_mybucket#1 received 2023-08-24 14:11:59.009016 +0000 UTC on Subject $KV.mybucket.foo

bar

$ nats stream get  KV_mybucket -S '$KV.mybucket.biz'
Item: KV_mybucket#2 received 2023-08-24 14:13:40.951116 +0000 UTC on Subject $KV.mybucket.biz

Headers:
  myheaderkey: myheadervalue

baz

scottf commented 8 months ago

@davidmcote This is an architectural issue spanning the server and all clients. I am moving this request to the appropriate repo.

ripienaar commented 8 months ago

You are saying that it would benefit - but do not describe exactly what the benefit is.

A KV is exactly a 2 part item - key and value. It's not 3 part - KVH. We limit the functionality to match other KVs to keep things simple, understandable and to keep avenues open for replacing the backend later with another.

So for headers, people usually store a serialized hash/map/struct in which one of the keys holds the metadata.

If you can expand on why that is not a suitable choice for you, we can give it better consideration.

davidmcote commented 8 months ago

Thanks for the reply!

The proposal of this ticket is a backward compatible change to the KeyValue API which would allow applications to opt-in to using Headers. Existing application code would continue to use KV in its current form.

An application developer choosing to depend on the new capability would certainly have a harder time migrating to another backend, but whether to limit their use of this feature does seem like it would be their decision to make.


In my specific use case, I am using a serialization library to store and load structures as values in a key value store. My structures happen to be polymorphic.

While serialization of a polymorphic type is no problem, deserialization generally requires type information as input to map back to a class. My serialization library does not encode type information inline and I have limited control over the serialization library. This motivates supplementing the byte[] payload value somehow to communicate a runtime type/class to decode into.

In lieu of this proposal, I'm constrained to either:

  1. Switch to a serialization format that encodes and decodes type information inline.
  2. Develop an envelope structure to encapsulate the real serialized value alongside type information.
  3. Communicate type information in Headers, but forgo the KeyValue abstraction and re-implement much of the desired KeyValue behavior on top of regular streams.

Only the solution with Headers appears to avoid changing the encoding scheme of byte[] values.

ripienaar commented 8 months ago

The concern about backends is our ability to move to other backends without user-impacting changes.

To my mind, choice 2 seems most compatible with a KV store. Or perhaps prepend or append a type hint to the serialized data, without the overhead of a wrapping structure and multiple unmarshals.
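A minimal sketch of the type-hint idea, under assumptions that are not from this thread: TypeHintCodec is a hypothetical name, and the layout (one length byte, then the hint as UTF-8, then the untouched serialized payload) is just one possible choice.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical codec: stored value = [hint length][hint UTF-8][payload]. */
final class TypeHintCodec {
    private TypeHintCodec() {}

    /** Prepends the type hint to an already-serialized payload. */
    static byte[] wrap(String typeHint, byte[] payload) {
        byte[] hint = typeHint.getBytes(StandardCharsets.UTF_8);
        if (hint.length > 255) {
            throw new IllegalArgumentException("type hint longer than 255 bytes");
        }
        return ByteBuffer.allocate(1 + hint.length + payload.length)
                .put((byte) hint.length)
                .put(hint)
                .put(payload)
                .array();
    }

    /** Reads the type hint back without touching the payload. */
    static String typeHint(byte[] stored) {
        int len = hintLength(stored);
        return new String(stored, 1, len, StandardCharsets.UTF_8);
    }

    /** Returns the original serialized payload, ready for the deserializer. */
    static byte[] payload(byte[] stored) {
        int len = hintLength(stored);
        return Arrays.copyOfRange(stored, 1 + len, stored.length);
    }

    // Validates the length byte against the actual array size.
    private static int hintLength(byte[] stored) {
        if (stored.length == 0) {
            throw new IllegalArgumentException("empty stored value");
        }
        int len = stored[0] & 0xFF; // read the length byte as unsigned
        if (stored.length < 1 + len) {
            throw new IllegalArgumentException("truncated type hint");
        }
        return len;
    }
}
```

A reader would call typeHint first to pick the target class and then hand payload to the serialization library. Note that this still changes the bytes stored in the bucket, so other consumers of the raw value would need to know about the prefix.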

The KV implementation is popular specifically because it removes features and complexity from streams. I'm really reluctant to let feature creep turn it into a stream.

davidmcote commented 8 months ago

Thank you. I appreciate you taking the time to consider this proposal.