Support for Jetstream in NATS output plugin

neelayu commented 1 year ago

Use Case

NATS Jetstream supports message persistence, but Telegraf currently only supports publishing messages to NATS core. It can read from Jetstream via the nats_consumer input plugin, but publishing to Jetstream is missing.

Expected behavior

Allow Jetstream to be configured in the NATS output plugin so that metrics can be published to it.

Actual behavior

Not supported

Additional info

No response

powersj commented 1 year ago

Thank you for the issue and the PR!

Next steps: review PR

neelayu commented 11 months ago

@powersj I need your input. I was going through the behaviour of NATS and found something interesting. The Jetstream API has recently gone through a lot of changes (it was almost rewritten), and the new API is a lot easier to understand.

To publish any message to NATS, we need a subject and the message itself. This holds for both NATS core and Jetstream. Jetstream essentially maps subjects to a stream (a persistent store). So technically, Telegraf can already post messages to a stream, provided the stream exists and the subject in the config is one of the stream's subjects. The mapping is done on the server side, so clients don't have to worry about it.

We now need to define the behaviour of Telegraf:

If the stream already exists and includes the subject from the Telegraf config, the messages will end up in the stream. In this case we need not do anything. I can close the PR 😄
If the stream exists, but the user wants to publish to a subject that is not part of the stream, Telegraf can add this subject to the stream.
If the stream doesn't exist, the user relies on Telegraf to create it and provides the stream config in the Telegraf TOML. There is a chance that subject (the outputs.nats setting) and Subjects (the Jetstream setting, which is an array) do not match. Telegraf will publish using subject, but the data may never end up in the stream.
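The mismatch in the last scenario can be detected before publishing. The following is a minimal sketch, not plugin code: `subjectMatches` and `coveredBy` are hypothetical helpers, and the matching assumes the usual NATS wildcard rules (`*` matches exactly one token, `>` matches one or more trailing tokens).

```go
package main

import (
	"fmt"
	"strings"
)

// subjectMatches reports whether a concrete subject is covered by a stream
// subject filter. "*" matches exactly one token; ">" must be the last filter
// token and matches one or more remaining tokens. Hypothetical helper.
func subjectMatches(filter, subject string) bool {
	f := strings.Split(filter, ".")
	s := strings.Split(subject, ".")
	for i, tok := range f {
		if tok == ">" {
			return i == len(f)-1 && i < len(s)
		}
		if i >= len(s) {
			return false
		}
		if tok != "*" && tok != s[i] {
			return false
		}
	}
	return len(f) == len(s)
}

// coveredBy reports whether any of the stream's subject filters covers the
// subject Telegraf would publish to.
func coveredBy(streamSubjects []string, publishSubject string) bool {
	for _, filter := range streamSubjects {
		if subjectMatches(filter, publishSubject) {
			return true
		}
	}
	return false
}

func main() {
	// Example values only; they stand in for Subjects and subject in the config.
	streamSubjects := []string{"metrics.>", "events.*"}
	for _, subject := range []string{"metrics.cpu.load", "telegraf"} {
		if coveredBy(streamSubjects, subject) {
			fmt.Printf("%q is captured by the stream\n", subject)
		} else {
			fmt.Printf("warning: %q matches none of %v, messages would not be persisted\n", subject, streamSubjects)
		}
	}
}
```

A check like this could run once at startup and only log a warning, since publishing itself still succeeds over core NATS.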

Essentially the problem is not about publishing, but about configuring the stream, because even without the Jetstream API, a plain NATS publish delivers messages to a stream that is bound to the subject.

Let me know your thoughts.

powersj commented 11 months ago

Hi,

Thanks for the update and scenarios.

> If the stream already exists and has the relevant subject in the telegraf conf, those messages will end up in the stream. In this case we need not do anything. I can close the PR

Excellent, so the remaining concern is really the case where the server is not set up with a stream and/or subject?

> If the stream exists, but user wants to publish to a subject which is not part of the stream, in that case telegraf can add this subject to the stream.

I think Telegraf should create the stream. We could try to do this once per startup: check whether the subject exists and, if not, create it. Is that possible?

> If the stream doesn't exist, user wants to rely on telegraf creating the stream and provides the stream config in telegraf toml.

I think this should also be an option in order for us to claim support for streams as well.

neelayu commented 11 months ago

> I think Telegraf should create the stream. We could try to do this once per start up, where we check if the subject exists and if not create it. Is that possible?

A subject maps to at most one stream (1-1), while a stream can have many subjects (1-n). We can check whether a stream exists for the subject via

func StreamNameBySubject(ctx context.Context, subject string) (string, error)

This returns the name of the stream for the subject if one exists, or an error if no such stream exists or Jetstream is not enabled on the server. That gives us the following scenarios:

If a stream name is returned and no Jetstream config is provided, we can log an info message stating that messages will be published to the stream.
If a stream name is returned and a Jetstream config is provided, we can either ignore the config or use CreateOrUpdateStream to update the stream with the given params. On error, should we exit Telegraf or log a warning? Logging a warning might be better, since the same config file may be used many times to start Telegraf.
If the stream doesn't exist and no Jetstream config is provided, we keep the current behaviour.
If the stream doesn't exist and a Jetstream config is provided, we create the stream using the given config.
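The four scenarios can be written down as a small decision function. This is only a sketch under assumptions: `streamLookup` is a local interface that mirrors the `StreamNameBySubject` signature quoted above, `fakeServer` is an in-memory stand-in so the example runs without a NATS server, and the action strings and the five-second timeout are illustrative choices.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// streamLookup mirrors the StreamNameBySubject signature quoted above.
type streamLookup interface {
	StreamNameBySubject(ctx context.Context, subject string) (string, error)
}

// fakeServer is a hypothetical stand-in: exact subject -> stream name.
type fakeServer map[string]string

func (f fakeServer) StreamNameBySubject(_ context.Context, subject string) (string, error) {
	if name, ok := f[subject]; ok {
		return name, nil
	}
	return "", errors.New("no stream matches subject")
}

// startupAction returns what the plugin would do at startup in each of the
// four scenarios. Any lookup error is treated as "no stream", which also
// covers a server without Jetstream enabled.
func startupAction(l streamLookup, subject string, haveStreamConfig bool) string {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	name, err := l.StreamNameBySubject(ctx, subject)
	found := err == nil
	switch {
	case found && !haveStreamConfig:
		return fmt.Sprintf("log: messages will be stored in stream %q", name)
	case found && haveStreamConfig:
		return fmt.Sprintf("update stream %q from config, warn on failure", name)
	case !found && !haveStreamConfig:
		return "publish over core NATS (current behaviour)"
	default:
		return "create the stream from config"
	}
}

func main() {
	srv := fakeServer{"telegraf.cpu": "TELEGRAF"}
	for _, subject := range []string{"telegraf.cpu", "telegraf.mem"} {
		for _, cfg := range []bool{false, true} {
			fmt.Printf("%s config=%v -> %s\n", subject, cfg, startupAction(srv, subject, cfg))
		}
	}
}
```

In a real plugin the lookup would be backed by the Jetstream client rather than a map, and the second scenario is where the "exit or warn" question would have to be settled.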

That leaves us with the important question: if we are creating the stream, should we keep the Subjects present in the config and append the subject from outputs.nats (since we used it to check for the stream), or simply overwrite the Subjects field with this subject?

PS: Subjects is an array and is optional. If it is not provided, the stream is created with a single subject equal to the stream name, which is mandatory!
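To make the two options concrete, here is a rough sketch of "append" versus "overwrite". Both function names are hypothetical, and the duplicate check is an exact string comparison that deliberately ignores wildcards.

```go
package main

import "fmt"

// appendSubject keeps the configured stream subjects and adds the publish
// subject unless it is already listed (exact match only, wildcards ignored).
func appendSubject(configured []string, publishSubject string) []string {
	out := make([]string, 0, len(configured)+1)
	out = append(out, configured...)
	for _, s := range configured {
		if s == publishSubject {
			return out
		}
	}
	return append(out, publishSubject)
}

// overwriteSubjects discards the configured list and uses only the publish
// subject from outputs.nats.
func overwriteSubjects(publishSubject string) []string {
	return []string{publishSubject}
}

func main() {
	// Example values only.
	configured := []string{"metrics.>", "events"}
	fmt.Println(appendSubject(configured, "telegraf")) // [metrics.> events telegraf]
	fmt.Println(overwriteSubjects("telegraf"))         // [telegraf]
}
```

Appending never drops subjects the user asked for, while overwriting guarantees the stream only captures what Telegraf publishes. Which one is right depends on how explicit the plugin should be.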

powersj commented 11 months ago

Let me turn the tables and ask you, as a user of streams: what would you prefer? A more explicit approach, where what we have in the Telegraf config is what we do, or one where we always try to see what the server has?

The more scenarios you provide, the more concerned I become about us trying to do things for the user, and the more I lean towards an explicit approach where we do whatever the Telegraf config says. If the stream doesn't exist, we create it, and in all other cases we just do what the user told us to do.

That config could include reading a stream value from a tag for example, or be explicitly defined in the config.
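As an illustration of the tag idea, a stream name could be resolved per metric roughly like this. The function name, the `stream` tag key, and the fallback rule are assumptions made for the sketch, not existing plugin options.

```go
package main

import "fmt"

// streamFor picks the stream for one metric: the value of tagKey if the
// metric carries that tag, otherwise the stream explicitly set in the config.
// Hypothetical helper for illustration.
func streamFor(tags map[string]string, tagKey, configured string) string {
	if tagKey != "" {
		if v, ok := tags[tagKey]; ok && v != "" {
			return v
		}
	}
	return configured
}

func main() {
	tagged := map[string]string{"host": "a", "stream": "CPU"}
	untagged := map[string]string{"host": "b"}
	fmt.Println(streamFor(tagged, "stream", "TELEGRAF"))   // CPU
	fmt.Println(streamFor(untagged, "stream", "TELEGRAF")) // TELEGRAF
}
```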

Thoughts?

powersj commented 10 months ago

@neelayu - was there additional work you wanted to do to support Jetstream in the NATS output and cover the additional cases? Or is what is present sufficient?

neelayu commented 10 months ago

Thanks for your response. I think I covered everything in my last commit. Let me know your thoughts. Also apologies for the delay!
