Proposal: Support Deletion of a Channel

kozlovic commented 7 years ago

Right now, it is impossible to delete a channel.

For FileStore, one can stop the server, delete the subdirectory corresponding to the channel, then restart the server. But this causes downtime.

A proposal would be to allow deletion of a channel with the following restrictions:

The deletion of the channel would be done through the monitoring endpoint, for example https://localhost:8222/channelsz?channel=foo&delete=1
It would be restricted to the HTTPS port.
The request would be rejected by the server if there are existing subscriptions attached to the channel.
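
To make the three restrictions concrete, here is a rough Go sketch. It is not the server's actual code: `channelInfo`, `registry`, `deleteStatus`, and `handleChannelsz` are hypothetical names, and the status codes (403, 404, 409) are only one plausible choice.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

// channelInfo is a hypothetical, simplified view of a channel.
type channelInfo struct {
	subscriptions int
}

// registry is a hypothetical stand-in for the server's channel store.
type registry struct {
	mu       sync.Mutex
	channels map[string]*channelInfo
}

// deleteStatus applies the proposed restrictions and returns the HTTP
// status code the server could answer with (the codes are assumptions).
func deleteStatus(overTLS bool, ch *channelInfo) int {
	switch {
	case !overTLS:
		return http.StatusForbidden // restricted to the HTTPS port
	case ch == nil:
		return http.StatusNotFound // unknown channel
	case ch.subscriptions > 0:
		return http.StatusConflict // subscriptions are still attached
	default:
		return http.StatusOK
	}
}

// handleChannelsz handles /channelsz?channel=foo&delete=1 only.
func (reg *registry) handleChannelsz(w http.ResponseWriter, r *http.Request) {
	q := r.URL.Query()
	if q.Get("delete") != "1" {
		http.Error(w, "only the delete=1 form is sketched here", http.StatusBadRequest)
		return
	}
	name := q.Get("channel")
	reg.mu.Lock()
	defer reg.mu.Unlock()
	status := deleteStatus(r.TLS != nil, reg.channels[name])
	if status == http.StatusOK {
		delete(reg.channels, name)
	}
	w.WriteHeader(status)
}

func main() {
	fmt.Println(deleteStatus(false, &channelInfo{}))                // 403
	fmt.Println(deleteStatus(true, nil))                            // 404
	fmt.Println(deleteStatus(true, &channelInfo{subscriptions: 2})) // 409
	fmt.Println(deleteStatus(true, &channelInfo{}))                 // 200
}
```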

dahankzter commented 7 years ago

Perhaps it can be GC'd automatically? What purpose does an unused channel serve?

kozlovic commented 7 years ago

What is an unused channel? Are you implying that it is only when a MaxAge is set and all messages have expired?

kozlovic commented 7 years ago

... and there are no subscriptions attached

deem0n commented 7 years ago

I would suggest that requesting https://localhost:8222/channelsz?channel=foo&delete=1 will:

Delete the channel immediately if there are no subscriptions.
Mark channel foo for later deletion if there are active subscriptions AND disallow new subscriptions. A garbage collector will periodically try to remove channels marked for deletion (configurable how often?), still checking whether there are any subscriptions on the channel in question.
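
A minimal Go sketch of that two-step behaviour, assuming a single-threaded, in-memory model. All names (`server`, `requestDelete`, `sweep`, `pendingDeletion`) are made up for illustration. A real server would need locking and would call `sweep` from a timer such as `time.Ticker`.

```go
package main

import "fmt"

// channel is a hypothetical in-memory channel record.
type channel struct {
	subs            int
	pendingDeletion bool
}

type server struct {
	channels map[string]*channel
}

// requestDelete deletes an idle channel at once, otherwise marks it.
// It returns "deleted", "marked", or "not found".
func (s *server) requestDelete(name string) string {
	ch, ok := s.channels[name]
	if !ok {
		return "not found"
	}
	if ch.subs == 0 {
		delete(s.channels, name)
		return "deleted"
	}
	ch.pendingDeletion = true
	return "marked"
}

// subscribe creates the channel on demand but refuses new
// subscriptions on a channel that is marked for deletion.
func (s *server) subscribe(name string) bool {
	ch, ok := s.channels[name]
	if !ok {
		ch = &channel{}
		s.channels[name] = ch
	}
	if ch.pendingDeletion {
		return false
	}
	ch.subs++
	return true
}

func (s *server) unsubscribe(name string) {
	if ch, ok := s.channels[name]; ok && ch.subs > 0 {
		ch.subs--
	}
}

// sweep is what the periodic garbage collector would run. It removes
// marked channels that no longer have subscriptions and returns how
// many channels were removed.
func (s *server) sweep() int {
	removed := 0
	for name, ch := range s.channels {
		if ch.pendingDeletion && ch.subs == 0 {
			delete(s.channels, name)
			removed++
		}
	}
	return removed
}

func main() {
	s := &server{channels: map[string]*channel{}}
	s.subscribe("foo")
	fmt.Println(s.requestDelete("foo")) // marked
	fmt.Println(s.subscribe("foo"))     // false
	fmt.Println(s.sweep())              // 0
	s.unsubscribe("foo")
	fmt.Println(s.sweep()) // 1
}
```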

If that is hard to implement, I will be happy with the original proposal by @kozlovic.

deem0n commented 7 years ago

Also, the DELETE method would be more RESTy ;-)

DELETE https://localhost:8222/channels?channel=foo

OR even

DELETE https://localhost:8222/channels/foo

kozlovic commented 7 years ago

I was concerned because we have an existing channelsz endpoint, on which we can use params (?limit=100&offset=10) both for listing all channels and for getting the subscriptions of a given channel. But I can see adding a handler for the channelsz/ route and checking the method. On that route, only the channel name would be expected (no channel means all). But we would keep the original one for when one wants to use pagination, etc.?

deem0n commented 7 years ago

I think separating the endpoints for subscriptions and channels would be more natural and more flexible for future API extensions. So

/channelsz/... - should remain as now
/channels/... - should operate on channels: list all channels with GET, delete channel with DELETE etc.

But API endpoints in other projects usually follow REST style, like this:

GET /channels/foo/subscriptions - list all subs on channel foo
GET /subscriptions - list all subscriptions on the server, if we need such info?
GET /channels/foo/subscriptions?limit=100 - use options, OK
GET /channels/?limit=100 - still use options, fine. Just see 100 channels
DELETE /channels/foo/subscriptions/GUID - kill subscription with GUID on channel foo
POST /channels/foo/subscriptions - subscribe with message sent in the BODY ;-)
DELETE /channels/foo - delete channel foo
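
For illustration only, the routes above could be dispatched on method and path roughly as follows. This is a sketch in Go with a hypothetical `route` function that just names the action. It ignores query options such as `?limit=100` and assumes channel names contain no `/`.

```go
package main

import (
	"fmt"
	"strings"
)

// route maps an HTTP method and URL path (without the query string)
// to a description of the action in the REST-style layout above.
func route(method, path string) string {
	parts := strings.Split(strings.Trim(path, "/"), "/")
	switch {
	case len(parts) == 1 && parts[0] == "channels" && method == "GET":
		return "list channels"
	case len(parts) == 1 && parts[0] == "subscriptions" && method == "GET":
		return "list all subscriptions"
	case len(parts) == 2 && parts[0] == "channels" && method == "DELETE":
		return "delete channel " + parts[1]
	case len(parts) == 3 && parts[0] == "channels" && parts[2] == "subscriptions":
		switch method {
		case "GET":
			return "list subscriptions on " + parts[1]
		case "POST":
			return "subscribe to " + parts[1]
		}
	case len(parts) == 4 && parts[0] == "channels" && parts[2] == "subscriptions" && method == "DELETE":
		return "delete subscription " + parts[3] + " on " + parts[1]
	}
	return "not found"
}

func main() {
	fmt.Println(route("GET", "/channels/"))                        // list channels
	fmt.Println(route("GET", "/channels/foo/subscriptions"))       // list subscriptions on foo
	fmt.Println(route("DELETE", "/channels/foo/subscriptions/abc")) // delete subscription abc on foo
	fmt.Println(route("DELETE", "/channels/foo"))                  // delete channel foo
}
```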

Probably we can have that and keep the old /channelsz for backward compatibility?

dahankzter commented 7 years ago

Isn't the regular nats subscription/subject very analogous? Why is there a need to limit the number of channels? It is not directly related to this issue but I wanted to raise the usage a little.

The use case in the slack was about "I have reached my configured max channels so I want to delete some channels". That is a rather artificial problem since the server was configured with a clearly insufficient number of channels. Solution is simple, just raise the max.

Another case might be that the disk is actually running low and we might want to purge some data in a controlled manner by deleting a few channels. This is also a little bit weird because, unless the involved apps no longer need the channels, they will just be recreated.

Why I am asking is that I fear that a proliferation of management features will make the system hard to use. If the system can be designed in such a way that it just protects itself and performs the natural thing like cleaning up things not used (that can be automatically materialised when clients need them) it would make using it all that much easier.

I get that not everyone wants the same but a simple resilient and superfast eventlog that self tunes and heals as needed was what I had in mind when I first saw streaming.

deem0n commented 7 years ago

@dahankzter please note that this is NATS Streaming, so a channel without subscriptions still keeps all messages for the case when a new subscriber connects and replays the log of all messages from the start. This is the difference from vanilla NATS. So we NEED a channel without subscriptions to persist, and the server cannot tell which channels without current subscriptions will never be used again and which ones will have subscribers connecting soon. So a manual process is required to clean up unused channels.

dahankzter commented 7 years ago

Ok fair enough I agree that it can be useful to be able to delete channels manually.

But messages are not all kept forever, right? If the disk overflows, or something similar happens such as hitting a max msgs setting, messages get purged, and perhaps dropping a channel isn't that different.

However could there not be some heuristics as well for automatically cleaning? Some channel inactivity TTL perhaps?

I think I will just never set a max channel count and rely on other methods to purge data. Perhaps it will turn out that manual removal of channels is the best way.

Still I fear the "death by configuration" situation.

kozlovic commented 7 years ago

@dahankzter That was exactly the reason why deletion was not "supported" in the first place. As you said, it seemed to us that channels had a purpose from a user's perspective and there should be no need to remove them (at least in a prod env). In dev, you can end up creating channels that you don't need, but then it's not a problem: stop the server, delete the subdirectory and you are done.

My fear of providing the ability to delete channels is that people use channels as an "inbox" in a request/reply scenario, which we strongly oppose. In that case, the channel has only a few messages, and users that misuse streaming then want these channels to be deleted.

Your fear of proliferation of management features is shared. As you clearly stated, if one hits the limit of channels, then it means that the limit is too low, and it's just a matter of raising it. If it hits the limit because of small unwanted channels, it is probably not the best use of streaming?

That being said, I am sure that there are valid use cases where it would make sense.

A note about your comment:

But messages are not all kept forever, right? If the disk overflows, or something similar happens such as hitting a max msgs setting, messages get purged, and perhaps dropping a channel isn't that different.

As I explained previously (maybe it was on Slack), max messages or max bytes will simply cause the server to drop oldest messages in the log to make room for the new one(s), but there would always be at least 1 message in the log. Only with max age can you get in a situation where all messages are "removed" from the log. (I write "remove" because for filestore implementation, they may still be present on disk but would not be recovered due to the age limit - assuming the max age is not changed on restart ;-) ).
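
A small Go sketch of the difference, under simplifying assumptions: the log is a slice ordered by timestamp, timestamps and ages are plain seconds, and `applyCountLimit` / `applyAgeLimit` are hypothetical helpers rather than the store's real functions. A count limit only trims the oldest entries, so at least one message survives, while an age limit can empty the log.

```go
package main

import "fmt"

// msg is a hypothetical stored message.
type msg struct {
	seq       uint64
	timestamp int64 // seconds, ascending within the log
}

// applyCountLimit drops the oldest messages until at most maxMsgs
// remain. maxMsgs <= 0 means "no limit".
func applyCountLimit(log []msg, maxMsgs int) []msg {
	if maxMsgs > 0 && len(log) > maxMsgs {
		return log[len(log)-maxMsgs:]
	}
	return log
}

// applyAgeLimit drops every message older than maxAge seconds.
// maxAge <= 0 means "no limit". The result may be empty.
func applyAgeLimit(log []msg, now, maxAge int64) []msg {
	if maxAge <= 0 {
		return log
	}
	i := 0
	for i < len(log) && now-log[i].timestamp > maxAge {
		i++
	}
	return log[i:]
}

func main() {
	log := []msg{{1, 100}, {2, 110}, {3, 120}}
	fmt.Println(len(applyCountLimit(log, 1)))    // 1
	fmt.Println(len(applyAgeLimit(log, 200, 60))) // 0
}
```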

So I am not yet getting a clear picture of how the server would decide when a channel can be automatically removed.

I get that it would not be able to delete if there are subscriptions (unless we were to add a way to notify applications that the subscription is no longer valid, which we don't have at the moment - a protocol change would be required).

As @deem0n suggested, no subscription is not a valid reason. Messages can be logged without any subscription since those can be started later and replay any message they want.

So is it only when max age is set on a given channel and that all messages have expired?

@deem0n I believe that this would not really work for your use case since you said you cannot predict how long a "chat room" would be in use. So what would be the trigger for automatic deletion of a channel? Or does it need to be done by the user only?

I think that @dahankzter did not see a use case for actually deleting channels, so I guess he would be fine without automatic delete and @deem0n you had a case where you need manual delete, so what gives?

dahankzter commented 7 years ago

If @deem0n has a good use case then I don't want to argue on principle, but for a product like streaming, maintaining a coherent vision and idea is important, and perhaps deleting channels is not opposed to this.

It is perhaps no stranger than Kafka allowing for deleting a topic but not making it easy to do it? At least last I checked you needed to run a command on the server.

In any case manual deletion is easier than a GC strategy so why not start with that?

deem0n commented 7 years ago

Thanks for the interesting discussion!

I can see another way to handle automatic channel deletion just with config setting like maxChannelInactiveAge. It may be set per server and per channel and should work mostly like MaxAge for messages.

If channel

is inactive for maxChannelInactiveAge period of time, i.e. has no messages during maxChannelInactiveAge (counting from time now backwards)
has no currently connected subscribers
has no currently connected publishers

then the channel is immediately removed from the system.

If some slow publisher later wants to publish something, the channel will just be created again with the same name.

I think this behavior will suit everyone here.

ColinSullivan1 commented 7 years ago

The general use case is maintenance related; cleanup for NATS streaming servers without downtime.

Here are a few thoughts:

If monitoring endpoints would provide the last activity of a channel and there is an administrative interface to delete a channel, a tool could be developed to find stale channels and then invoke the interface to delete them. This is simplest in NATS, most flexible long term (just add to the tool, create a web app, expand monitoring information, etc), but requires some additional work outside of the streaming server.
The other approach is to build this into the server itself, with parameters that dictate when channels would be deleted. This is most convenient (no external tools required), but one size likely won't fit all, and this would result in an expansion of parameters, reducing simplicity of server usage ("death by configuration" - love that).
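
As a sketch of what the external tool in the first approach might do, the Go snippet below parses a monitoring listing and picks stale channels. The JSON shape is purely an assumption: the `last_activity` field (Unix seconds here) is the information that would have to be added to the monitoring output, and skipping channels with subscriptions mirrors the original proposal. `staleChannels` is a hypothetical helper, and the actual delete call is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// channelStat is a hypothetical entry of the monitoring output.
type channelStat struct {
	Name          string `json:"name"`
	Subscriptions int    `json:"subscriptions"`
	LastActivity  int64  `json:"last_activity"` // Unix seconds (assumed field)
}

type listing struct {
	Channels []channelStat `json:"channels"`
}

// staleChannels returns the names of channels that have no
// subscriptions and have been idle for more than maxIdle seconds.
func staleChannels(payload []byte, now, maxIdle int64) ([]string, error) {
	var l listing
	if err := json.Unmarshal(payload, &l); err != nil {
		return nil, err
	}
	var stale []string
	for _, ch := range l.Channels {
		if ch.Subscriptions == 0 && now-ch.LastActivity > maxIdle {
			stale = append(stale, ch.Name)
		}
	}
	return stale, nil
}

func main() {
	payload := []byte(`{"channels":[
		{"name":"foo","subscriptions":0,"last_activity":100},
		{"name":"bar","subscriptions":2,"last_activity":100},
		{"name":"baz","subscriptions":0,"last_activity":990}]}`)
	names, err := staleChannels(payload, 1000, 600)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(names) // [foo]
}
```

The tool would then invoke whatever delete interface the server exposes for each returned name, which keeps the staleness policy outside the server.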

IMO the first approach is more aligned with the NATS tenet of simplicity, and I prefer the original proposal, understanding that additional information can be provided in the monitoring endpoints to allow for more comprehensive tooling around this feature. Expanding on this idea, I do think it'd be a good idea to provide an option to archive the deleted channel, where the slice archive script would be invoked.

dahankzter commented 7 years ago

As a maintenance feature it makes good sense. I think my beef is more with the max-channels setting but if 0 means no limit then it is not a forced hard limit.

A set of endpoints or even NATS subscriptions would work nicely. The thing I worry a little about is how polite these operations should be. If it tries to be too nice, then there will inevitably come situations where clients misbehave and effectively prohibit the administrative action.

I vote for the DELETE operation to be pretty forceful.

soulne4ny commented 7 years ago

The other approach is to build this into the server itself, ...

I would vote for this approach. First, in order to delete a channel, one should have privileges. That would be easier to achieve, as authorisation is already there. Second, with clustering, a channel deletion should most likely apply to the whole cluster, not just the one instance that exposes monitoring. Therefore changes in the protocol are unavoidable.

Monitoring HTTP endpoints, on the other hand, are tailored to monitoring itself. Mixing monitoring with other administrative functions, while it seems straightforward, is quite far from a separation-of-concerns approach. It would produce much more monitoring code that is far from core functionality.

I would refer to SQL as an example. All the creates, alters, deletes and grants with revokes are in the language itself.

notbdu commented 6 years ago

I have a use case where we have high throughput subscriptions where each unique subscription is hashed to a topic (resulting in a unique STAN channel). This isn't request reply as subscriptions fan out to multiple subscribers and the subscribers can come up and down.

We also don't want to broadcast over a generic topic foo because that would overcrowd the channel and subscribers would be required to filter out a large number of irrelevant events.

However, we can have up to 2^256 unique topics... resulting in the need for a means of garbage collecting older unused channels.

notbdu commented 6 years ago

Any update on the possibility of this type of functionality making it into a future release?

kozlovic commented 6 years ago

What do you consider an "unused" channel? Do you want the server to delete a channel simply because there is no new incoming message or no subscription at the moment? I would not want the server to delete one of my channels just because there has not been any activity for a while.

There is a branch that supports channel deletion through monitoring endpoint. We did not merge this into master because we are not sure this is the best administrative approach to delete a channel. But so far, I did not find a compelling solution to this problem.

notbdu commented 6 years ago

I was thinking that "unused" could be specified via a cache-like eviction strategy (e.g. LRU). On the other hand, I think you're right... it might not make sense to add a housekeeping endpoint just to satisfy a few very specific use cases.

Btw, we are planning to work around this limitation.

kozlovic commented 6 years ago

To all interested.. We went with a new channel limit approach. If the limit is specified and the channel does not have any subscription and no new message for at least the specified duration, the server will delete the channel. Feedback welcomed.

jmwilkinson commented 6 years ago

Is it difficult to add the manual delete functionality? This would be a very useful feature.

kozlovic commented 6 years ago

@jmwilkinson It is not that it is difficult, it is that we don't have a good way to instruct the server to delete a given channel. The first approach was to use a POST on the HTTP monitoring endpoint, but this could have caused some security issues. In other words, the difficulty is finding an administrative way to do it that all our users agree on.

jmwilkinson commented 6 years ago

Could we add a config option to allow the deletion of channels through the monitoring endpoint (delete:channel/id?), and default to off? Until there is a good security story and understanding of requirements, if security is an issue then you just leave it off?

kozlovic commented 6 years ago

That's what I originally did: there is still this branch out there: https://github.com/nats-io/nats-streaming-server/compare/remove_channel

jmwilkinson commented 6 years ago

That makes sense to me. I understand the trepidation @soulne4ny has around separation of concerns, but until there is a more comprehensive story around administrative tasks, this really seems to be the way to go.

On a related note, perhaps we could leverage pieces of the "management api" approach that elasticsearch took?

jextrevor commented 4 years ago

To all interested.. We went with a new channel limit approach. If the limit is specified and the channel does not have any subscription and no new message for at least the specified duration, the server will delete the channel. Feedback welcomed.

So, from what I understand this does not take into account whether there are non-acked messages currently on the channel @kozlovic?

kozlovic commented 4 years ago

The concept of ack'ed messages is not tied to a channel, but to a subscription. Meaning that messages are stored in the channel regardless of subscription interest, and therefore are not removed from the channel based on subscription ack'ing or not. This is not a message queue.

Suppose a publisher sends 10 messages on a new channel "foo". There is no subscription at that time. The channel is automatically created and now has 10 messages. Subscriptions can be created at any point in time after that, and when they are, they provide a starting point in the channel. It is based on sequence, but from the application's perspective it can be "time", last received, new only, etc. Many subscriptions can be created that will "feed" from this channel, and each subscription can have its own acking state if you will (some may have unacked messages, that is, received by the application but no ack sent back to the server, while some may have ack'ed all messages that have been received). But those 10 messages will still stay in the channel, regardless of whether all subscriptions have "ack'ed" their messages. This is why you can start a subscription at any time and ask to "replay" the content of the channel (or part of it).
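
To illustrate the point (this is a toy model in Go, not the server's data structures; `channelLog`, `subscription`, and their methods are hypothetical names), the log belongs to the channel while the ack state belongs to each subscription:

```go
package main

import "fmt"

// channelLog stores messages independently of any subscription.
type channelLog struct {
	msgs []string // index i holds sequence i+1
}

// publish appends a message and returns its sequence number.
func (c *channelLog) publish(data string) uint64 {
	c.msgs = append(c.msgs, data)
	return uint64(len(c.msgs))
}

// subscription keeps its own position and ack state.
type subscription struct {
	ch    *channelLog
	next  uint64          // next sequence to deliver
	acked map[uint64]bool // acks are per subscription
}

// subscribe starts a subscription at the given sequence (0 means 1).
func (c *channelLog) subscribe(startSeq uint64) *subscription {
	if startSeq == 0 {
		startSeq = 1
	}
	return &subscription{ch: c, next: startSeq, acked: map[uint64]bool{}}
}

// deliver returns the next message for this subscription, if any.
func (s *subscription) deliver() (uint64, string, bool) {
	if s.next > uint64(len(s.ch.msgs)) {
		return 0, "", false
	}
	seq := s.next
	s.next++
	return seq, s.ch.msgs[seq-1], true
}

// ack records the ack on the subscription only; the log is untouched.
func (s *subscription) ack(seq uint64) { s.acked[seq] = true }

func main() {
	ch := &channelLog{}
	for i := 1; i <= 10; i++ {
		ch.publish(fmt.Sprintf("msg-%d", i))
	}
	early := ch.subscribe(1)
	for {
		seq, _, ok := early.deliver()
		if !ok {
			break
		}
		early.ack(seq)
	}
	late := ch.subscribe(8) // replay part of the channel
	seq, data, _ := late.deliver()
	fmt.Println(len(ch.msgs), len(early.acked), seq, data) // 10 10 8 msg-8
}
```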

The inactivity property of the channel therefore means that if there is no active subscription and no new message for that amount of time, the channel will be deleted. Note that if there are durable subscriptions that are currently offline, the channel would still be deleted (again, assuming no new activity for the configured period).
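
Expressed as a predicate, the rule could look like the Go sketch below. It is a simplification: `channelState`, `shouldDelete`, and the field names are hypothetical, times are plain seconds, and `lastActivity` is assumed to be the time of the last message or of the last subscription going away.

```go
package main

import "fmt"

// channelState is a hypothetical snapshot of a channel.
type channelState struct {
	activeSubs      int   // currently connected subscriptions
	offlineDurables int   // durable subscriptions that are offline
	lastActivity    int64 // seconds (assumed: last message or last subscription close)
}

// shouldDelete reports whether the channel has been inactive for at
// least maxInactivity seconds. maxInactivity <= 0 means the limit is
// not specified, so nothing is deleted.
func shouldDelete(ch channelState, now, maxInactivity int64) bool {
	if maxInactivity <= 0 {
		return false
	}
	if ch.activeSubs > 0 {
		return false
	}
	// offlineDurables is deliberately ignored: offline durable
	// subscriptions do not keep the channel alive.
	return now-ch.lastActivity >= maxInactivity
}

func main() {
	idle := channelState{offlineDurables: 1, lastActivity: 100}
	busy := channelState{activeSubs: 1, lastActivity: 100}
	fmt.Println(shouldDelete(idle, 4000, 3600)) // true
	fmt.Println(shouldDelete(busy, 4000, 3600)) // false
	fmt.Println(shouldDelete(idle, 4000, 0))    // false
}
```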

jextrevor commented 4 years ago

@kozlovic Makes sense. Thanks for the in-depth explanation

nats-io / nats-streaming-server

Proposal: Support Deletion of a Channel #358