pppoe-server: drainfile option.

jkroonza commented 2 years ago

This option allows for draining a pppoe-server instance, either for redistributing load over time, or potentially for upgrade paths.

Signed-off-by: Jaco Kroon jaco@uls.co.za

jkroonza commented 2 years ago

Not yet properly tested, looking for principle feedback, points to consider:

Should pppoe-server terminate once all sessions has been terminated?
could this same idea, assuming IP delegation is utilized, in combination with max sessions and session offset potentially be used for seamless upgrades?

Ie, set internal flag to drain, perform fork() + execve(), new instance starts serving new requests, old one drains, and we utilize max sessions in combination with offset to ensure that session ids do not overlap. IP delegation will be a must for this use-case or we'd need some external bitmap/shared memory for IP assignments (entirely possible, but lots of work for little benefit for us).

dfskoll commented 2 years ago

In principle, I'm not opposed to this. However, as we keep adding more and more things to the PPPoE server, I think it would make more sense to have a control socket over which we can pass commands. It's really not too hard to do because we already use the event framework to handle multiple file descriptors.

That way, we can tell the server to stop reacting to PADIs just by setting a global variable. And we can also query it to see how many active sessions there are, and so on.

Regards,

Dianne.

dfskoll commented 2 years ago

On Tue, 31 May 2022 23:22:28 -0700 Jaco Kroon @.***> wrote:

Should pppoe-server terminate once all sessions has been terminated?

I would say probably, unless we can come up with a scenario where we want to drain the server and then let it accept connections again.

could this same idea, assuming IP delegation is utilized, in combination with max sessions and session offset potentially be used for seamless upgrades?

Yes, but that's very complicated. If we're clustering PPPoE servers anyway, we can achieve seamless upgrade just by upgrading one node at a time. However, some people are very likely to keep their PPPoE sessions alive "forever", so at some point we are likely going to have to forcibly terminate sessions if we want an upgrade to complete in a reasonable amount of time. So I doubt it can ever be made completely seamless.

Regards,

Dianne.

jkroonza commented 2 years ago

Hi Dianne,

I'm in support of the control socket. This does imply a sort of "client" too or may I simply go with something like "use nc -U" to chat directly with a unix domain socket? This would have to be a non-blocking socket though and we'd need to buffer, which I'm OK with, is that acceptable to you? We can set the max line buffer size presumably to something fairly small like 128 chars.

To then answer the questions above:

This could be made configurable, so a very simple "set" command, where we can set using strings, so the command has the structure set option=argument, and the main set driver just maps option to a handler function, so handle_drain: set drain=(no|yes[,maxtime=???]|exit[,maxtime]). This should probably be made in a more generic way such that we can have a multi-tier centralized parser of sorts, so probably rather "set option value" style rather. That way we can re-use the code used for dispatching "set" to also dispatch to the various "options". I'll come up with a strategy. This was cooked very quickly yesterday morning based on the conceptual idea and I figured the code would better illustrate the point.

Other useful stuff to set, max sessions, offset.

We should also have a form of "show state ???" to be able to query status.

Regarding two, I still think it's useful for seamless upgrades where we don't forcibly terminate any connections. Frankly, our longest connection as of right now is 13 days, and it's not unreasonable to let that linger, we could even use a max count termination either as alternative or in combination with maxtime, so "set drain exit,maxtime=1440,maxterm=5" - the question is, should that be an OR or an AND? So it gets complicated quickly. A pipe() could also be used to indicate to the new child then using the control mechanism which session id values are already in use, and have those masked out in some way until they get released. If the pipe closes, close them all, if we go into upgrade state prior to this being closed, simply "splice" them through or something. But I suspect per-interface shared memory would be better, and if the upgraded one crashes, those session ids are just lost forever - we've got 65k so should be fine. If we use a bitmap that's 8KB of RAM per interface.

Incidentally, the kernel's mechanism for determining a session uniquely is the triplet (iface number, remote hw addr, session id). So we can in theory just follow suite (relates more to my other PR than this). For this I simply suggest we reserve two to four bits of the session identifier as a kind of instance id, question is how do we collaborate between multiple (4, 8 or 16, each with an absolute max of 16k, 8k and 4k clients, less two since session id 0 is reserved for discovery best I can tell and 0xffff is explicitly reserved in the RFC) instances to ensure they have unique instance id's on a per interface basis? Now we get back to a form of control daemon... which I'd prefer to avoid. Let me think on this, but if you know a pattern I could leverage ... please let me know.

Sorry for not getting the other promised stuff done, got pulled into another related project first which is a bit more urgent.

jkroonza commented 1 year ago

Closed since this is getting rebuilt on top of the control socket now.

dfskoll / rp-pppoe

pppoe-server: drainfile option. #15