saltstack / salt

Software to automate the management and configuration of any infrastructure or application at scale. Get access to the Salt software package repository here:
https://repo.saltproject.io/
Apache License 2.0

[FEATURE REQUEST] Allow minion to receive job even when it's temporarily disconnected from master during the job creation #61184

Open lukasraska opened 2 years ago

lukasraska commented 2 years ago

Is your feature request related to a problem? Please describe. Currently, when a new job is created on the master, it is immediately published to all connected minions, effectively missing any minion that is not connected at that exact moment (due to a network flap, service restart, etc.). When this happens, the only way to determine that the job was never executed on the minion is to infer it from a missing return after some timeout or via saltutil.find_job (or similar).

Describe the solution you'd like Because of Salt's architectural decisions and the range of possible transports, this cannot be implemented easily for all use cases and config variants. It can at least be implemented for the TCP transport with list targeting, because the master has direct access to the underlying TCP streams (https://github.com/saltstack/salt/blob/v3004/salt/transport/tcp.py#L1531 )
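To make the idea concrete, here is a minimal sketch of the queuing behaviour, assuming list targeting and a master that knows which minions currently have an open stream. It is a toy model, not Salt code: `PendingJobPublisher`, its methods, and the plain lists standing in for TCP streams are all hypothetical names invented for illustration.

```python
from collections import defaultdict, deque


class PendingJobPublisher:
    """Toy model of a master that holds list-targeted jobs for offline minions.

    Hypothetical sketch: a plain list stands in for each minion's TCP stream.
    """

    def __init__(self):
        self.connected = {}                # minion_id -> "stream" (a list)
        self.pending = defaultdict(deque)  # minion_id -> jobs awaiting delivery

    def connect(self, minion_id):
        """Register a (re)connected minion and flush any jobs it missed."""
        stream = []
        self.connected[minion_id] = stream
        for jid in self.pending.pop(minion_id, ()):
            stream.append(jid)
        return stream

    def disconnect(self, minion_id):
        """Forget the stream of a minion that dropped off."""
        self.connected.pop(minion_id, None)

    def publish(self, jid, targets):
        """Deliver to connected targets; queue the job for the rest.

        Returns the list of targets that were not connected at publish time.
        """
        missed = []
        for minion_id in targets:
            stream = self.connected.get(minion_id)
            if stream is not None:
                stream.append(jid)
            else:
                self.pending[minion_id].append(jid)
                missed.append(minion_id)
        return missed
```

In this model, publishing to a list that contains a disconnected minion returns that minion in `missed`, and the job is delivered as soon as `connect()` is called for it. A real implementation would also need an expiry for queued jobs, which is left out here.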

Describe alternatives you've considered One alternative is to have the minion send an event when it receives a job (as suggested in SEP 17 - https://github.com/saltstack/salt-enhancement-proposals/blob/master/accepted/0017-job-ack-event.md ) and to listen for such events on the master. This would ultimately solve the issue for all transports, as it could be independent of the specific transport protocol.
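A rough sketch of how the master side of the ack-event alternative could track which minions never received a job. This assumes the master sees one acknowledgement per minion per job; the class and method names are hypothetical and do not reflect the event format defined in SEP 17.

```python
class JobAckTracker:
    """Toy bookkeeping for job acknowledgements (hypothetical, not Salt code)."""

    def __init__(self):
        self.expected = {}  # jid -> set of minion ids that have not acked yet

    def job_published(self, jid, targets):
        """Record which minions are expected to acknowledge this job."""
        self.expected[jid] = set(targets)

    def ack_received(self, jid, minion_id):
        """Mark a minion as having received the job; unknown jids are ignored."""
        self.expected.get(jid, set()).discard(minion_id)

    def unacknowledged(self, jid):
        """Return the minions that have not acknowledged the job, sorted."""
        return sorted(self.expected.get(jid, ()))
```

After a grace period, any minion still listed by `unacknowledged()` can be treated as having missed the publish and the job can be re-sent to it, without relying on transport internals.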

A second alternative is to hook into the existing saltutil.find_job logic in salt-cli and port it directly to the master, but I expect that to be much worse performance-wise.

Additional context Depending on the Salt core team, this could be treated as a SEP. In that case, however, it would probably require substantial refactoring of the core of the master/minion configuration, which is probably not needed (users who require this functionality can use the TCP transport, if implemented).

Please Note If this feature request would be considered a substantial change or addition, this should go through a SEP process here https://github.com/saltstack/salt-enhancement-proposals, instead of a feature request.

whytewolf commented 2 years ago

So, after looking at this: this should definitely be a SEP, as it should be handled for more than just the TCP transport. This kind of functionality needs to be tackled from the perspective that it should work regardless of the transport used by the end user. As we move more towards pluggable transports, the transports need to become more generic. And honestly, we need to stop rigging functionality into different things. A transport should only be a transport; queuing methods shouldn't be tied to it.

@lukasraska Can you draw up a SEP for this and go through the SEP process? We on the core team would thank you.