There are a couple of possible approaches to this:
The worker simply instructs all ComfyUI processes to unload models using the usual mechanisms (UNLOAD_MODELS_FROM_RAM message).
This has the side effect of retaining ~200 MB of RAM per process.
The safety process does not currently have an unload mechanism. Either one would need to be implemented, or the safety process could be stopped instead, which would require that the worker not automatically restart it, as it currently does.
The worker shuts down all sub-processes, including the safety process.
It may be sufficient to shut down all but one inference process and unload its models, which would reduce ramp-up time once maintenance mode is cleared.
Again, the worker would need a flag to avoid restarting these processes once they have been halted for maintenance.
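As a rough illustration of the flag described above, here is a minimal sketch of how a supervisor might suppress automatic restarts while in maintenance mode. Everything in it is hypothetical (`Supervisor`, `ManagedProcess`, `maintenance_mode`, `processes_to_restart`, and the in-memory `inbox` standing in for inter-process messaging); only the `UNLOAD_MODELS_FROM_RAM` message name comes from the discussion, and none of this reflects the worker's actual code.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class ProcessKind(Enum):
    INFERENCE = auto()
    SAFETY = auto()


@dataclass
class ManagedProcess:
    """Hypothetical stand-in for a worker sub-process."""
    name: str
    kind: ProcessKind
    running: bool = True
    models_loaded: bool = True
    inbox: list[str] = field(default_factory=list)  # messages "sent" to the process


@dataclass
class Supervisor:
    """Hypothetical supervisor that owns the sub-processes."""
    processes: list[ManagedProcess]
    maintenance_mode: bool = False  # the flag that blocks automatic restarts

    def enter_maintenance(self, keep_inference: int = 1) -> None:
        """Keep `keep_inference` inference processes alive with models unloaded; stop the rest."""
        self.maintenance_mode = True
        kept = 0
        for proc in self.processes:
            if proc.kind is ProcessKind.INFERENCE and kept < keep_inference:
                # Kept alive to shorten ramp-up later, but asked to free its models.
                proc.inbox.append("UNLOAD_MODELS_FROM_RAM")
                proc.models_loaded = False
                kept += 1
            else:
                # Everything else, including the safety process, is halted.
                proc.running = False
                proc.models_loaded = False

    def leave_maintenance(self) -> None:
        self.maintenance_mode = False

    def processes_to_restart(self) -> list[ManagedProcess]:
        """Halted processes the restart loop may bring back; none while in maintenance."""
        if self.maintenance_mode:
            return []
        return [p for p in self.processes if not p.running]
```

With `keep_inference=0` the same sketch covers the variant where every sub-process is shut down. In both variants the restart loop consults the flag before reviving anything.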