jupyter-server / enterprise_gateway

A lightweight, multi-tenant, scalable and secure gateway that enables Jupyter Notebooks to share resources across distributed clusters such as Apache Spark, Kubernetes and others.
https://jupyter-enterprise-gateway.readthedocs.io/en/latest/

Add hook to modify zmq.Context #1154

Closed · arv1ndn closed this 1 year ago

arv1ndn commented 1 year ago
welcome[bot] commented 1 year ago

Thanks for submitting your first pull request! You are awesome! :hugs:
If you haven't done so already, check out Jupyter's Code of Conduct. Also, please make sure you followed the pull request template, as this will help us review your contribution more quickly. You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! :wave:
Welcome to the Jupyter community! :tada:

kevin-bates commented 1 year ago

Hi @arv1ndn - thanks for the pull request. The changes look good, but I believe the validator's definition needs to be relative to the MultiKernelManager class hierarchy and must reside within the RemoteMappingKernelManager class definition. While it's hard to determine if the test failures are due to this, let's make this change and see how the tests fare after that.

Did you find that these changes (as they are now) worked in your env? Again, because EnterpriseGatewayConfigMixin is not a mixin on RemoteMappingKernelManager (only on RemoteKernelManager), I'd be surprised if setting either environment variable triggered the expected behavior.

kevin-bates commented 1 year ago

Hmm, I'm not sure I gave correct advice regarding the location of the validator. In looking into this a bit, I see that context is defined on BOTH the MultiKernelManager and the KernelManager. It comes into play on the MultiKernelManager presumably when MKM.shared_context is True. When the validator is defined within the config mixin, I do see it called, but only when a kernel is created (and every time a kernel is created), whereas I don't see the validator invoked when it's defined in RemoteMappingKernelManager relative to the MKM.

If we find that we need to define the validator in the mixin file, then we need to change how the shared_context boolean is accessed since the KernelManager (RemoteKernelManager from EG's perspective) doesn't define that trait. As a result, that access would need to change to self.parent.shared_context.

I would also recommend we only update the socket and thread counts when the current values (I'm assuming they have corresponding getters) differ from the desired values. This way, we're only adjusting the context instance once in a shared-context environment, and we only log that the change is happening when a change actually happens.
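To make that concrete, the mixin-based variant we're discussing would look roughly like this (a sketch only, not this PR's code; it folds in both points above: reaching shared_context through the parent and guarding against redundant updates):

    import os

    from traitlets import validate
    from zmq import MAX_SOCKETS

    class EnterpriseGatewayConfigMixin:  # sketch: the real mixin defines many other traits
        @validate("context")
        def _validate_context(self, proposal):
            zmq_context = proposal["value"]
            # shared_context lives on the MultiKernelManager; a KernelManager
            # (RemoteKernelManager) must reach it through its parent
            if self.parent is not None and self.parent.shared_context:
                desired_max_sockets = int(os.getenv("EG_ZMQ_MAX_SOCKETS", 1023))
                # only adjust (and log) when the current value actually differs
                if zmq_context.get(MAX_SOCKETS) != desired_max_sockets:
                    zmq_context.set(MAX_SOCKETS, desired_max_sockets)
                    self.log.info(f"Set ZMQ max sockets to {desired_max_sockets}")
            return zmq_context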

I suspect the fact that the validator is not being called when it's defined in RemoteMappingKernelManager may be a side effect of all the class changes and aliasing that occurred when the corresponding Async MKM classes were added (although the individual kernel manager class hierarchy went through a similar exercise, so I'm not convinced that's the issue).

Anyway, I'm spending some time with this because I'd like to include these changes in our 3.0 release and want to get the first (and hopefully only) release candidate built with this PR. Please let me know what you learn in your testing with this.

kevin-bates commented 1 year ago

OK, I believe the following approach may be better. Rather than hooking the validator, we can extend the default method that's defined in jupyter_client's multikernelmanager.py within RemoteMappingKernelManager:


    # requires, in the defining module:
    #   import os
    #   from zmq import Context, IO_THREADS, MAX_SOCKETS
    def _context_default(self) -> Context:
        zmq_context = super()._context_default()
        if self.shared_context:
            # pyzmq currently does not expose defaults for these values, so we replicate them here

            # libzmq/zmq.h: ZMQ_MAX_SOCKETS_DFLT = 1023
            cur_max_sockets = zmq_context.get(MAX_SOCKETS)
            desired_max_sockets = int(os.getenv("EG_ZMQ_MAX_SOCKETS", 1023))
            if desired_max_sockets > cur_max_sockets:
                zmq_context.set(MAX_SOCKETS, desired_max_sockets)
                self.log.info(f"Set ZMQ sockets to {zmq_context.MAX_SOCKETS}")

            # libzmq/zmq.h: ZMQ_IO_THREADS_DFLT = 1
            cur_io_threads = zmq_context.get(IO_THREADS)
            desired_io_threads = int(os.getenv("EG_ZMQ_IO_THREADS", 1))
            if desired_io_threads > cur_io_threads:
                zmq_context.set(IO_THREADS, desired_io_threads)
                self.log.info(f"Set ZMQ IO threads to {zmq_context.IO_THREADS}")

        return zmq_context

When a shared context is configured, I'm only seeing this code called once across multiple kernel invocations and have confirmed that subsequent kernel managers are using the original (shared) context.

I haven't figured out why the validator is not being called while its corresponding "defaultor" is. The traitlets package can get pretty involved and mysterious in what it does. For example, in looking at this, I suspect the context instance is created when it's first touched by the self.context.closed check, and we never see the next line invoked because constructing the context produces an instance that is not closed. So I believe that code exists in case the shared context were to somehow get closed during its existence.
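If it helps, traitlets cross-validators run on explicit assignment but (as far as I can tell) not when a dynamic default is generated lazily, which would explain the asymmetry. A minimal standalone illustration (hypothetical class, not EG code):

    import zmq
    from traitlets import HasTraits, Instance, default, validate

    class Example(HasTraits):
        context = Instance(zmq.Context)

        @default("context")
        def _context_default(self):
            print("defaultor called")
            return zmq.Context()

        @validate("context")
        def _validate_context(self, proposal):
            print("validator called")
            return proposal["value"]

    e = Example()
    e.context                  # first access: prints "defaultor called" only
    e.context = zmq.Context()  # explicit assignment: prints "validator called"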

(As a result, we may want to update the title to something like Add hook to modify zmq.Context or such.)

I hope this helps.

arv1ndn commented 1 year ago

@kevin-bates The former approach did work after implementing the changes you suggested. I also saw that the path gets exercised on each new kernel creation, which seemed a bit wasteful.

Your idea of overriding the creation of the default context seems like a much better approach. Let me give it a try, will amend the PR accordingly.

arv1ndn commented 1 year ago

OK, it looks like the code gets called exactly once, the first time a kernel creation request is received. The context reflects the new value for ZMQ_MAX_SOCKETS on subsequent calls to create kernels.

kevin-bates commented 1 year ago

Hi @arv1ndn - do you expect there to be any other changes? If not, I'm fine moving forward with merging this. Thank you for the quick follow-ups!

kevin-bates commented 1 year ago

@arv1ndn - Actually, these envs should probably get (minimally) documented in https://jupyter-enterprise-gateway.readthedocs.io/en/latest/operators/config-add-env.html#additional-environment-variables. I'd be happy to push a commit with these changes if you're okay with that.

arv1ndn commented 1 year ago

@kevin-bates no further changes at the moment.

We are running some experiments with gcsfuse rather than NFS for the directory that keeps kernel files... We have an endpoint that responds to a Python client to create new kernels using a Docker image URL. The endpoint creates some standard shapes with the same image (e.g., 1c/3g, 2c/8g, etc.). Our users can then use the kernel with notebooks as well as in our orchestration system (both use EG). gcsfuse potentially allows us to use the same bucket as the ReadWriteMany storage volume across projects (we keep different environments in different projects).

Our bottleneck currently seems to be around 1500 pods. The IO overhead of keeping all these pods alive seems to starve the app's ioloop of CPU. I'll keep you updated on progress.

Another issue we are working on is the subprocess call made when creating kernels. It does not lend itself well to burst requests, where the EG is hit with over 800 kernel creation calls within a short time. Again, I'll keep you updated.

kevin-bates commented 1 year ago

If you could allow maintainers to push to your branch, I'll push the changes. I also adjusted a couple of things to eliminate the horizontal scrollbars...

arv1ndn commented 1 year ago

@kevin-bates added you as a collaborator to the project. Should I be doing something else?

kevin-bates commented 1 year ago

> We are running some experiments with gcsfuse rather than NFS for the directory that keeps kernel files... We have an endpoint that responds to a Python client to create new kernels using a Docker image URL. The endpoint creates some standard shapes with the same image (e.g., 1c/3g, 2c/8g, etc.). Our users can then use the kernel with notebooks as well as in our orchestration system (both use EG). gcsfuse potentially allows us to use the same bucket as the ReadWriteMany storage volume across projects (we keep different environments in different projects).

Very cool. How are you configuring this? Do you massage the kernel-pod.yml template with this information and/or convey things from the client via KERNEL_ values?

> Our bottleneck currently seems to be around 1500 pods. The IO overhead of keeping all these pods alive seems to starve the app's ioloop of CPU. I'll keep you updated on progress.

Ok - thank you. Would you be able to partition kernel requests across different EG instances using a reverse proxy to select the EG instance based on some criteria? Then limit the number of pods on a given EG instance?

> Another issue we are working on is the subprocess call made when creating kernels. It does not lend itself well to burst requests, where the EG is hit with over 800 kernel creation calls within a short time. Again, I'll keep you updated.

I assume you're referring to the subprocess call down in jupyter_client, correct? "800 starts within a short time" - wow. What might be an alternative here - warm-started pods in a pool?

Are you using a Jupyter Lab front end and each user has their own "notebook server"?

Let me know how you want to proceed regarding the doc updates.

kevin-bates commented 1 year ago

I pushed the changes (sorry, I fumbled the syntax and at first created a new branch, which I've since removed).

arv1ndn commented 1 year ago

> We are running some experiments with gcsfuse rather than NFS for the directory that keeps kernel files... We have an endpoint that responds to a Python client to create new kernels using a Docker image URL. The endpoint creates some standard shapes with the same image (e.g., 1c/3g, 2c/8g, etc.). Our users can then use the kernel with notebooks as well as in our orchestration system (both use EG). gcsfuse potentially allows us to use the same bucket as the ReadWriteMany storage volume across projects (we keep different environments in different projects).

> Very cool. How are you configuring this? Do you massage the kernel-pod.yml template with this information and/or convey things from the client via KERNEL_ values?

Yes, we added some logic into the kernel-pod.yaml.j2 and mount it in a separate path, e.g., "/usr/share/jupyter/template..." The kernel.json files the endpoint creates have additional env vars that are processed by the template engine to set the correct usernames, display name, service accounts, CPU, memory, etc.
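For context, the additions are conceptually along these lines (an illustrative fragment only; the resource parameters are custom values we inject via KERNEL_ env vars, not stock EG template variables):

    # kernel-pod.yaml.j2 (illustrative fragment)
    apiVersion: v1
    kind: Pod
    metadata:
      name: "{{ kernel_pod_name }}"
      namespace: "{{ kernel_namespace }}"
    spec:
      serviceAccountName: "{{ kernel_service_account_name }}"
      containers:
        - name: "{{ kernel_pod_name }}"
          image: "{{ kernel_image }}"
          resources:
            limits:
              cpu: "{{ kernel_cpu_limit }}"       # custom parameter
              memory: "{{ kernel_memory_limit }}"  # custom parameter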

> Our bottleneck currently seems to be around 1500 pods. The IO overhead of keeping all these pods alive seems to starve the app's ioloop of CPU. I'll keep you updated on progress.

> Ok - thank you. Would you be able to partition kernel requests across different EG instances using a reverse proxy to select the EG instance based on some criteria? Then limit the number of pods on a given EG instance?

We haven't explored the reverse proxy idea yet. Currently we reconnect to running kernels when the endpoint restarts, which requires that we get the response on /api/kernels/. If the reverse proxy can associate the kernel_id from the /api/kernels POST response with the EG instance, and then maintain that relation so that it can route requests to the correct EG, that could work. A rough sketch of the affinity idea (all names made up) follows.
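    # Rough sketch of the kernel_id -> EG affinity idea (all names made up).
    # The proxy records which EG instance answered POST /api/kernels and
    # routes all later /api/kernels/{kernel_id} traffic to that instance.
    from typing import Dict

    kernel_routes: Dict[str, str] = {}  # kernel_id -> EG base URL

    def on_kernel_created(kernel_id: str, eg_url: str) -> None:
        kernel_routes[kernel_id] = eg_url

    def select_eg(kernel_id: str) -> str:
        return kernel_routes[kernel_id]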

> Another issue we are working on is the subprocess call made when creating kernels. It does not lend itself well to burst requests, where the EG is hit with over 800 kernel creation calls within a short time. Again, I'll keep you updated.

> I assume you're referring to the subprocess call down in jupyter_client, correct? "800 starts within a short time" - wow. What might be an alternative here - warm-started pods in a pool?

Correct. Potentially we could change the launch mechanics on our end so that when a user launches a big job, we pre-warm the kernels before submitting the jobs.

> Are you using a Jupyter Lab front end and each user has their own "notebook server"?

Yes. We have teams, and users potentially belong to multiple teams. Each team has specific privileges, restrictions, etc., all tied to the team's service account. We run an EG per team in a dedicated namespace, and both the endpoint above and JupyterHub are configured with these teams and the EG endpoints.

Users are presented with their team choices on login. Based on their selection, we spin up a notebook server for them in the jhub namespace. Kernels launched from these notebook servers use the EG configs to request a kernel from the correct EG instance (in a different project/namespace, running with the right service account, etc.).

> Let me know how you want to proceed regarding the doc updates.

Trust this is resolved?

welcome[bot] commented 1 year ago

Congrats on your first merged pull request in this project! :tada: Thank you for contributing, we are very proud of you! :heart: