dharhas opened this issue 2 years ago
This will definitely need investigation. Googling briefly, I don't see a straightforward way to get the Jupyter kernel name without JavaScript.
This also gets into the reproducibility angle and dashboarding, i.e. knowing the kernel being used and putting it in the notebook metadata can help with reproduction, and also with picking a good default environment for dashboard sharing.
There seems to be a default config YAML that can be loaded, documented here: https://gateway.dask.org/configuration-user.html#default-configuration
It has the cluster options in it, so we might be able to set the env programmatically in there, I think... but I don't know how that interferes with `gateway.cluster_options()`.
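For reference, a minimal sketch of what such a default config file could look like, assuming the gateway exposes a `conda_environment` cluster option (the option name is an assumption and must match whatever the deployment actually exposes):

```yaml
# ~/.config/dask/gateway.yaml -- sketch only; "conda_environment" is an
# assumed option name, not necessarily what this gateway exposes.
gateway:
  cluster:
    options:
      conda_environment: "filesystem/dask"
```

If the config is merged the way the docs suggest, values here would act as defaults that users can still override via `gateway.cluster_options()`.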
cc @costrouc. I think this might help as well: https://docs.dask.org/en/latest/deploying-kubernetes-helm.html?highlight=conda%20environemt#matching-the-user-environment
We can set the `filesystem/dask` env as the default, which can easily be overwritten using the cluster options GUI. The only issue is that we can't automatically detect the active environment with this... unless we do something during the deployment (e.g. bash with the conda active-env variable) to dynamically update this file: `.config/dask/gateway.yaml`
Edit: it seems to be possible using `$CONDA_DEFAULT_ENV`.
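A rough sketch of what that deployment-time step could look like, written in Python rather than bash for clarity. The config key layout and the `conda_environment` option name are assumptions, and a real implementation should merge with any existing config rather than overwrite it:

```python
import os
import pathlib

import yaml  # PyYAML

# Sketch: at startup, record the active conda env as the default
# dask-gateway cluster option so it matches the notebook kernel.
active_env = os.environ.get("CONDA_DEFAULT_ENV", "filesystem/dask")

config_path = pathlib.Path.home() / ".config" / "dask" / "gateway.yaml"
config_path.parent.mkdir(parents=True, exist_ok=True)

# NOTE: this overwrites the file; merging with existing settings is
# left out to keep the sketch short.
config_path.write_text(
    yaml.safe_dump(
        {"gateway": {"cluster": {"options": {"conda_environment": active_env}}}}
    )
)
```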
As we are using `dask_gateway` to perform this, together with the Dask permission system from Keycloak, we should be okay with the default env containing dask during this "inspection".
Hi @Chris Ostrouchov, regarding the Gateway default option for the cluster env, what do you think of using the above approach?
cc @viniciusdc for visibility
I forgot about this; I will open a PR, as this is now easier to achieve using the conda-store endpoints.
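For context, a hypothetical sketch of what querying conda-store for the available environments might look like; the base URL, the endpoint path, and the response shape are all assumptions here, not confirmed conda-store API details:

```python
import requests

# Hypothetical deployment URL -- replace with the real conda-store address.
CONDA_STORE_URL = "https://example.com/conda-store"

# Assumed endpoint: list the environments conda-store knows about, so the
# gateway default can be validated against the notebook's active env.
resp = requests.get(f"{CONDA_STORE_URL}/api/v1/environment/")
resp.raise_for_status()

envs = [
    f'{env["namespace"]["name"]}/{env["name"]}'
    for env in resp.json().get("data", [])
]
print(envs)  # e.g. ["filesystem/dask", "filesystem/dashboard", ...]
```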
Feature description
Currently, the dask-gateway cluster option defaults to the first environment available rather than the environment actually being used by the notebook. If that environment doesn't have dask in it, the next stage just hangs. It is really easy not to notice that the environment being used by dask-gateway is the wrong one when running through all the cells.
I propose that we ensure the default conda environment for dask is the one actively being used by the Jupyter kernel, since that is the most sensible default.
In the example below we see that the `filesystem/dashboard` env is the default, even though the notebook is running `filesystem/dask`.
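Until the default is fixed, a workaround is to set the environment explicitly when creating the cluster. A minimal sketch using the dask-gateway client API; the `conda_environment` attribute name is an assumption and depends on which options the deployment exposes:

```python
from dask_gateway import Gateway

gateway = Gateway()

# Fetch the options form and override the environment explicitly,
# instead of relying on the first-environment default.
options = gateway.cluster_options()
options.conda_environment = "filesystem/dask"  # assumed option name

cluster = gateway.new_cluster(options)
client = cluster.get_client()
```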
Value and/or benefit
Makes using Dask-Gateway less error prone and improves usability.
Anything else?
No response