tom-mcclintock opened 2 years ago:
I have a similar issue in my own conda container where the default conda env is always the base env, but I cannot switch to my conda env in the notebook.
```
!conda env list

# conda environments:
#
base                  *  /home/ubuntu/miniconda
pipeline                 /home/ubuntu/miniconda/envs/pipeline
```

```
!conda activate pipeline

CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'.
To initialize your shell, run

    $ conda init <SHELL_NAME>

Currently supported shells are:
  - bash
  - fish
  - tcsh
  - xonsh
  - zsh
  - powershell

See 'conda init --help' for more information and options.

IMPORTANT: You may need to close and restart your shell after running 'conda init'.
```
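One contributing factor worth noting (my reading, not stated in the report): each `!` line in a notebook runs in its own short-lived shell, so even a correctly initialized `conda activate` could not persist between cells. A minimal sketch of the effect:

```python
import subprocess

# Each "!" command in Jupyter spawns a fresh shell, like these two
# independent subprocess calls - state set in the first is gone in the second.
subprocess.run(["sh", "-c", "MYVAR=pipeline"], check=True)
result = subprocess.run(
    ["sh", "-c", "echo ${MYVAR:-unset}"], capture_output=True, text=True, check=True
)
print(result.stdout.strip())  # -> unset: the first shell's variable did not persist
```

This is why switching the notebook's environment has to happen via the kernelspec rather than via shell commands in cells.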
I am also running into this issue. Were you able to fix it?
I was able to make use of the example. But I also started mounting from `/root`, as I did not see any users within these images. This is also the difference from @tom-mcclintock's setup.
```python
config_app = {
    "AppImageConfigName": "conda-env-kernel-config",
    "KernelGatewayImageConfig": {
        "KernelSpecs": [
            {
                "Name": "conda-env-venv-py",
                "DisplayName": "Python [conda env: venv]"
            }
        ],
        "FileSystemConfig": {
            "MountPath": "/root",  # <-- mounting from /root instead of /home/sagemaker-user
            "DefaultUid": 0,
            "DefaultGid": 0
        }
    }
}
```
My domain update JSON looks like this:
```python
config_domain = {
    "DomainId": domain_id,
    "DefaultUserSettings": {
        "KernelGatewayAppSettings": {
            "CustomImages": [
                {
                    "ImageName": "conda",
                    "AppImageConfigName": "conda-env-kernel-config",
                }
            ]
        }
    }
}
```
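For completeness, a sketch of how such a domain update can be applied with boto3 (the domain id here is a placeholder; `update_domain` is the SageMaker API call for attaching custom images to a Studio domain):

```python
import json

# Hypothetical domain id - substitute your own.
domain_id = "d-xxxxxxxxxxxx"

# Attach the custom image (registered under AppImageConfigName
# "conda-env-kernel-config") to the Studio domain's default user settings.
config_domain = {
    "DomainId": domain_id,
    "DefaultUserSettings": {
        "KernelGatewayAppSettings": {
            "CustomImages": [
                {
                    "ImageName": "conda",
                    "AppImageConfigName": "conda-env-kernel-config",
                }
            ]
        }
    },
}

# The actual call needs AWS credentials, so it is shown commented out:
# import boto3
# sagemaker = boto3.client("sagemaker")
# sagemaker.update_domain(**config_domain)

print(json.dumps(config_domain, indent=2))
```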
With that, I'm able to import packages within SageMaker Studio. In general, though, the Dockerfiles are not in line with Docker best practices.
Some observations from testing last/this week:

- The sample as documented (`base` conda env name, `python3` kernel name, `/root` mount point, `0:0` UID:GID) does seem to work for me - so maybe this particular issue is now resolved?
- Auto-detection of non-base kernel envs does seem to work for me as the sample README describes: e.g. if I create a conda env `mycoolenv` in the image, then I can set up SageMaker KernelSpec Name `conda-env-mycoolenv-py`. I logged some feedback on the kernel spec doc page to suggest clarifying this naming in the "Kernel discovery" section.
- I find we can also manually register conda envs as notebook kernels in the Dockerfile using something like the below - but it's a bit pointless, because I just end up with 2 kernels visible in Studio: the manually created one and the auto-detected one.

  ```
  RUN bash -c 'source activate mycoolenv && python -m ipykernel install --name mycoolenv --display-name "Conda mycoolenv"'
  ```

- I do see the same issue as @tday that, when using this setup, image terminals are unable to switch conda envs, which I think is related to the user situation below:
  - Do users need to `pip install` / `conda install` extra packages ad-hoc? Given the architecture of Studio and the boundaries in the shared security responsibility model, is the extra isolation of running non-root helpful?
  - Mount paths like `/home/sagemaker-user` and `/root` replace user home directories in the container image with whatever's in the Jupyter working directory. In some cases this might be useful (e.g. propagating settings correctly between server and kernel), but it does mean any user settings files defined in the kernel container will get obliterated.
  - Using `ipykernel install --user ...` or `conda install --prefix ...` to install kernels and conda envs under your user's home folder is no good if the entire home folder gets substituted at run-time.
  - `~/.bashrc`, `.bash_profile`, etc. get obliterated too. Probably there's some way of getting this working, but I haven't dived deep yet.
  - Maybe it's a problem to reference a `1000:100` user without actually creating it in the image (like done here)? Maybe that was causing issues with not being able to auto-discover conda kernels before?
- I did manage to get a notebook-user-editable (i.e. can `%pip install`) custom image working using a non-root user and a non-base conda env, by making sure my `1000:100` user got permissions to edit the `/opt/conda` folder.
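A minimal Dockerfile sketch of that last setup (the base image, user name, UID/GID, and env name are my assumptions, not from the sample):

```dockerfile
FROM continuumio/miniconda3

# Create a non-base env with a kernel (env name is illustrative).
RUN conda create -y -n mycoolenv python=3.10 ipykernel

# Actually create the 1000:100 user and give it write access to /opt/conda,
# so %pip install / conda install work from notebooks at run-time.
RUN useradd -m -u 1000 -g 100 sagemaker-user \
 && chown -R 1000:100 /opt/conda
USER 1000:100
```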
Maybe we could try to have 2 samples to capture both a simple, root+base-based configuration and a complex, non-root/non-base option separately? It seems to me like diving straight into the latter would over-complicate the initial getting started. I think for this issue the initial bug itself seems resolved.
```
[3/3] RUN conda env update -f environment.yml --prune:
0.811 Collecting package metadata (repodata.json): ...working... done
77.96 Solving environment: ...working... Killed
------
Dockerfile:4
--------------------
   2 |
   3 |     COPY environment.yml .
   4 | >>> RUN conda env update -f environment.yml --prune
   5 |
--------------------
ERROR: failed to solve: process "/bin/sh -c conda env update -f environment.yml --prune" did not complete successfully: exit code: 137
```
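Exit code 137 means the process was killed by SIGKILL, which during a conda solve ("Solving environment: ...working... Killed") is almost always the out-of-memory killer. One hedged mitigation, assuming a reasonably recent conda in the base image, is switching to the lower-memory libmamba solver:

```dockerfile
COPY environment.yml .

# The classic solver can exhaust memory on large environments; libmamba
# solves the same spec with a much smaller footprint. The --solver flag
# requires conda >= 22.11.
RUN conda install -y -n base conda-libmamba-solver \
 && conda env update -f environment.yml --prune --solver=libmamba
```

Increasing the memory available to the Docker build is the other obvious lever.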
After following the steps listed here exactly, I began a SageMaker Studio session. After selecting the custom image and starting a console, I received the following error:
The `Dockerfile` and `environment.yml` are identical to the example. Here is the `app-image-config-input.json` file:

And here is the anonymized `create-domain-input.json` contents:

I used `IMAGE_NAME=conda-test-kernel` throughout. Other things to note:

- `aws sagemaker describe-image-version` shows `"ImageVersionStatus": "CREATED"`
- `aws sagemaker describe-app-image-config` gives back all the expected information

I believe the issue is that `conda` doesn't automatically follow the kernelspec. This quirk needs to be covered in the README for this example. Unfortunately I haven't figured out the solution yet. Any help is appreciated.
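If it helps anyone hitting the same symptom: a kernelspec only controls which interpreter Jupyter launches; it never "activates" a conda env in the shell sense. The usual workaround is to point the kernelspec's `argv` directly at the env's own interpreter, so the kernel process runs inside the env regardless of conda activation. A sketch of such a `kernel.json` (the env name and path are illustrative):

```json
{
  "argv": [
    "/opt/conda/envs/venv/bin/python",
    "-m",
    "ipykernel_launcher",
    "-f",
    "{connection_file}"
  ],
  "display_name": "Python [conda env: venv]",
  "language": "python"
}
```

Note that compiled libraries relying on activation-time environment variables (e.g. `LD_LIBRARY_PATH` set by conda activation scripts) may still need extra handling via the kernelspec's `env` field.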