microsoft / azuredatastudio

Azure Data Studio is a data management and development tool with connectivity to popular cloud and on-premises databases. Azure Data Studio supports Windows, macOS, and Linux, with immediate capability to connect to Azure SQL and SQL Server. Browse the extension library for more database support options including MySQL, PostgreSQL, and MongoDB.
https://learn.microsoft.com/sql/azure-data-studio
MIT License

How to select a python kernel from an existing conda env? #19939

Closed aaelony-aeg closed 2 years ago

aaelony-aeg commented 2 years ago

Issue Type: Feature Request

It would be nice to be able to choose a Python kernel for a notebook that is based on an existing Conda environment.

Is this currently possible?

Azure Data Studio version: azuredatastudio 1.37.0 (d904740d93d7df76a0ba361f20e4351813b57645, 2022-06-14T19:53:12.357Z)
OS version: Darwin x64 21.5.0
Restricted Mode: No

chlafreniere commented 2 years ago

@corivera can you help?

corivera commented 2 years ago

You can choose which environment to run the notebook with by using the Configure Python for Notebooks command from the command palette. In the dialog that opens, click Edit (if Python is already configured), check the Use Existing Python Installation option, and set the path to Anaconda's install folder (this may be auto-populated in the dropdown if it's in a well-known place). Then click through the dialog to install any needed dependencies for the kernel you want to use.

aaelony-aeg commented 2 years ago

I found the Command Palette and the Configure Python for Notebooks command you mentioned.

I click the Edit button, select Use existing Python installation. There is a dropdown box there populated with /opt/homebrew/anaconda3/. I would like to choose from envs below /opt/homebrew/anaconda3/envs/ but using the Browse button, it doesn't seem possible to navigate to /opt/homebrew/anaconda3/envs/.

Is there something I am missing?

Thank-you

aaelony-aeg commented 2 years ago

Update: I was able to drag an alias to the homebrew directory into the Finder's sidebar. Once that is done, I am able to go back to Azure Data Studio and choose the env I want.

Then I click Next and install it. It appears successful, but I am still not able to select my conda env by name as the Python kernel.

Is this expected?

corivera commented 2 years ago

Does the conda environment show up if you enable Show All Kernels in the user settings?

aaelony-aeg commented 2 years ago

Interesting... I did not know of the Show All Kernels user setting. I found this setting and checked the checkmark to enable it. Copying to JSON, it is now listed as "notebook.showAllKernels": true.

That said, however, no... the only kernels in the dropdown are the defaults (SQL, PySpark, Spark | Scala, Spark | R, Python 3, PowerShell).

The name of my conda env is "py39" and it does not show up in the dropdown to choose a kernel.

corivera commented 2 years ago

We probably don't support it yet, then. We still have room for improvement in automatically detecting kernels. We'll look into enabling this scenario in the future.

aaelony-aeg commented 2 years ago

My motivation is that Azure Data Studio is able to query the database with the appropriate two-factor authentication, whereas on a MacBook (Apple M1 chip) running Monterey I have not yet found an MSSQL driver and auth configuration outside of Azure Data Studio that works for querying programmatically (e.g. from pyodbc or R).

Thanks for attending to this.

aaelony-aeg commented 2 years ago

Addendum: The Tasks panel shows a green checkmark and "Installing Notebook dependencies succeeded", but the Output panel lists an error. The output is long, but here is what is hopefully the relevant part.

stdout: Requirement already satisfied: pure-eval in /opt/homebrew/anaconda3/envs/py39/lib/python3.9/site-packages (from stack-data->ipython>=7.23.1->ipykernel->jupyter>=1.0.0) (0.2.2)
    stdout: 
    stdout: Requirement already satisfied: executing in /opt/homebrew/anaconda3/envs/py39/lib/python3.9/site-packages (from stack-data->ipython>=7.23.1->ipykernel->jupyter>=1.0.0) (0.8.3)
    stdout: 
    stdout: Installing collected packages: widgetsnbextension, qtpy, jupyterlab-widgets, qtconsole, jupyter-console, ipywidgets, jupyter
    stdout: 
    stderr: ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
    stderr: sparkmagic 0.20.0 requires mock, which is not installed.
    stderr: sparkmagic 0.20.0 requires nose, which is not installed.
    stderr: hdijupyterutils 0.20.0 requires mock, which is not installed.
    stderr: hdijupyterutils 0.20.0 requires nose, which is not installed.
    stderr: autovizwidget 0.20.0 requires plotly>=3, which is not installed.
    stderr: 
    stdout: Successfully installed ipywidgets-7.7.1 jupyter-1.0.0 jupyter-console-6.4.4 jupyterlab-widgets-1.1.1 qtconsole-5.3.1 qtpy-2.1.0 widgetsnbextension-3.6.1
    stdout: 
Notebook dependencies installation is complete
... Ensuring /Users/avramaelony/Library/Jupyter/instances/55d63447-2cf6-4c77-9687-ad0d4a5d931e exists
... Ensuring /Users/avramaelony/Library/Jupyter/instances/55d63447-2cf6-4c77-9687-ad0d4a5d931e/config exists
... Ensuring /Users/avramaelony/Library/Jupyter/instances/55d63447-2cf6-4c77-9687-ad0d4a5d931e/data exists
... Ensuring /Users/avramaelony/Library/Jupyter/instances/55d63447-2cf6-4c77-9687-ad0d4a5d931e/config/custom exists
... Ensuring /Users/avramaelony/Library/Jupyter/kernels exists
... Starting Notebook server
    > "/opt/homebrew/anaconda3/envs/py39/bin/python3" "/private/var/folders/jc/5gc6fmf52cv6r88m6snhvvlr0000gn/T/AppTranslocation/1BFD8CCC-AB18-46BB-9D53-0D3E9F2B9651/d/Azure Data Studio.app/Contents/Resources/app/extensions/notebook/resources/pythonScripts/startNotebook.py" --no-browser --ip=127.0.0.1  --no-mathjax --notebook-dir "/Users/avramaelony" --NotebookApp.token=d0efcfc97b9d81533db9a3e5abb3fae636468c71d858d59f
... Jupyter is running at http://localhost:8890/?token=d0efcfc97b9d81533db9a3e5abb3fae636468c71d858d59f
Notebook dependencies installation is in progress
Notebook dependencies installation is complete
    > "/opt/homebrew/anaconda3/envs/py39/bin/python3" "/private/var/folders/jc/5gc6fmf52cv6r88m6snhvvlr0000gn/T/AppTranslocation/1BFD8CCC-AB18-46BB-9D53-0D3E9F2B9651/d/Azure Data Studio.app/Contents/Resources/app/extensions/notebook/resources/pythonScripts/startNotebook.py" stop 8890
    stdout: Shutting down server on 8890...
    stdout: 

corivera commented 2 years ago

Were you able to run cells successfully after that install completed?

aaelony-aeg commented 2 years ago

Run cells under which kernel? It does not offer me the option to choose the kernel I want (i.e. the conda kernel I named py39).

I can run cells under the Python 3 kernel if they make sense to what Python 3 knows about, but it has no knowledge of libraries exclusive to the conda env named "py39" that would be available if it was using my conda env "py39". For example, the conda kernel I have has an installation of the bambi library whereas the Python 3 kernel in the list does not.
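
One way to confirm which environment a kernel is actually running in is to execute a cell like the sketch below. It uses only the standard library; `bambi` is the package mentioned above, and the helper name `describe_kernel` is made up for illustration.

```python
import importlib.util
import sys


def describe_kernel(package="bambi"):
    """Report the interpreter behind this kernel and whether `package` is importable."""
    return {
        "executable": sys.executable,  # path of the running Python binary
        "prefix": sys.prefix,          # environment root, e.g. .../envs/py39 for a conda env
        "package_found": importlib.util.find_spec(package) is not None,
    }


if __name__ == "__main__":
    for key, value in describe_kernel().items():
        print(f"{key}: {value}")
```

If `prefix` does not point at the expected env folder, the notebook is not using that environment, regardless of what the configuration dialog reported.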

corivera commented 2 years ago

I was wondering whether the default Python kernel would run, to rule out issues with the install itself. The error message mentioned installing packages via pip, though. If you scroll further up the notebook log, is it installing the required packages through pip instead of conda? That part sounds like a bug.

corivera commented 2 years ago

Our notebook setup currently assumes the install path is the root anaconda folder, so if it's not finding the conda executable in the py39 env folder it might be falling back to pip by mistake.

aaelony-aeg commented 2 years ago

Yeah, it looks like it is calling pip:

Notebook dependencies installation is in progress
    > "/opt/homebrew/anaconda3/envs/py39/bin/python3" -m pip install --user "jupyter>=1.0.0"
    stdout: Collecting jupyter>=1.0.0
    stdout: 
    stdout:   Downloading jupyter-1.0.0-py2.py3-none-any.whl (2.7 kB)

Agreed, it should be using conda or mamba, although there may be certain packages where a conda install is not possible and pip has to be used (that is not the case here).

aaelony-aeg commented 2 years ago

It is also weird in that I am not sure there actually is an executable named conda.

% which conda
conda () {
        \local cmd="${1-__missing__}"
        case "$cmd" in
                (activate | deactivate) __conda_activate "$@" ;;
                (install | update | upgrade | remove | uninstall) __conda_exe "$@" || \return
                        __conda_reactivate ;;
                (*) __conda_exe "$@" ;;
        esac
}

corivera commented 2 years ago

It should be in the bin folder of the root Anaconda directory on non-Windows machines.

aaelony-aeg commented 2 years ago

I do see /opt/homebrew/anaconda3/condabin/conda*.
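
To check both locations mentioned here, a small sketch like the following could look for a conda launcher under an Anaconda root and tell whether a folder looks like a conda environment. The helper names are hypothetical, the `conda-meta` check is an assumption based on conda's usual environment layout, and none of this reflects how Azure Data Studio itself detects conda.

```python
from pathlib import Path


def find_conda_launcher(anaconda_root):
    """Return the first conda launcher found under the given root, or None."""
    root = Path(anaconda_root)
    # Check bin/ first, then condabin/, the two locations named in this thread.
    for candidate in (root / "bin" / "conda", root / "condabin" / "conda"):
        if candidate.is_file():
            return candidate
    return None


def looks_like_conda_env(prefix):
    """Conda environments normally contain a `conda-meta` directory."""
    return (Path(prefix) / "conda-meta").is_dir()
```

With the paths from this thread, `find_conda_launcher("/opt/homebrew/anaconda3")` would be the call to try, and `looks_like_conda_env("/opt/homebrew/anaconda3/envs/py39")` would check the env folder.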

corivera commented 2 years ago

Which version of Anaconda is this?

aaelony-aeg commented 2 years ago
aaelony-aeg commented 2 years ago

% conda --version
conda 4.13.0

corivera commented 2 years ago

Looks like we always install packages during setup with pip because we had issues with missing packages on conda before. Our whole Anaconda scenario all up could probably use an overhaul. If you activate the conda environment manually before launching ADS, and then select the Anaconda root folder as your Configure Python path, are you able to see your expected packages?

aaelony-aeg commented 2 years ago

> Our whole Anaconda scenario all up could probably use an overhaul.

Agreed. I also agree it may be nuanced, since the macOS, Homebrew, and conda versions may each differ slightly.

> If you activate the conda environment manually before launching ADS, and then select the Anaconda root folder as your Configure Python path, are you able to see your expected packages?

I'm not sure that has any effect, or I am not doing that properly. I've tried conda activate py39 && open /Applications/azure/Azure\ Data\ Studio.app which does not change anything.

The kernel dropdown still has no knowledge of the conda env names that exist on the machine.

corivera commented 2 years ago

We'll take a look at improving these scenarios then.

aaelony-aeg commented 2 years ago

Thank-you.

If it is useful, I think RStudio does a nice job of enabling this via their UI.

aaelony-aeg commented 2 years ago

Hello - after tinkering with this, I think I got it to work. In my case, the issue appears to have been that Azure Data Studio was run not from /Applications but from a subdirectory, /Applications/azure. Simply copying Azure Data Studio.app to /Applications and restarting finally got the kernel dropdown to show the kernels listed by jupyter kernelspec list.
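
For anyone checking the same thing, the sketch below lists kernel specs by scanning kernel directories for `kernel.json` files, which is roughly the information `jupyter kernelspec list` reports. It is an approximation, not the Jupyter implementation: `list_kernelspecs` is a hypothetical helper, and only the directories passed in are searched.

```python
import json
from pathlib import Path


def list_kernelspecs(kernel_dirs):
    """Map kernel name -> (display name, launch argv) for each kernel.json found."""
    found = {}
    for base in map(Path, kernel_dirs):
        if not base.is_dir():
            continue  # skip locations that do not exist on this machine
        for spec_file in sorted(base.glob("*/kernel.json")):
            spec = json.loads(spec_file.read_text(encoding="utf-8"))
            name = spec_file.parent.name
            # Earlier directories take precedence, so keep the first entry seen.
            found.setdefault(name, (spec.get("display_name", name), spec.get("argv", [])))
    return found
```

On macOS, calling `list_kernelspecs([Path.home() / "Library" / "Jupyter" / "kernels"])` scans the user-level kernels folder that appears in the log earlier in this thread. The first element of each `argv` shows which Python binary a kernel would launch.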

Thank you for looking into this issue. Hopefully this thread will be useful to others with the same or a similar problem.