microsoft / Spark-Hive-Tools

This is for issue/feedback tracking on Spark & Hive Tools
Creative Commons Attribution 4.0 International

Plugin shows "Jupyter not installed" even though "Requirement already satisfied" is displayed for every package #21

Closed HansjoergW closed 3 years ago

HansjoergW commented 3 years ago

Hi

I manage my Python environments with conda, so I created a new conda environment (Python 3.7) to test "Spark-Hive-Tools". I installed pyspark 2.4.1 with conda install, and I installed virtualenv with pip.

I activated this new conda environment in the Anaconda prompt and started Visual Studio Code directly from that shell.

I installed the Spark & Hive Tools extension, and that seemed to work. I checked the output of the installation and made sure there were no errors during package installation. At the end, however, it displayed "Jupyter not installed", even though the output clearly shows that Jupyter is installed.

I created a file with the example code from https://docs.microsoft.com/en-us/azure/synapse-analytics/spark/vscode-tool-synapse.

I opened the context menu on the file (right click) and selected "Synapse: PySpark Interactive". In the output I see the following:


[2021-2-24:8:40:54] [Info] Checking installation:
[2021-2-24:8:40:56] [Info] Jupyter not installed
[2021-2-24:8:40:56] [Info] Installing PySpark interactive virtual environment ...
[2021-2-24:8:40:56] [Info] Exec python, with args: -m,virtualenv,C:\Users\U118187\.msvscode.hdinsight\hdinsightJupyter
..[2021-2-24:8:40:58] [Info] created virtual environment CPython3.7.9.final.0-64 in 1389ms
....

then many lines containing

[2021-2-24:8:41:1] [Info] Requirement already satisfied: ...

and no errors.

Then there is a warning:

[2021-2-24:8:41:7] [Info] Exec C:\Users\HJUser\.msvscode.hdinsight\hdinsightJupyter\Scripts\jupyter.exe, with args: kernelspec,install,--prefix=C:\Users\HJUser\.msvscode.hdinsight\hdinsightJupyter,c:\users\HJUser\.msvscode.hdinsight\hdinsightjupyter\lib\site-packages\sparkmagic\kernels\pysparkkernel
.[2021-2-24:8:41:8] [Info] [InstallKernelSpec] WARNING | Installing to C:\Users\HJUser\.msvscode.hdinsight\hdinsightJupyter\share\jupyter\kernels, which is not in ['C:\\Users\\HJUser\\.msvscode.hdinsight\\hdinsightJupyter\\jupyter_paths\\data\\kernels', 'C:\\ieu\\anaconda3\\envs\\remotespark\\share\\jupyter\\kernels', 'C:\\ProgramData\\jupyter\\kernels', 'C:\\Users\\HJUser\\.ipython\\kernels']. The kernelspec may not be found.
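
The warning can be read as follows. This is a minimal sketch of the check it reflects (an assumption about `jupyter_client`'s behaviour, not its actual source): the installer warns when the destination directory is not one of the directories on Jupyter's kernel search path.

```python
# Sketch of the check behind the "may not be found" warning (assumed logic,
# not the actual jupyter_client source): warn when the kernelspec install
# destination is absent from Jupyter's kernel search path.
def kernel_may_not_be_found(install_dir, search_paths):
    """True when the destination directory is not on the search path."""
    norm = lambda p: p.replace("\\", "/").lower().rstrip("/")
    return norm(install_dir) not in {norm(p) for p in search_paths}

# Search paths and destination quoted from the log output above.
search = [
    r"C:\Users\HJUser\.msvscode.hdinsight\hdinsightJupyter\jupyter_paths\data\kernels",
    r"C:\ieu\anaconda3\envs\remotespark\share\jupyter\kernels",
    r"C:\ProgramData\jupyter\kernels",
    r"C:\Users\HJUser\.ipython\kernels",
]
dest = r"C:\Users\HJUser\.msvscode.hdinsight\hdinsightJupyter\share\jupyter\kernels"
print(kernel_may_not_be_found(dest, search))  # True: the destination is off the search path
```

In other words, the kernelspec lands in `...\share\jupyter\kernels`, but Jupyter will look for it elsewhere, which matches the "Jupyter not installed" result of the final check.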

and finally:

[2021-2-24:8:41:10] [Info] Checking installation:
[2021-2-24:8:41:13] [Info] Jupyter not installed

If I choose "Spark: PySpark Interactive", it shows almost the same output, with the following differences.

It starts with the following (note that it shows "Jupyter already installed" and then "Jupyter not installed"):

[2021-2-24:9:27:44] [Info] Check Jupyter installation:
..[2021-2-24:9:27:46] [Info] Jupyter already installed
..[2021-2-24:9:27:47] [Info] Sparkmagic already installed
.[2021-2-24:9:27:48] [Info] Jupyter not installed
[2021-2-24:9:27:48] [Info] Installing PySpark interactive virtual environment ...

and ends with:

[2021-2-24:9:28:1] [Info] Check Jupyter installation:
.[2021-2-24:9:28:2] [Info] Jupyter already installed
.[2021-2-24:9:28:3] [Info] Sparkmagic already installed
.[2021-2-24:9:28:3] [Info] Jupyter not installed
.................................................... <continues to print dots>

The status bar shows "Validating", and it seems that this never ends.


VS Code Info:
Version: 1.53.2 (user setup)
Commit: 622cb03f7e070a9670c94bae1a45d78d7181fbd4
Date: 2021-02-11T11:48:04.245Z
Electron: 11.2.1
Chrome: 87.0.4280.141
Node.js: 12.18.3
V8: 8.7.220.31-electron.0
OS: Windows_NT x64 10.0.18363

Spark & Hive Tools Version 1.1.13

Do you have any suggestions as to what I could do? Or is the only option to install plain Python and not use a conda environment?

Thanks,
Hansjörg

Poytr1 commented 3 years ago

Hi @HansjoergW ,

I've noticed that the Jupyter kernels somehow got installed under the path C:\ieu\anaconda3\envs\remotespark\share\jupyter\kernels\, while ideally they should be installed at C:\Users\U118187\.msvscode.hdinsight\hdinsightJupyter\share\jupyter\kernels\.
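
That mismatch can be checked with a short script. This is purely illustrative (the helper name and the logic are not part of the extension): it flags kernel folders that match the extension's kernel names but live outside the expected prefix.

```python
from pathlib import PureWindowsPath

# Illustrative helper (not part of the extension): report kernelspec
# directories whose name matches the extension's kernels but whose
# location is outside the expected install prefix.
KERNEL_NAMES = {"synapse_pyspark", "pysparkkernel"}

def stray_kernel_dirs(candidates, expected_prefix):
    """Return candidate kernel directories that lie outside expected_prefix."""
    prefix = str(PureWindowsPath(expected_prefix)).lower()
    stray = []
    for d in candidates:
        p = PureWindowsPath(d)
        if p.name.lower() in KERNEL_NAMES and not str(p).lower().startswith(prefix):
            stray.append(d)
    return stray

# Example with the two locations mentioned above.
found = stray_kernel_dirs(
    [r"C:\ieu\anaconda3\envs\remotespark\share\jupyter\kernels\pysparkkernel",
     r"C:\Users\U118187\.msvscode.hdinsight\hdinsightJupyter\share\jupyter\kernels\pysparkkernel"],
    r"C:\Users\U118187\.msvscode.hdinsight\hdinsightJupyter",
)
print(found)  # only the anaconda copy is flagged as stray
```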

Could you please check both paths to see whether there are synapse_pyspark and pysparkkernel folders under them? If there are kernel folders under the first directory, try deleting them, then launch the extension again. If this doesn't work, you can try commenting out the three lines under both functions _kernelCheckSynapse and _kernelCheck as a workaround:

(screenshot of the lines to comment out)

Please let me know if this helps. Thanks!

HansjoergW commented 3 years ago

Hi @Poytr1

Removing the kernels from C:\ieu\anaconda3\envs\remotespark\share\jupyter\kernels\ did solve the problem. The plugin now seems to work as expected.

Thank you for your help!