Closed IwonaZwierzynska closed 3 months ago
Hi @IwonaZwierzynska, thank you for providing the extra details here! Seems like this is actually a duplicate of https://github.com/microsoft/vscode-data-wrangler/issues/255.
For some more context, we don't currently support loading PySpark variables, but the Jupyter launch button shows it as something that can be launched because the type name also happens to be "DataFrame". We plan both to make the error message clearer and to investigate the feasibility of PySpark support here.
For now, my recommendation is to convert the Spark DataFrame to Pandas using `pdf1 = df1.toPandas()` and view it that way. Hope this helps!
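For anyone landing here, a minimal sketch of the workaround above. It is not how Data Wrangler itself detects types; it only illustrates why a check on the type name alone cannot tell pandas and PySpark apart (both classes are named `DataFrame`), and how the conversion could be wrapped. The helper names `dataframe_kind` and `to_viewable` and the `max_rows` cap are assumptions made for illustration; `limit()` and `toPandas()` are the existing PySpark DataFrame methods.

```python
def dataframe_kind(obj):
    """Classify an object as 'pandas', 'pyspark', or 'other'.

    Both libraries name their class "DataFrame", so the defining module
    has to be inspected as well as the class name.
    """
    cls = type(obj)
    if cls.__name__ != "DataFrame":
        return "other"
    module = cls.__module__ or ""
    if module.startswith("pandas."):
        return "pandas"
    if module.startswith("pyspark."):
        return "pyspark"
    return "other"


def to_viewable(obj, max_rows=10_000):
    """Return something a pandas-based viewer can open.

    PySpark DataFrames are converted with toPandas(), which collects the
    rows onto the driver. The max_rows cap is an assumption added here to
    avoid pulling a very large dataset into local memory.
    """
    if dataframe_kind(obj) == "pyspark":
        return obj.limit(max_rows).toPandas()
    return obj


# Usage in a notebook cell or debug console, where df1 is the Spark
# DataFrame from this issue:
#     pdf1 = to_viewable(df1)
# and then open pdf1 in the viewer instead of df1.
```

Note that the converted frame is a snapshot: later changes to the Spark DataFrame are not reflected in `pdf1`.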
Hi @pwang347,
I get the exact same error message as @IwonaZwierzynska. In my case, it does not seem to be caused by PySpark not being supported, since my dataframe already is a pandas dataframe. I get the error message in the Interactive Mode.
Hi @NiKoenig, thank you for letting me know.
Could you please try to reproduce the issue again with the developer console open and check to see if there are any related error messages?
You can open the developer console as follows:
Thanks!
Hi @pwang347
I get 4 error messages in the Toggle Developer Console (the first two are very long, sorry):
Thank you!
Thank you very much for your response :-)!
@NiKoenig seems like you are running into the same issue here: https://github.com/microsoft/vscode-data-wrangler/issues/270 (also see https://github.com/microsoft/vscode-jupyter/issues/15969)
It seems like the Jupyter kernel API is somehow not allowing us to access the kernel. Do you recall accepting/rejecting a popup window asking if Data Wrangler should be allowed access to the kernel?
Hi @pwang347 this is exactly the problem I was having, thank you for pointing out this issue to me! :) I didn't get any popup window asking if Data Wrangler should be allowed access to the kernel. Is there a way to grant access now, e.g., in the settings? If not, I will just follow the discussion on the Jupyter side and hope that they find a solution there.
I just did this and it works for me: https://github.com/microsoft/vscode-data-wrangler/issues/270#issuecomment-2324498045
I have the same problem; however, none of the fixes I found worked. Everything is similar to this issue, and enabling the API as described in #270 did not work.
Here is the error log:
Hi @theice123, does this issue reproduce on a new Python file like the following?
```python
li = [1, 2, 3]
print(li)  # <- breakpoint here and launch `li` from debugger
```
If the above does not work, could you also check the following:
Thanks!
Environment data
Expected behaviour
The PySpark DataFrame is displayed as a table.
Actual behaviour
Error: Could not retrieve variable df1 from the Jupyter extension. Please file an issue on the Data Wrangler GitHub repository.
This occurs while debugging tests. The same error occurs when I try to open the DataFrame from a Jupyter Notebook (from the Variables section).
Steps to reproduce:
Try to open "View Value in Data Viewer" from the context menu on df1 (a PySpark DataFrame) while debugging tests with pytest.
Logs
Output for Jupyter in the Output panel (View → Output, change the drop-down in the upper-right of the Output panel to Jupyter):
```
XXX
```
Details
Always
PySpark DataFrame with strings and numbers
Only PySpark DataFrames fail to open; lists (as an example) work correctly.
No.
It does not matter whether it is in an Interactive Window, a Jupyter Notebook, or directly in VS Code; I am not able to display DataFrames.
Thank you very much for your help in advance.
Best regards, Iwona