Closed ronblum closed 1 month ago
I think I know what the issue is.
Pandas is discovered when ipykernel is first loaded, if we install pandas within the current session (eg with %pip install pandas) then even though you are able to use it, the internal ipykernel calls to pandas won't be able to use them as pd_ wasn't found when launching the kernel. A simple solution is to simply restart the kernel.
Here's what happens when ipykernel is launched the following is evaluated, and pd_
becomes None
as pandas is not installed. If you install pandas now with %pip install pandas
, you can run import pandas
and it will work, but pd_
won't be updated until you restart the kernel.
This behavior shouldn't be linux specific
I think we should move away from importing third party packages on startup the way we currently do, and instead have functions like get_pandas
that do the import each time, wrapped in a try-except.
It’ll mean that users don't have to restart after installing (most of the time; some packages still require a restart), and it’ll make kernel startup a little bit faster since we’ll only import third party packages when they’re needed.
Alternatively, @dfalbel mentioned on Slack:
I think this makes sense! I think it's also fine having to restart the kernel, but we should at least re-surface the error to the UI, with some tips that restarting might fix.
Do we feel 100% confident that we have identified the root cause here? I ask because in smoke tests, I am seeing occasional failures where the test double clicks a variables row and no data explorer opens. These tests pre-install all their dependencies.
I don't see any other reasons for that to happen. Can you try in the smoket ests if running %view df
returns an error?
Sure. I will try that. I also just put in retries in case double clicking on the variables row is the real problem.
It looks to me like click retries have solved the smoke test problem! Carry on :)
Should we open an issue for Wasim's suggestion to try-expect and/or surface the error per Daniel?
@jthomasmock I made #3800 but then closed it. I think we can use this issue to track the %pip install pandas
bug.
I've relabled this issue and added context to original comment.
I ended up approaching this by forwarding the error to the user instead of lazily loading pandas as it's a simpler change and also solves other possible issues during initialization that wouldn't be covered by simply loading pandas lazily. See https://github.com/posit-dev/positron/pull/4122 for more info.
Positron Version(s) : 2024.07.0-113
OS Version : OSX
Error message appears for user when trying open data frame post pandas install without restart:
Link(s) to TestRail test cases run or created: N/A
Positron Version:
Positron Version: 2024.06.1 build 27 Code - OSS Version: 1.90.0 Commit: a893e5b282612ccb2200102957ac38d3c14e5196 Date: 2024-06-26T01:22:29.024Z Electron: 29.4.0 Chromium: 122.0.6261.156 Node.js: 20.9.0 V8: 12.2.281.27-electron.0 OS: Linux x64 6.5.0-41-generic
Steps to reproduce the issue:
EDIT: 2024-07-18 by @jthomasmock
The
df
object should appear under the "Data" sectiondf
to view the data.What did you expect to happen?
A data explorer tab should open.
Were there any error messages in the output or Developer Tools console?
Not in the output.