microsoft / vscode-jupyter

VS Code Jupyter extension
https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter
MIT License

DS: Provide rich debugger support for pandas DataFrames #1286

Closed apryor6 closed 3 years ago

apryor6 commented 5 years ago

I build lots of Flask APIs that internally use pandas DataFrames, and almost daily I find myself jumping into the debugger; however, there is not a particularly good way to inspect the contents of a DataFrame. Ideally there would be a way to, perhaps, double-click or hover on a DataFrame variable and have a popup that shows the first few rows spreadsheet-style.

The Spyder IDE has something like what I describe

Currently, I find I am limited to either:

I've felt for some time that this would be a great feature, but honestly I assumed there would be a large number of other data scientists sharing the same request and that it would magically appear. This is me conceding that point and doing what I should have done already: making the request.

Edit: I see a teaser for such functionality from some time ago here, but I don't see this inside of the app. Am I missing something, or is there an update?

mgsnuno commented 5 years ago

I share the same need.

While hovering over a pandas DataFrame in debug, it would be nice to render the html view of the dataframe head, with horizontal/vertical scrolls.

IanMatthewHuff commented 5 years ago

@apryor6. In the interactive window experience we have a data viewer for pandas dataframes. Is this the type of viewing experience that you would like with a standard VSCode debugging experience?

https://devblogs.microsoft.com/python/python-in-visual-studio-code-april-2019-release/

mgsnuno commented 5 years ago

just giving my 2 cents: vscode variables explorer viewer or IPython.display.display are both good options. whatever makes the user experience more consistent.

apryor6 commented 5 years ago

@IanMatthewHuff yes, precisely like this. The ideal user experience would be that, when paused at a breakpoint, an icon appears next to the variable name if it is a DataFrame and opens this viewer on click, or that such a window appears on hover.

IanMatthewHuff commented 5 years ago

Got it. Thanks for the feedback @apryor6 and @mgsnuno .

kogakenji commented 5 years ago

I have the same need. Is there any workaround for this? What are you guys using when debugging a dataframe inside vscode?

mgsnuno commented 5 years ago

while debugging, I type in the Debug Console display(dataframe)

display comes from IPython.display.display and is loaded by default in a Jupyter environment.

kogakenji commented 5 years ago

Ok. Thanks Nuno! I see. I am not using jupyter environment. I am using python debug inside vscode. Even after importing IPython, I get None when using IPython.display.display(DataFrame).
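For a session without a Jupyter environment, a plain-text preview that does not depend on IPython can be typed into the Debug Console instead. This is only a minimal sketch: the helper name `preview` and the sample frame are hypothetical, and it relies solely on the standard pandas methods `DataFrame.head` and `DataFrame.to_string`.

```python
import pandas as pd


def preview(df: pd.DataFrame, rows: int = 5) -> str:
    """Return the first `rows` rows of df as an aligned plain-text table."""
    # to_string() renders the frame without the repr's width-based truncation.
    return df.head(rows).to_string()


# Hypothetical sample data, standing in for whatever frame is in scope.
sample = pd.DataFrame({"a": range(20), "b": range(20, 40)})
print(preview(sample))
```

In the Debug Console, wrapping the call in `print(...)` is what matters here, since it emits real line breaks rather than a quoted string.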

DonJayamanne commented 5 years ago

Playing with a few ideas...

dsExplorer

ejohb commented 5 years ago

I think this is the only thing still keeping me on PyCharm :)

szc11121 commented 4 years ago

Playing with a few ideas...

dsExplorer

Could you please tell me how I can find the DATA SCIENCE VARIABLES in my debug side bar?

rchiodo commented 4 years ago

@szc11121 the ideas that @DonJayamanne was playing with have not shipped. That feature is not supported yet.

pyropenguin commented 4 years ago

Chiming in to voice support for this feature. I like the suggestion from @szc11121, although just having the "Show Variable in Data Viewer" icon directly in the VARIABLES sidebar on the right of each variable it applies to (and double-click the line to open data viewer) might be cleaner. Alternatively, if there were a separate list of Data Science Variables, perhaps make it look and function the same as the Python Interactive Variables table.

boazdori commented 4 years ago

I would like to add my support for this feature. I work a lot with pandas DataFrames, and having tables and variable descriptions helps a lot in the debugging process. This format is available in Spyder, which took it from the MATLAB way of thinking, and PyCharm includes a data science working setup that offers this as well. VS Code should have the same setup; this is a must!

wangluochao902 commented 4 years ago

There is a workaround: debug a cell in a Jupyter notebook. While debugging, the DataFrame is displayed in the notebook. The limitation is that you need to run your code in the Jupyter notebook.

vscode-debugger-dataframe

el-analista commented 4 years ago

Hi, +1 for this as right now the debugger is printing end of line as \n instead of an end of line. Why doesn't it behave like the terminal?

Dr-Irv commented 4 years ago

while debugging, I type in the Debug Console display(dataframe)

display comes from IPython.display.display and is loaded by default in a Jupyter environment.

No longer works with python extension version 2020.1.57204 . See microsoft/ptvsd#2036

joaohsr commented 4 years ago

Still waiting for a DataFrame Viewer in debug mode :(

christina-zhou-96 commented 4 years ago

I'm following this to learn when I can switch off PyCharm...

voochuk commented 4 years ago

Would be great if you supported a debug visualizer extension point like Visual Studio's: https://docs.microsoft.com/en-us/visualstudio/debugger/create-custom-visualizers-of-data?view=vs-2019 Some data science structures, such as large xarray arrays and tensors, that require introspection using filters, sorting, etc. could then be handled as well.

fpnick commented 4 years ago

Same need here :)

pearlus commented 4 years ago

+1, the lack of this feature is forcing me to use PyCharm.

Adding picture from pycharm sciview: image

maksudmck commented 4 years ago

I really really need this feature!

memeplex commented 4 years ago

Having a rich view is one great thing, but NOT having a raw string with embedded newlines and everything is one basic thing. Could you at least show the standard DataFrame representation, which is OK as far as it goes?

aleemkhusro commented 4 years ago

please add this. This would make my life so simple. The only thing that VS Code doesn't have, and PyCharm does have. Do this, and I'll be in remote debugging heaven. My boss will give me a raise, and I will cure cancer.

granthussey commented 4 years ago

This is a feature VSCode desperately needs

shireenrao commented 4 years ago

I updated vscode to the latest version, and now see that dataframes are printing nicely. Thank you for this. This is how it looks: Annotation 2020-03-10 133746

taha-yassine commented 4 years ago

I think extending this feature to other data formats (e.g. numpy arrays), like in the already supported data viewer, would be great.

rishabhrishu commented 4 years ago

Were you able to view all the rows? I get internals on clicking expand icon.

I updated vscode to the latest version, and now see that dataframes are printing nicely. Thank you for this. This is how it looks: Annotation 2020-03-10 133746

shireenrao commented 4 years ago

@rishabhrishu The behavior is consistent with Jupyter notebooks, where the number of rows and columns displayed follows the pandas defaults. You can run the following in the Debug Console to show all rows and columns:

pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)

It's usually not needed as you can filter your dataframe to get to what you want to see.
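The two `set_option` calls above change the display settings for the rest of the session. A scoped variant is sketched below using `pd.option_context`, which restores the previous settings on exit; the helper name `full_repr` and the sample frame are hypothetical.

```python
import pandas as pd


def full_repr(df: pd.DataFrame) -> str:
    """Return the untruncated repr of df, leaving global options unchanged."""
    # option_context applies the settings only inside the `with` block.
    with pd.option_context(
        "display.max_rows", None,
        "display.max_columns", None,
    ):
        return repr(df)


# Hypothetical sample data with more rows than the default row limit.
sample = pd.DataFrame({"a": range(100), "b": range(100)})
print(full_repr(sample))
```

Very wide frames may still wrap across several blocks of columns, since the console width option is left untouched in this sketch.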

jbsilva commented 4 years ago

+1 This is a must have feature

sverma333 commented 4 years ago

I have been waiting almost 6 months for this to ship for debug mode, Totally essential. @szc11121, your ideas above look great! Any ETA?

AAraKKe commented 4 years ago

I know one more might not make a big difference, but I would really appreciate this as well. I am trying to switch to VS Code for several projects where I would like people to use it, and this (together with not autofilling the self parameter in a class method) is one of the things I really miss when not using PyCharm.

vivekvs1 commented 4 years ago

I am upvoting this as well. Like the rest, I have been keeping tabs on VS Code to see when this feature would be available, so that I can switch. That is literally the only thing keeping me from switching from PyCharm. Thanks. VS Code is taking beautiful shape in such a short span thanks to you guys!

gradientAscent commented 4 years ago

+1 for this. Would be very helpful to have a simple rendering of a dataframe available during debugging. Thanks for all of the great work on VS code!

dioptx commented 4 years ago

+1 for this. This is the only feature that keeps me going back to PyCharm.

rivasd commented 4 years ago

+1 for this feature. It seems to me that it used to be better before, as it would at least print out the string repr of the DataFrame; now all I get is some kind of abridged version with only 2 columns, even though many more would fit.

nkkollaw commented 4 years ago

I think we got it, guys. We should stop posting the same "+1 for this" comments over and over again.

Just add a thumbs up. When people post a comment, everyone gets a notification (including for mine, LOL, sure)!

Mpedrosab commented 4 years ago

Just as an idea, Spyder IDE has a nice DataFrame and Numpy viewer. Is it easy to replicate it in Visual Studio somehow? Sorry for my scarce support, but I am not an IDE developer...

Pineleaf commented 4 years ago

Is this feature actually coming? It is quite possibly the most annoying thing about the dev experience of VS Code with Python. Yes, you can switch to the Debug Console and set the display max options, but it's not really a nice dev experience if you have to do this per DataFrame, and you also get column overflow and scrolling issues. A workflow with the DataFrame in a viewer, possibly with a save-to-CSV option, would be good.

joaohsr commented 4 years ago

Hello @luabud! Could this feature be developed by the Python extension team or maybe by the VS Code team? I think this improvement could quickly bring a large number of data scientists and data engineers to VS Code.

Thanks in advance, Joao Henrique

sverma333 commented 4 years ago

Come on guys, you are making it seem silly to recommend VS Code without this

nkkollaw commented 4 years ago

Come on guys, you are making seem silly to recommend VS Code without this

Spyder is many times better for data science, because this feature is missing.

agrimaldi74 commented 4 years ago

As soon as this feature is available, I'll switch to vscode from PyCharm...

andrew-fcx commented 4 years ago

Using df.to_csv() in the Debug Console has been my workaround while I'm waiting for this feature to finally come to VS Code.
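One way to flesh out that workaround is sketched below: the frame is written to a CSV file in the system temp directory, which can then be opened in an editor tab or a spreadsheet. The helper name `dump_for_inspection`, the file name, and the sample frame are all hypothetical; only `DataFrame.to_csv` comes from the comment above.

```python
import tempfile
from pathlib import Path

import pandas as pd


def dump_for_inspection(df: pd.DataFrame, name: str = "debug_frame") -> Path:
    """Write df to <tempdir>/<name>.csv and return the path to the file."""
    path = Path(tempfile.gettempdir()) / f"{name}.csv"
    # Keep the index so row labels survive the round trip.
    df.to_csv(path, index=True)
    return path


# Hypothetical sample data, standing in for the frame being debugged.
sample = pd.DataFrame({"x": [1, 2, 3], "y": [0.5, 1.5, 2.5]})
csv_path = dump_for_inspection(sample)
print(csv_path)
```

Calling `dump_for_inspection(df)` at a breakpoint prints a path that can be opened directly. Each call with the same name overwrites the previous file.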

aleemkhusro commented 4 years ago

It's not really a bad idea to keep posting a +1: it shows that there is consistent interest in this essential feature for the data community. Having the ability to pause at a breakpoint and open a rich debug console where you can start coding and viewing the DataFrames is so important that its lack is really a deal breaker for me. But I love VS Code for everything else.

nkkollaw commented 4 years ago

Guys, how much would it cost to implement this functionality?

I would definitely be willing to chip in, just so I could use VSC 100% of the time instead of having to jump to Spyder for Python. I'm sure other people would do the same. We could create an entry in BountySource or something, or just do it unofficially.

I definitely have a lot less money than Microsoft @msftgits , but they don't seem interested in doing the same.

aleemkhusro commented 4 years ago

@nkkollaw I would donate money to MS if that's what it would take to implement this feature. It's been one year now since this feature was requested. I hope some people smarter than me implement this as an extension or something.

nkkollaw commented 4 years ago

@aleemkhusro, I don't think Microsoft needs more money, nor think they have to be the ones to implement it. It could be implemented as an addon to the Python plugin.

My thought was that we're all busy—and need money to survive. If we put together a decent amount, there could be somebody that could choose to take up this task instead of another random project, and it wouldn't be volunteering.

If everyone who added to this conversation put USD 100, we'd have a good amount to get started.

@greazer had this in DS: March 2020 Release (https://github.com/microsoft/vscode-python/milestone/61), so perhaps it's coming.

sverma333 commented 4 years ago

I have been looking out for this feature every month for almost a year. Are there alternatives via extensions? Or what is the ETA for this becoming part of VS Code? @DonJayamanne already made loads of progress on this last October, but it seems to have gone cold now.