pwwang / datar

A Grammar of Data Manipulation in python
https://pwwang.github.io/datar/
MIT License

`TibbleGrouped` object is not expandable in VSCode jupyter data viewer #79

Open rleyvasal opened 2 years ago

rleyvasal commented 2 years ago

When I create grouped data with datar's group_by(), I get a DataFrameGroupBy object instead of a DataFrame. This is undesirable in VSCode because the object cannot be clicked in the Variables window to view the entire dataframe, whereas mtcars can be clicked to show the full dataset because it is a DataFrame.

The code below creates grouped data with datar and with pandas. However, datar creates a DataFrameGroupBy instead of a DataFrame.

from datar.all import *
from datar.datasets import mtcars
datar_group = mtcars >> group_by(f.hp) >> count()
pandas_group = mtcars.groupby('hp').size().reset_index().rename(columns = {0:"n"})

datar_group

pwwang commented 2 years ago

Thanks for reporting. That's a problem with the count() function.

I'm working on v0.6 to address all performance issues, as well as the issue with the count function. After its release, could you come back to this issue and check whether the problem persists?

rleyvasal commented 2 years ago

Sure, thanks for all your work on this project!

pwwang commented 2 years ago

Here is the behavior with 0.6.0:

image

image

count() now maintains the group structure, which is the desired behavior. See how it behaves in R:

image

Feel free to try it out and let me know if there are any issues.

rleyvasal commented 2 years ago

Hi @pwwang,

Maybe I did not clearly describe the functionality I was expecting.

I was expecting to get a DataFrame after grouping data with group_by, so that I could click the icon to the left of the variable name in the VSCode Variables window (a square with an arrow pointing up) and open the dataframe in Data Viewer to explore the data (second picture).

This is related to how Data Viewer handles dataframes, not to the grouping functionality itself. I'm not sure whether VSCode would add TibbleGrouped support to Data Viewer so that the data can be viewed. Feel free to close this if you'd like.

Note: the Variables window shows pandas_group as a DataFrame of size (22, 2), but shows datar_group as a TibbleGrouped of size 22 (picture below).

[Image: datar_group_tibble]

[Image: pandas_dat_group_on_Data_Viewer]

pwwang commented 2 years ago

TibbleGrouped is a subclass of DataFrame, so you should be able to view it as a data frame in the viewer. That's why you have the data shown in your figure 2. Isn't that what you expected?
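To illustrate the subclass relationship mentioned above, here is a minimal sketch. `GroupedFrame` is a hypothetical stand-in written for this example, not datar's actual `TibbleGrouped` class; it only shows that an instance of any pandas `DataFrame` subclass passes `isinstance` and keeps the usual DataFrame interface.

```python
import pandas as pd


class GroupedFrame(pd.DataFrame):
    """Hypothetical stand-in for a DataFrame subclass such as TibbleGrouped."""

    @property
    def _constructor(self):
        # Tell pandas to keep the subclass when it derives new frames from this one.
        return GroupedFrame


# Made-up counts, shaped like the hp/n output discussed in this thread.
counts = GroupedFrame({"hp": [52, 62, 65], "n": [1, 1, 1]})

# An instance of the subclass is still a pandas DataFrame.
is_dataframe = isinstance(counts, pd.DataFrame)

# Regular DataFrame attributes and methods remain available.
shape = counts.shape
first_rows = counts.head(2)
```

Whether a given tool treats such an object as a dataframe depends on how that tool inspects the type, which this sketch does not cover.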

rleyvasal commented 2 years ago

Figure 2 shows the dataset grouped with pandas groupby(), after clicking the icon (square with arrow pointing up) to the left of the pandas_group variable name in the Variables window.

datar_group, grouped with datar's group_by(), does not have that icon next to it, so it cannot be clicked and opened in the Data Viewer window.

[Image: pandas_dat_group_on_Data_Viewer]

pwwang commented 2 years ago

I see. I'll investigate further.

pwwang commented 2 years ago

I believe it's a problem with the data viewer. It does not recognize objects of a DataFrame subclass:

image
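The following sketch shows one way a tool could fail to recognize a DataFrame subclass, plus a possible workaround. How the VSCode Data Viewer actually decides what is viewable is not stated in this thread, so `viewable_by_exact_name` is purely a hypothetical check, and `GroupedFrame` is again a stand-in rather than datar's `TibbleGrouped`. The workaround (copying into a plain `pd.DataFrame`) is shown on the stand-in only and has not been verified against datar objects.

```python
import pandas as pd


class GroupedFrame(pd.DataFrame):
    """Hypothetical stand-in for a DataFrame subclass such as TibbleGrouped."""


def viewable_by_exact_name(obj) -> bool:
    # Hypothetical strict check: only the literal type name "DataFrame" qualifies.
    return type(obj).__name__ == "DataFrame"


def viewable_by_isinstance(obj) -> bool:
    # Subclass-aware check: any DataFrame subclass qualifies.
    return isinstance(obj, pd.DataFrame)


grouped = GroupedFrame({"hp": [52, 62], "n": [1, 1]})

strict_result = viewable_by_exact_name(grouped)    # the subclass name is not "DataFrame"
lenient_result = viewable_by_isinstance(grouped)   # the subclass is still a DataFrame

# Possible workaround: build a plain pandas DataFrame from the subclass instance.
plain = pd.DataFrame(grouped)
plain_is_exact = type(plain) is pd.DataFrame
```

If the viewer does rely on an exact type match, a plain copy like `plain` would be the variable to open in Data Viewer, at the cost of dropping any subclass-specific state such as grouping information.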

pwwang commented 2 years ago

I've submitted an issue:

https://github.com/microsoft/vscode-jupyter/issues/9264

GitHunter0 commented 3 weeks ago

An issue is submitted:

microsoft/vscode-jupyter#9264

They deemed the issue solved. Is that right?

pwwang commented 3 weeks ago

I don't think so. It's locked by the bot due to inactivity.

GitHunter0 commented 3 weeks ago

@pwwang, I just tested and the problem does persist, so I opened a new issue: https://github.com/microsoft/vscode-jupyter/issues/15750