pola-rs / polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust
https://docs.pola.rs

compatible with pycharm #2598

Closed ylyang closed 6 months ago

ylyang commented 2 years ago

During interactive development in PyCharm, Spyder, or other IDEs, we usually need to view dataframes from time to time.

Polars DataFrames are not compatible with PyCharm; we have to call df.to_pandas() to convert a pl.DataFrame to a pandas.DataFrame before we can view the data.

Could Polars support PyCharm's scientific view, so that a pl.DataFrame or pl.Series can be viewed directly?


ritchie46 commented 2 years ago

Do you know what is required to do so? Don't they internally check whether the object is a pandas one?

If so, I think the issue should be opened on the IDE's side.

ghuls commented 2 years ago

There is an issue: https://youtrack.jetbrains.com/issue/PY-50861

alexander-beedie commented 1 year ago

Looks like PyCharm literally just checks that the classname is "DataFrame", then assumes it must be pandas, and issues pandas-specific calls thereafter to get the schema/cols/etc.

This is why a polars DataFrame gets recognised, but then cannot actually display (raises AttributeError: 'DataFrame' object has no attribute 'axes'). It also fails to recognise a real pandas frame if you subclass it ;)

import pandas as pd

class AlsoAPandasFrame(pd.DataFrame):
    """I am ALSO a pandas DataFrame"""

# this will now fail to display in PyCharm's scientific grid/view.
df = AlsoAPandasFrame({"x": [1, 2, 3]})

We could shim the necessary functions on our side, but the problem is doing so without actually exposing them (e.g. we don't really want axes or iloc attributes/methods). Ideally they would have a generic mechanism for adding support, because what they are doing is not complicated (I found the source code that handles the grid display and took a look; it's handled by dataframe_to_thrift_struct in their pydev helpers plugin/lib).
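To make the failure mode described above concrete, here is a minimal sketch of name-based detection next to an `isinstance`-based alternative. Everything in it is hypothetical: `fake_pandas` and `fake_polars` are stand-in namespaces so the snippet runs without either library installed, and the two `column_names_*` functions only illustrate the behaviour described in this comment. It is not PyCharm's actual code.

```python
class fake_pandas:
    """Hypothetical stand-in for the pandas module."""

    class DataFrame:
        def __init__(self, data):
            self._data = dict(data)

        @property
        def axes(self):
            # pandas-style [row labels, column labels]
            n_rows = len(next(iter(self._data.values()), []))
            return [list(range(n_rows)), list(self._data)]


class fake_polars:
    """Hypothetical stand-in for the polars module."""

    class DataFrame:
        def __init__(self, data):
            self._data = dict(data)

        @property
        def columns(self):
            return list(self._data)


class AlsoAPandasFrame(fake_pandas.DataFrame):
    """Subclass of the pandas stand-in, mirroring the example above."""


def column_names_by_class_name(obj):
    """Name-based detection (assumed behaviour, per the description above)."""
    if type(obj).__name__ != "DataFrame":
        return None  # subclasses with another name are not recognised
    return obj.axes[1]  # pandas-specific access; fails for other frames


def column_names_by_isinstance(obj):
    """Type-based detection with one branch per supported library."""
    if isinstance(obj, fake_pandas.DataFrame):
        return obj.axes[1]
    if isinstance(obj, fake_polars.DataFrame):
        return obj.columns
    return None


data = {"x": [1, 2, 3]}

# The name check accepts the pandas stand-in ...
print(column_names_by_class_name(fake_pandas.DataFrame(data)))
# ... rejects a genuine subclass because its class name differs ...
print(column_names_by_class_name(AlsoAPandasFrame(data)))
# ... and accepts the polars stand-in, then fails on the pandas-only attribute.
try:
    column_names_by_class_name(fake_polars.DataFrame(data))
except AttributeError as exc:
    print("AttributeError:", exc)

# The isinstance check handles all three cases.
print(column_names_by_isinstance(AlsoAPandasFrame(data)))
print(column_names_by_isinstance(fake_polars.DataFrame(data)))
```

The `isinstance` version still hard-codes one branch per library, so it only illustrates why the class-name check misfires in both directions; a generic registration mechanism on the IDE side would remove the need for those branches.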

ritchie46 commented 1 year ago

> Looks like PyCharm literally just checks that the classname is "DataFrame", then assumes it must be pandas, and issues pandas-specific calls thereafter to get the schema/cols/etc.

Wow. :smiling_face_with_tear:

ritchie46 commented 1 year ago

> (I found the source code that handles the grid display and took a look; it's handled by dataframe_to_thrift_struct in their pydev helpers plugin/lib).

Do they accept PRs? That would be great. :D

alexander-beedie commented 1 year ago

> Do they accept PRs? That would be great. :D

I'll try and find out - I may know a guy that knows the devs 🤣

sm-Fifteen commented 9 months ago

Polars support was one of their big advertised features in 2023.2 (they have definitely taken notice of Polars), and dataframes now display properly in their Jupyter frontend, but don't seem to work in the SciView dataframe viewer yet.

stinodego commented 6 months ago

This now works correctly.