Kanaries / pygwalker

PyGWalker: Turn your pandas dataframe into an interactive UI for visual analysis
https://kanaries.net/pygwalker
Apache License 2.0

[BUG] PyGWalker with Quarto #595

Closed carecodeconnect closed 1 month ago

carecodeconnect commented 1 month ago

Describe the bug

PyGWalker does not render in a Quarto presentation.

To Reproduce

Steps to reproduce the behavior:

This code is within a Quarto .qmd file which I am running inside VS Code:

walker = pyg.walk(
    purchase_data,
    spec="./pygwalker-spec.json", 
    kernel_computation=True,
)

Expected behavior

The GraphicWalker GUI should render in the Quarto presentation when rendered as .html.

It renders fine inside the VS Code Interactive viewer.

But when the Quarto document is rendered as a presentation.html file, after running quarto preview /mnt/data/projects/car-sales/notebooks/presentation.qmd --no-browser --no-watch-inputs, I get these errors:

Watching files for changes
  /@jupyter-widgets/html-manager/dist/1551f4f60c37af51121f.woff2 (404: Not Found)
  /@jupyter-widgets/html-manager/dist/eeccf4f66002c6f2ba24.woff (404: Not Found)
  /@jupyter-widgets/html-manager/dist/be9ee23c0c6390141475.ttf (404: Not Found)
  /ipylab.js (404: Not Found)
  /@jupyter-widgets/1551f4f60c37af51121f.woff2 (404: Not Found)
  /@jupyter-widgets/eeccf4f66002c6f2ba24.woff (404: Not Found)
  /@jupyter-widgets/be9ee23c0c6390141475.ttf (404: Not Found)

Screenshots

[Screenshot from 2024-07-27 11-33-27]

Versions

Additional context

I have a feeling I'm missing a JupyterLab widget for PyGWalker. If so, how do I install it?

JupyterLab v4.2.4
/home/solaris/miniconda3/envs/car-sales/share/jupyter/labextensions
        jupyterlab-plotly v5.23.0 enabled  X
        ipylab v1.0.0 enabled OK (python, ipylab)
        jupyterlab_pygments v0.3.0 enabled OK (python, jupyterlab_pygments)
        jupyter-leaflet v0.19.2 enabled OK
        @jupyter-notebook/lab-extension v7.2.1 enabled OK
        @pyviz/jupyterlab_pyviz v3.0.2 enabled OK
        @bokeh/jupyter_bokeh v4.0.5 enabled OK (python, jupyter_bokeh)
        @jupyter-widgets/jupyterlab-manager v5.0.11 enabled OK (python, jupyterlab_widgets)

   The following extensions may be outdated or specify dependencies that are incompatible with the current version of jupyterlab:
        jupyterlab-plotly

   If you are a user, check if an update is available for these packages.
   If you are a developer, re-run with `--verbose` flag for more details.

longxiaofei commented 1 month ago

Hi @carecodeconnect , thanks for your feedback.

I will try to reproduce this bug in quarto and find out the cause.

carecodeconnect commented 1 month ago

Thank you! I thought it might also be due to creating a Quarto presentation rather than a document. But when I previewed the document, it hung on "Loading Graphic-Walker UI..." and the terminal returned:

Watching files for changes
  /ipylab.js (404: Not Found)

longxiaofei commented 1 month ago

Hi @carecodeconnect

This is because pygwalker can't use Jupyter's communication module in Quarto HTML. Currently, all data needs to be rendered to the frontend (with calculations done by JavaScript).

Here is a temporary workaround:

# Pygwalker in Quarto

```{python}
import pandas as pd
import pygwalker as pyg

df = pd.read_csv("xxx")

walker = pyg.walk(df, kernel_computation=False, env="JupyterConvert")
```

In the next version, pygwalker will try to detect the Quarto environment and automatically switch to JupyterConvert mode.

carecodeconnect commented 1 month ago

Thank you for your help! I tried this revised code to display a map of my pandas DataFrame of 2 million rows in both a Quarto .qmd document and a Jupyter notebook. I'm attempting to plot the lat and lng coordinates.

Unfortunately, with kernel_computation=False, execution is very slow: I waited several minutes and had to stop it. The previous code I posted plots the geospatial data fine in a Jupyter notebook, I guess because it is using duckdb in the backend. I've run into the same problem using leaflet or folium with Python or R to plot this map.

I wonder if the root cause is the 2 million row DataFrame? I tried datashader, hvplot, holoviews, and bokeh for the map. They work, but they are not nearly as nice as leaflet and are not interactive. I think the root cause might be how the points are plotted using leaflet/folium without duckdb (using a for loop to plot the points is too computationally demanding). Maybe there is a way to use duckdb and render the map using leaflet/folium in R or Python, to just display the interactive map (for now, without the PyGWalker UI).

It's frustrating because the map works fine with PyGWalker inside the Jupyter notebook, but my presentation/report is written in Quarto.
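One way to check whether the row count itself is the bottleneck is to aggregate the points into grid cells before handing them to the frontend, so the browser receives one row per cell instead of one row per point. The following is a minimal sketch under stated assumptions, not something pygwalker does itself: the function name `bin_points`, the `cell_deg` parameter, and the output column names are all hypothetical, and the grid is a plain latitude/longitude grid with no map projection.

```python
import math
from collections import Counter


def bin_points(lats, lngs, cell_deg=0.01):
    """Aggregate points into square grid cells of `cell_deg` degrees.

    `lats` and `lngs` can be any iterables of floats (lists, pandas
    Series, ...). Returns a list of dicts holding each occupied cell's
    centre and the number of points that fell into it.
    """
    counts = Counter()
    for lat, lng in zip(lats, lngs):
        # Integer cell index; floor keeps negative coordinates in the
        # correct cell (int() would truncate toward zero).
        cell = (math.floor(lat / cell_deg), math.floor(lng / cell_deg))
        counts[cell] += 1

    return [
        {
            "lat": (i + 0.5) * cell_deg,  # cell centre latitude
            "lng": (j + 0.5) * cell_deg,  # cell centre longitude
            "n_points": n,
        }
        for (i, j), n in counts.items()
    ]
```

With a DataFrame like the one described above, this could be called as `rows = bin_points(df["lat"], df["lng"])`, and `pd.DataFrame(rows)` could then be passed to `pyg.walk(..., kernel_computation=False, env="JupyterConvert")`. A larger `cell_deg` gives fewer rows at the cost of spatial detail.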

longxiaofei commented 1 month ago

pygwalker has added an experimental feature: the component API. I'm not sure whether it can solve your problem.

pygwalker version: pygwalker==0.4.9.4a2

example:

[image: 353770698-0b83b2aa-23da-4df3-87aa-75a54523a4a9]

If you need a UI that can be explored interactively (based on duckdb) in Quarto HTML, pygwalker cannot currently support this.

But I will consider using wasm duckdb to implement this feature in the future.