holoviz / panel

Panel: The powerful data exploration & web app framework for Python
https://panel.holoviz.org
BSD 3-Clause "New" or "Revised" License

Polars: ValueError("Could not hash object of type function"). #7467

Closed MarcSkovMadsen closed 2 weeks ago

MarcSkovMadsen commented 2 weeks ago

I'm on panel==1.5.3 and trying to support multiple types of data sources for panel-graphic-walker, including Polars. See https://github.com/panel-extensions/panel-graphic-walker/pull/22.

When I try to combine polars and pn.cache I get a ValueError.

import polars as pl
import panel as pn

@pn.cache(max_items=20, ttl=60 * 5, policy="LRU")
def my_func(value):
    return pl.DataFrame({"a": [3, 2, 1]})

value = pl.DataFrame({"a": [1, 2, 3]})
my_func(value)

This produces the following traceback (truncated):

  File "/home/jovyan/repos/private/panel-graphic-walker/.venv/lib/python3.11/site-packages/panel/io/cache.py", line 235, in _generate_hash
    hash_value = _generate_hash_inner(obj)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jovyan/repos/private/panel-graphic-walker/.venv/lib/python3.11/site-packages/panel/io/cache.py", line 212, in _generate_hash_inner
    raise ValueError(
ValueError: User hash function <function _container_hash at 0x7f7ea007e520> failed for input (shape: (3, 1)

Please support caching Polars DataFrames the same way pandas DataFrames are supported.

Additional Context

It seems that Polars has some hashing support these days. See https://docs.pola.rs/api/python/stable/search.html?q=hash.

ChatGPT suggests hashing a DataFrame like this:

import polars as pl
import hashlib

# Sample Polars DataFrame
df = pl.DataFrame({
    "column1": [1, 2, 3],
    "column2": ["a", "b", "c"]
})

# Convert DataFrame to JSON format and encode it to bytes
df_bytes = df.write_json().encode('utf-8')

# Generate a hash key using SHA-256
hash_key = hashlib.sha256(df_bytes).hexdigest()

print(hash_key)

hoxbro commented 2 weeks ago

We should do something similar to what we do with pandas.

The ChatGPT suggestion seems likely to scale badly, since it requires both a JSON conversion of the whole DataFrame and a SHA-256 digest of the resulting string.