abstractqqq / polars_ds_extension

Polars extension for general data science use cases
MIT License

Polars 0.20.6 breaks strings module, but requirements map to the latest Polars version #62

Closed. furlat closed this issue 8 months ago

furlat commented 8 months ago

When running the notebook, this call panics:

df2.select(
    pl.col("sen").str.to_lowercase().str2.tokenize(stem=True).explode().unique()
)
thread '<unnamed>' panicked at crates/polars-lazy/src/physical_plan/exotic.rs:39:65:
called `Result::unwrap()` on an `Err` value: ComputeError(ErrString("cannot call plugin\n\nThis Polars' version has a different 'binary/string' layout. Please compile with latest 'pyo3-polars'\n\nError originated just after this operation:\nDF [\"\"]; PROJECT */1 COLUMNS; SELECTION: \"None\""))
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

furlat commented 8 months ago

I fixed it by pinning the Polars version in pyproject.toml:

dependencies = [    
    "polars == 0.20.5",
] 
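
A pin like this can also be mirrored at import time, so that a mismatch produces a readable error instead of a Rust panic. The following is only a minimal sketch under stated assumptions: `BUILT_AGAINST`, `release_tuple`, `matches_build`, and `check_polars` are hypothetical names that are not part of polars_ds, and the exact-match rule simply mirrors the `==` pin above.

```python
from importlib import metadata

# Assumption: the Polars version the plugin wheel was compiled against.
BUILT_AGAINST = "0.20.5"


def release_tuple(version: str) -> tuple:
    """Turn '0.20.6' into (0, 20, 6), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def matches_build(installed: str, built_against: str = BUILT_AGAINST) -> bool:
    """True when the installed release equals the one the wheel targets."""
    return release_tuple(installed) == release_tuple(built_against)


def check_polars() -> None:
    """Raise ImportError with a clear message if Polars does not match."""
    try:
        installed = metadata.version("polars")
    except metadata.PackageNotFoundError:
        raise ImportError("polars is not installed") from None
    if not matches_build(installed):
        raise ImportError(
            f"plugin built against polars {BUILT_AGAINST}, found {installed}; "
            "pin polars or rebuild the wheel"
        )
```

Calling `check_polars()` once at package import would be enough; whether an exact match or a version range is the right rule depends on how the wheel was built.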

Also, num.py was not registering some methods and threw an error in the notebook: those methods have forward type hints that use np. without importing numpy. I solved this by adding import numpy as np and recompiling the wheel. I can now run the full notebook (also from the graph branch). I will test the graph methods soon.
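
The thread does not show num.py itself, so the following is only an illustration of how string (forward) type hints that mention `np.` can fail when they are resolved without `np` in scope. The function `clip`, the helper `hints_resolve`, and `fake_np` are hypothetical; `fake_np` is a stand-in for the real numpy module so the sketch runs without numpy installed.

```python
import types
import typing


def clip(values: "np.ndarray", lo: float, hi: float) -> "np.ndarray":
    """Hypothetical method whose hints are strings that refer to np."""
    return [min(max(v, lo), hi) for v in values]


def hints_resolve(func, namespace: dict) -> bool:
    """Try to resolve string hints against the given global namespace."""
    try:
        typing.get_type_hints(func, globalns=namespace)
    except NameError:
        # Raised when a name used in a string hint (here: np) is missing.
        return False
    return True


# Stand-in for `import numpy as np` (assumption: only .ndarray is needed).
fake_np = types.SimpleNamespace(ndarray=list)

without_import = hints_resolve(clip, {})           # np is not defined
with_import = hints_resolve(clip, {"np": fake_np})  # np is available
```

Defining `clip` succeeds either way, because string hints are not evaluated at definition time; the failure only appears when something resolves them, which matches an error that shows up at registration or call time and disappears once numpy is imported as `np`.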

abstractqqq commented 8 months ago

I don't think I am managing my branches in the best way. I am in the middle of a big refactor right now that requires polars >= 0.20.6. Rust Polars had a breaking update lately.

furlat commented 8 months ago

Should I wait and test the graph methods after the next refactor? Or is it useful if I play around with the current more_graph branch today?

abstractqqq commented 8 months ago

Better to wait, I think. A lot of exciting stuff is coming up, e.g. shortest path with unequal distance cost per edge, parallel execution of graph queries, etc. Meanwhile, I really appreciate you reading the source code like you are already doing, so keep the questions coming. Thank you.

abstractqqq commented 8 months ago

FYI, current main is the release candidate. I want to wait a few days before releasing, in case I want to make any additional changes or find any other bugs. But all functionality should be more or less final.