Closed beckernick closed 1 month ago
strip_chars (strip in pandas) is a common operation used during data cleaning.
    import polars as pl
    from functools import partial
    from cudf_polars.callback import execute_with_cudf

    use_cudf = partial(execute_with_cudf, raise_on_fail=True)  # for testing

    df = pl.LazyFrame({"foo": [" hello", "\nworld"]})

    print(df.with_columns(foo_stripped=pl.col("foo").str.strip_chars()).collect())
    print(df.with_columns(foo_stripped=pl.col("foo").str.strip_chars()).collect(post_opt_callback=use_cudf))

    shape: (2, 2)
    ┌────────┬──────────────┐
    │ foo    ┆ foo_stripped │
    │ ---    ┆ ---          │
    │ str    ┆ str          │
    ╞════════╪══════════════╡
    │  hello ┆ hello        │
    │        ┆ world        │
    │ world  ┆              │
    └────────┴──────────────┘
    ---------------------------------------------------------------------------
    ComputeError                              Traceback (most recent call last)
    Cell In[38], line 10
          7 df = pl.LazyFrame({"foo": [" hello", "\nworld"]})
          9 print(df.with_columns(foo_stripped=pl.col("foo").str.strip_chars()).collect())
    ---> 10 print(df.with_columns(foo_stripped=pl.col("foo").str.strip_chars()).collect(post_opt_callback=use_cudf))

    File /raid/nicholasb/miniconda3/envs/all_cuda-122_arch-x86_64/lib/python3.11/site-packages/polars/lazyframe/frame.py:1942, in LazyFrame.collect(self, type_coercion, predicate_pushdown, projection_pushdown, simplify_expression, slice_pushdown, comm_subplan_elim, comm_subexpr_elim, cluster_with_columns, no_optimization, streaming, background, _eager, **_kwargs)
       1939 # Only for testing purposes atm.
       1940 callback = _kwargs.get("post_opt_callback")
    -> 1942 return wrap_df(ldf.collect(callback))

    ComputeError: 'cuda' conversion failed: NotImplementedError: String function StringFunction.StripChars
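For reference, the result the GPU path would need to match on this input can be sketched in plain Python. This is only an illustration of the expected semantics, assuming `strip_chars()` with no argument removes leading and trailing whitespace the way Python's `str.strip()` does; `strip_chars_reference` is a hypothetical helper, not part of polars or cudf_polars.

```python
# Stdlib-only sketch of the expected semantics; NOT the cudf_polars implementation.
# `strip_chars_reference` is a hypothetical name used for illustration only.
def strip_chars_reference(values, characters=None):
    """Strip leading/trailing characters from each string; None passes through.

    With characters=None, whitespace is stripped (assumed to mirror the
    default behaviour of `str.strip_chars()`).
    """
    return [v if v is None else v.strip(characters) for v in values]


foo = [" hello", "\nworld"]  # same data as the reproducer above
foo_stripped = strip_chars_reference(foo)
print(foo_stripped)  # ['hello', 'world']
```

Both the leading space and the leading newline are removed, which agrees with the `foo_stripped` column in the CPU output above.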
Done in #16504