abstractqqq / polars_ds_extension

Polars extension for general data science use cases
MIT License
261 stars 17 forks

Add string pre-processing code #150

Closed CangyuanLi closed 1 month ago

CangyuanLi commented 1 month ago

This pull request adds a single function, replace_non_ascii, which replaces non-ASCII characters with a specified value. I have more string pre-processing functions I'd like to contribute, but since this is my first pull request, I wanted to keep it small :). It also adds a test_str2.py file and a benchmarks folder for generating benchmark data, running the benchmarks, and plotting / analyzing the results.
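For readers who want a feel for the behavior described above, here is a minimal pure-Python sketch. It is not the code from this PR: the function name `replace_non_ascii_sketch`, its `value` parameter, and the choice to pass `None` through unchanged are all assumptions made for illustration, and the actual signature in the library may differ.

```python
import re

# Matches any single character outside the 7-bit ASCII range (U+0000..U+007F).
_NON_ASCII = re.compile(r"[^\x00-\x7F]")


def replace_non_ascii_sketch(values, value=""):
    """Hypothetical stand-in for the PR's replace_non_ascii.

    Replaces every non-ASCII character in each string with `value`.
    None entries are passed through unchanged (an assumption, mirroring
    how nulls usually propagate in column-wise string operations).
    """
    # A callable replacement avoids backslash-escape handling in `value`.
    return [
        None if s is None else _NON_ASCII.sub(lambda _m: value, s)
        for s in values
    ]


if __name__ == "__main__":
    data = ["caf\u00e9", "na\u00efve", None, "plain"]
    print(replace_non_ascii_sketch(data, "?"))
    print(replace_non_ascii_sketch(data))
```

With the default `value=""`, non-ASCII characters are dropped, so the function works both for stripping and for substituting a placeholder.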

Thanks for the great library and please let me know if this is a good / bad PR! Thanks a lot!