Sometimes you need to resort to manual Python looping, eg. when integrating with legacy code.
It seems to be very slow to loop through nested Polars series:
import polars as pl
import random
import string
random.seed(42)
s = pl.Series([random.sample(string.ascii_letters, random.randint(3, 10)) for _ in range(10_000)])
s_list = s.to_list()
s_np = s.to_numpy()
s_flat = s.explode()
def loop(x):
for l in x:
for _ in l:
pass
def loop_flat(x):
for _ in x:
pass
# Baseline with Polars
In [13]: %timeit -n 3 -r 3 loop(s)
138 ms ± 217 µs per loop (mean ± std. dev. of 3 runs, 3 loops each)
# Looping flat Polars series is a bit faster
In [18]: %timeit -n 3 -r 3 loop_flat(s_flat)
89.8 ms ± 208 µs per loop (mean ± std. dev. of 3 runs, 3 loops each)
# Looping nested Python lists is 100x faster
In [15]: %timeit -n 3 -r 3 loop(s_list)
1.02 ms ± 32.2 µs per loop (mean ± std. dev. of 3 runs, 3 loops each)
# When including the to_list(), still 20x faster
In [20]: %timeit -n 3 -r 3 loop(s.to_list())
5.45 ms ± 239 µs per loop (mean ± std. dev. of 3 runs, 3 loops each)
# Looping nested NumPy arrays is 20x faster
In [17]: %timeit -n 3 -r 3 loop(s_np)
7.62 ms ± 50.4 µs per loop (mean ± std. dev. of 3 runs, 3 loops each)
# When including to_numpy(), still 10x faster
In [21]: %timeit -n 3 -r 3 loop(s.to_numpy())
10 ms ± 120 µs per loop (mean ± std. dev. of 3 runs, 3 loops each)
Problem description
Sometimes you need to resort to manual Python looping, eg. when integrating with legacy code.
It seems to be very slow to loop through nested Polars series:
Profile: