Open endremborza opened 1 year ago
FYI (unrelated to the sorted
flag observation, but useful knowledge) you can init the example frame 10x faster by dropping the leading iter
, as there is a fast-path for range
;)
pl.DataFrame({"A":range(10_000_000)})
I know, but that adds the sorted flag automatically, and I wanted to do a comparison without it
another thing I just wanted to make a quick note of: a seemingly more complicated but still very useful optimization would be on multi-level joins / sorts:
df = pl.DataFrame({"A": [1,2] * 500_000, "B": range(1_000_000)}).sample(999_000)
dfs = df.sort("A", "B")
df2 = pl.DataFrame({"A": [1, 2, 3, 4] * 50, "B": range(200)})
df.join(df2, on=["A", "B"]) # 746 µs ± 23.5 µs per loop
dfs.join(df2, on=["A", "B"]) # 762 µs ± 18.1 µs per loop
Problem description
Hello.
I'm new to polars but so far I like it quite a bit. I'm also open to contributing a solution to this problem if I choose to stick with polars, but first I thought I'd just lay it out simply:
it seems that there is some sort of optimization going on but clearly not close to its full potential, as I don't see a clear reason why option two should not have the same computation as option 3.
I'm also new to rust, so doing this would require significant effort for me, but if someone points me to some parts of the code, I could give it a look.