Closed stinodego closed 7 months ago
@MarcoGorelli could you perhaps do a pass over the pandas queries (they might be a copy/paste from narwhals)? I think that pre-computing the group-by aggregations elementwise is OK-ish (with a comment), but filters and projections should definitely be left to the optimizer.
Thanks for the ping, I'll run these to see. Ideally, "pandas" vs "pandas via narwhals" should be very close
For q1, that's now the case, and the query looks fine.
For q2, they're way off, and looking at the code, the pandas query is filtering before joining. I'll update this and the others where there's a perf difference
There are some hand optimizations that should not be in there (e.g. filters before joins).
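To make the "filters before joins" point concrete, here is a rough sketch of the two query shapes in pandas. The tables, column values, and function names are made up for illustration and are not the benchmark's actual q2 code. pandas runs eagerly and has no query optimizer, so filtering first is a manual optimization that shrinks the join input, while the join-then-filter form is the one a lazy engine would be expected to rewrite on its own.

```python
import pandas as pd

# Hypothetical toy tables (not the benchmark data); column names only mimic TPC-H style.
part = pd.DataFrame({"p_partkey": [1, 2, 3, 4], "p_size": [15, 7, 15, 15]})
partsupp = pd.DataFrame(
    {"ps_partkey": [1, 1, 2, 3, 4], "ps_supplycost": [10.0, 12.5, 3.0, 8.0, 9.5]}
)


def filter_then_join(part: pd.DataFrame, partsupp: pd.DataFrame) -> pd.DataFrame:
    """Hand-optimized shape: shrink the left table before the join."""
    small = part[part["p_size"] == 15]
    return small.merge(partsupp, left_on="p_partkey", right_on="ps_partkey")


def join_then_filter(part: pd.DataFrame, partsupp: pd.DataFrame) -> pd.DataFrame:
    """Naive shape: join everything, then filter the joined result."""
    joined = part.merge(partsupp, left_on="p_partkey", right_on="ps_partkey")
    return joined[joined["p_size"] == 15]


# Both shapes return the same rows; only the amount of work in the join differs.
hand = filter_then_join(part, partsupp).reset_index(drop=True)
naive = join_then_filter(part, partsupp).reset_index(drop=True)
print(hand.equals(naive))
```

Since both forms give the same result, writing the benchmark queries in the naive shape keeps the comparison about what each engine does with the query as written, not about how much was tuned by hand.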