narwhals-dev / narwhals

Lightweight and extensible compatibility layer between dataframe libraries!
https://narwhals-dev.github.io/narwhals/
MIT License
607 stars, 90 forks

perf: concatenate less in pandas-like with_columns #1361

MarcoGorelli closed this 1 week ago

MarcoGorelli commented 1 week ago

It was mentioned here that they're seeing fragmentation warnings from pandas

I think we can address this by concatenating less in with_columns

MarcoGorelli commented 1 week ago

🤔 trying this out here https://www.kaggle.com/code/marcogorelli/visualise-timings?scriptVersionId=207037156 it looks like it doesn't help, and maybe even hurts?

i think we need to dig deeper before merging

MarcoGorelli commented 1 week ago

looks like the warning is actually coming from

            for s in new_columns:
                df[s.name] = validate_dataframe_comparand(index, s)

I think we can address that
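
For anyone following along, here is a rough sketch of the pattern in question, not the actual narwhals code. It assumes every new column is a named pandas Series that already shares the frame's index, so the validate_dataframe_comparand step is left out. The helper names (add_columns_one_by_one, add_columns_with_concat, count_performance_warnings) are made up for illustration. Assigning columns one at a time inserts a new block per column, which is what pandas typically flags with a fragmentation PerformanceWarning once enough blocks pile up; a single pd.concat along axis=1 builds the frame in one step.

```python
import warnings

import pandas as pd


def add_columns_one_by_one(df, new_columns):
    """Mirror the loop quoted above: one column insert per new Series."""
    out = df.copy()
    for s in new_columns:
        # Each assignment of a new name inserts a separate block into the frame.
        out[s.name] = s
    return out


def add_columns_with_concat(df, new_columns):
    """Hypothetical alternative: attach all new columns in a single concat.

    Assumes the new names do not collide with existing columns; concat would
    produce duplicate column labels instead of overwriting.
    """
    return pd.concat([df, *new_columns], axis=1)


def count_performance_warnings(func, df, new_columns):
    """Run func and count how many pandas PerformanceWarnings it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = func(df, new_columns)
    n = sum(issubclass(w.category, pd.errors.PerformanceWarning) for w in caught)
    return result, n


if __name__ == "__main__":
    base = pd.DataFrame({"a": [1, 2, 3]})
    # 150 new columns, enough to push the looped version past many blocks.
    cols = [pd.Series([float(i)] * 3, name=f"col_{i}") for i in range(150)]

    _, n_loop = count_performance_warnings(add_columns_one_by_one, base, cols)
    _, n_concat = count_performance_warnings(add_columns_with_concat, base, cols)
    print("loop warnings:", n_loop)
    print("concat warnings:", n_concat)
```

Both helpers should give the same frame when the new names are distinct; the concat version would still need separate handling for columns that overwrite existing ones.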