narwhals-dev / narwhals

Lightweight and extensible compatibility layer between dataframe libraries!
https://narwhals-dev.github.io/narwhals/
MIT License
613 stars 91 forks

fix: address `lit` broadcasting and output name of right arithmetic ops #1424

Open AlessandroMiola opened 4 days ago

AlessandroMiola commented 4 days ago

If you have comments or can explain your changes, please do so below

Opening as a draft to get comments and guidance on the approach. Also, the following need more thorough exploration on my side: (1) Dask behaviour, (2) the column name resulting from right arithmetic ops on Series in Polars, and (3) the leftover binary dunder methods.

Another point: in principle, this PR should solve both issues. I combined the two fixes because I originally thought that the second (#509) would be resolved naturally as a consequence of the first, but in the end that's not really the case (at least with the approach I followed). Perhaps I should separate the two?

Key points:

853:

- Tweak `validate_column_comparand` in `_pandas_like` so as to reindex `other` via the lhs series only when possible (which is what the error message was complaining about when outputting _ValueError: Length mismatch: Expected axis has 3 elements, new values have 1 elements_)
- Broadcast the lhs series via `maybe_broadcast_scalar_into_series` when relevant (i.e. when its length differs from that of the previously validated/broadcast rhs `other`)
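
To make the two steps above concrete, here is a toy, pandas-free sketch of the idea. `ToySeries`, `align_other`, `broadcast_to` and `add` are hypothetical names made up for illustration; this is not the narwhals code, which operates on native pandas-like series, and it only approximates the behaviour described in the bullets.

```python
from dataclasses import dataclass


@dataclass
class ToySeries:
    """Hypothetical stand-in for a pandas-like series: values plus an index."""

    values: list
    index: list
    name: str = ""


def align_other(lhs: ToySeries, other: ToySeries) -> ToySeries:
    """Give `other` the index of `lhs`, but only when the lengths agree.

    Assigning a 1-element index to a 3-element series unconditionally is the
    kind of operation that raises a "Length mismatch" error in pandas.
    """
    if len(other.values) == len(lhs.index):
        return ToySeries(other.values, list(lhs.index), other.name)
    return other  # lengths differ: leave as-is, broadcasting happens later


def broadcast_to(series: ToySeries, target: ToySeries) -> ToySeries:
    """Repeat a length-1 series so it matches `target` in length and index."""
    if len(series.values) == 1 and len(target.values) != 1:
        repeated = series.values * len(target.values)
        return ToySeries(repeated, list(target.index), series.name)
    return series


def add(lhs: ToySeries, other: ToySeries) -> ToySeries:
    other = align_other(lhs, other)
    lhs = broadcast_to(lhs, other)    # the lhs may be the length-1 literal
    other = broadcast_to(other, lhs)  # or the rhs may be
    summed = [x + y for x, y in zip(lhs.values, other.values)]
    return ToySeries(summed, lhs.index, lhs.name)


# A length-1 "literal" on the left, a 3-row column on the right.
lit = ToySeries([1], [0], "literal")
col = ToySeries([10, 20, 30], [5, 6, 7], "a")
result = add(lit, col)
```

In this sketch the literal is stretched to three rows and picks up the column's index before the element-wise addition, so no index of the wrong length is ever assigned.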

509:

- Add an optional `alias` parameter to `reuse_series_implementation`, assign it to `expr._output_names` (basically, the approach Marco showed in last week's livestream), and pass it all the way through so that the output of right arithmetic ops can be renamed to `"literal"` without running into _safety assertion_ errors.
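
A rough sketch of how such an `alias` argument could flow through is shown below. Everything here is a simplified assumption: `ToyNamedSeries`, `ToyExpr` and `reuse_series_implementation_sketch` are hypothetical stand-ins, and the "safety assertion" is modelled as a plain check that the produced names match the declared output names, which may differ from the real check in narwhals.

```python
from typing import Callable, List, Optional


class ToyNamedSeries:
    """Hypothetical minimal series: a name and a list of values."""

    def __init__(self, name: str, values: list) -> None:
        self.name = name
        self.values = values

    def alias(self, name: str) -> "ToyNamedSeries":
        return ToyNamedSeries(name, self.values)

    def __radd__(self, other: int) -> "ToyNamedSeries":
        # Like a pandas-like backend, the result keeps the series' own name.
        return ToyNamedSeries(self.name, [other + v for v in self.values])


class ToyExpr:
    """Hypothetical expression: a callable plus the names it promises to emit."""

    def __init__(self, call: Callable[[dict], List[ToyNamedSeries]], output_names: List[str]) -> None:
        self._call = call
        self._output_names = output_names


def reuse_series_implementation_sketch(
    expr: ToyExpr, attr: str, *args: object, alias: Optional[str] = None
) -> ToyExpr:
    # With an alias, the declared output names change up front.
    output_names = expr._output_names if alias is None else [alias]

    def call(frame: dict) -> List[ToyNamedSeries]:
        out = [getattr(s, attr)(*args) for s in expr._call(frame)]
        if alias is not None:
            out = [s.alias(alias) for s in out]
        # Assumed form of the safety assertion: names must match the declaration.
        assert [s.name for s in out] == output_names
        return out

    return ToyExpr(call, output_names)


frame = {"a": ToyNamedSeries("a", [1, 2, 3])}
col_a = ToyExpr(lambda f: [f["a"]], ["a"])

# Right addition, i.e. `1 + col_a`, renamed to "literal".
radd_expr = reuse_series_implementation_sketch(col_a, "__radd__", 1, alias="literal")
out = radd_expr._call(frame)
```

Because the alias updates both the declared names and the series returned by the call, the two stay in sync and the name check passes.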

Points that definitely need further exploration: