narwhals-dev / narwhals

Lightweight and extensible compatibility layer between Polars, pandas, cuDF, Modin, and more!
https://narwhals-dev.github.io/narwhals/
MIT License
218 stars 31 forks source link

Extend `narwhalify` decorator #320

Closed MarcoGorelli closed 1 day ago

MarcoGorelli commented 1 week ago

As discussed in this PR https://github.com/narwhals-dev/narwhals/pull/266, we should probably extend narwhalify with extra capabilities

Maybe - only maybe - we should aim for it to be the primary way that people interact with Narwhals? Not totally sure on this one

FBruzzesi commented 1 week ago

It may be worth discussing a few strategy that we can have. I see the following:

  1. (Try to) convert every argument to narwhals with strict=False flag (both in input and output)
  2. Allow user to specify which arguments and output to convert to and back from narwhals.
    • Leaves more freedom to the user, if some very custom operations need to be performed.

3. Use type checks to convert dataframe/series only objects Realized this would mostly replicate what's done internally by 1.

I personally prefer option 1., it is possibly opinionated but also a very clear promise that everything that can be converted, will be converted. For very custom cases, the user has always the option to deal with those operation first, and then come to use narwhals.

Extra: in each case we should check that the dataframe objects come with the same backend

WDYT?

FBruzzesi commented 1 week ago

WIP attempt for 1.

Edit: I updated the series docstrings as well. This ends up being a bit much in the diff, but some methods in particular show the power of the different approach. (I am especially looking at the zip_with example)

MarcoGorelli commented 1 week ago

nice one!

I personally prefer option 1., it is possibly opinionated but also a very clear promise that everything that can be converted, will be converted

Yeah, I'm inclined to agree. And that's still backwards-compatible

We should measure the overhead, but my intuition is that it should be fairly low