Closed EdAbati closed 1 week ago
Thank you both for the comments! :)
a. Passing a list into lit seems to work.
import polars as pl
import pandas as pd
import narwhals as nw
df = pd.DataFrame({"a": [0, 1]})
df_pl = pl.DataFrame(df)
@nw.narwhalify
def func(df):
    return df.with_columns(c=nw.lit([1, 2]))
>>> func(df_pl)
shape: (2, 2)
┌─────┬───────────┐
│ a   ┆ c         │
│ --- ┆ ---       │
│ i64 ┆ list[i64] │
╞═════╪═══════════╡
│ 0   ┆ [1, 2]    │
│ 1   ┆ [1, 2]    │
└─────┴───────────┘
>>> func(df)
   a       c
0  0  [1, 2]
1  1  [1, 2]
The only "issue" is that we cannot specify the correct DType (i.e. nw.lit([1, 2], dtype=...)) just yet, because we don't have a list dtype. Or am I missing something?
b. Passing a np.array makes pandas and polars behave differently at the moment. 🤔
@nw.narwhalify
def func(df):
    import numpy as np
    return df.with_columns(c=nw.lit(np.array([1, 2])))
>>> func(df_pl)
shape: (2, 2)
┌─────┬─────┐
│ a   ┆ c   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 0   ┆ 1   │
│ 1   ┆ 2   │
└─────┴─────┘
>>> func(df)
   a       c
0  0  [1, 2]
1  1  [1, 2]
thanks for checking that carefully - I think we should disallow passing numpy arrays to lit? at least for now - if we just raise an error for now, we can always loosen the behaviour later if we want to, whilst preserving backwards compatibility
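The "raise an error for now" idea could look roughly like the guard below. This is a minimal sketch, not Narwhals code: check_lit_value is a hypothetical helper, and it recognises NumPy arrays by inspecting the type's module and name, so the snippet runs even where NumPy is not installed.

```python
from typing import Any


def check_lit_value(value: Any) -> Any:
    """Hypothetical guard: reject NumPy arrays before building a literal."""
    value_type = type(value)
    # Identify numpy.ndarray by module and class name, without importing numpy.
    is_numpy_array = (
        value_type.__module__.split(".")[0] == "numpy"
        and value_type.__name__ == "ndarray"
    )
    if is_numpy_array:
        raise ValueError("numpy arrays are not supported as literal values")
    # Anything else is passed through unchanged.
    return value
```

Rejecting the input up front keeps the later options open: accepting arrays in a future release would only make more calls succeed, so existing callers would be unaffected.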
just yet, because we don't have a list dtype. Or am I missing something?
you're right, we don't :) largely because potential consumers of Narwhals probably wouldn't be using it, given that in pandas pre-pyarrow it would just be object dtype ... but we should implement it properly in Narwhals eventually, it's definitely worthwhile
how did you go about figuring this all out? did you read the "how it works" page? did you use a debugger / read the code very carefully?
Kind of both. I read the "how it works" page out of curiosity when I found out about the project. To contribute to this I read the code, searched for similar implementations, and used the debugger.
I found the code easy to follow, also thanks to the typing. So 👏 👏 to you and the rest of the team
The contributing guide and process (with the template issues and PRs) is also nice!
Maybe we can add some documentation about the structure of the unit tests. I am used to finding the same module/submodule structure as in the library. Here it seems the test cases for the same submodule are spread across several scripts. I was a bit unsure where to add mine
Regarding variable names, the only thing I noticed is the one I mentioned above. I was unsure why we needed to use PandasSeries, before realising that series_from_iterables was returning a "native series".
Maybe we can add some documentation about the structure of the unit tests. I am used to finding the same module/submodule structure as in the library. Here it seems the test cases for the same submodule are spread across several scripts. I was a bit unsure where to add mine
most of the structure of the unit tests is quite odd, probably because I did it in a hurry as a weekend project. if you fancy improving the test structure, that's very welcome - i've added you to the team so feel free to submit anything you think is an improvement and merge anything you think is ready
Thank you for the feedback and for adding me to the team! This is "lit" 🔥 (sorry had to make the joke)
What type of PR is this? (check all applicable)
Related issues
Checklist
If you have comments or can explain your changes, please do so below.
allow_object param that polars.lit has. I'm not 100% sure what it does and how to translate it to the pandas world. In which cases does polars use 'object'? 🤔
PandasSeries.from_iterable? I saw that there are multiple PandasSeries(series_from_iterable(...), ...) calls
list? Polars supports it but I haven't seen a list dtype in narwhals
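On the PandasSeries.from_iterable question, the alternate-constructor pattern can be sketched with toy stand-ins. Nothing below is Narwhals code: ToySeries and toy_series_from_iterable are hypothetical placeholders, and a plain Python list plays the role of the native series.

```python
from typing import Any, Iterable, List


def toy_series_from_iterable(data: Iterable[Any], name: str) -> List[Any]:
    """Hypothetical stand-in for a helper that returns a *native* series."""
    # A real helper would build a backend series; here a list is enough.
    return list(data)


class ToySeries:
    """Hypothetical wrapper playing the role of PandasSeries."""

    def __init__(self, native_series: List[Any], name: str) -> None:
        self.native = native_series
        self.name = name

    @classmethod
    def from_iterable(cls, data: Iterable[Any], name: str) -> "ToySeries":
        # Single place that builds the native series and wraps it,
        # replacing repeated ToySeries(toy_series_from_iterable(...), ...) calls.
        return cls(toy_series_from_iterable(data, name), name)


wrapped = ToySeries.from_iterable(range(3), name="a")
print(wrapped.native)  # [0, 1, 2]
```

Whether this is worth adding depends on how many call sites repeat the wrap-the-native-series step; the sketch only shows the shape of the refactor.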