Open elephaint opened 13 hours ago
Hey @elephaint, thanks for your request. This can certainly be a pain point for other libraries trying to adopt narwhals.
I would say that the answer is "it depends".
We have a set of functions, namely maybe_align_index, maybe_get_index, maybe_set_index, maybe_reset_index and maybe_convert_dtypes, which are meant to help when working with pandas objects without having to check manually every time.
If that's not enough, nw.dependencies.is_pandas_dataframe(df_nw.to_native()) is an option. Another one would be df_nw._compliant_frame._implementation is Implementation.PANDAS, but it doesn't look any less convoluted to me.
In plotly express, I had to do something similar: I added a flag, is_pd_like, on the first encounter of the dataframe object and passed it to the various functions to branch the logic.
Thanks for the request!
I think currently the two documented ways would be:
if nw.get_native_namespace(df) is nw.dependencies.get_pandas()
if nw.dependencies.is_pandas_dataframe(df.to_native())
I can see that it would be convenient to have something more ergonomic... 🤔 will think about this one. Thanks for highlighting this.
another one would be df_nw._compliant_frame._implementation is Implementation.PANDAS but it doesn't look less convoluted to me.
wait, this would be highly risky as it involves using private attributes which may change at any time 😉 Better to stick with the public API, which we make some stability guarantees about
Thanks for the discussion!
I think for now I'll go with nw.dependencies.is_pandas_dataframe(df.to_native())
Just to be clear - this is really a 'nice to have' but by no means very important to me, so don't make something crazy complex over this 😛
Just out of curiosity for now, could you point to an example that requires branching to a specific path for pandas?
Currently I often have the following code:
What is difficult about this is that I need to keep track of is_pandas variables throughout the code, send them to subfunctions, etc. If I have multiple DataFrames, I have multiple such is_pandas variables. Ideally, I'd be able to do something such as:
i.e., having whether the underlying dataframe is pandas or not simply as a boolean attribute of the Narwhals DataFrame. That would allow me to use df_nw everywhere without requiring the auxiliary variables or first converting to native.
Of course, I know I can also do this everywhere:
nw.dependencies.is_pandas_dataframe(df_nw.to_native())
but that feels convoluted. What is the cleanest way to do this?