Closed nathanjmcdougall closed 3 months ago
For pins in the short-to-medium term, I would prefer to keep pandas built in and expand the ability to use polars or any other dataframe library as desired.
I do believe pins should include at least one reasonable library, so that users can read data (perhaps pinned by a colleague or from R) as a data frame without having to decide which kind of data frame to use and without seeing errors when no dataframe library is installed.
I could see a world where the default library is polars instead of pandas, but I do think pandas still has the larger user base for now.
Sounds good to me.
This is very similar to the discussion in #233 about making `pins` DF-library agnostic.
I'm in two minds about this.
On the one hand, the vast majority of the time, anyone who wants to use `pins` will be using `pandas`. On the other hand, that means they would already have it installed, making it unlikely that `pandas` being optional would cause major friction.
Making it optional would enable `polars` users etc. to use `pins` without needing to install `pandas` (see #153).
There is a proposal to make `pyarrow` a required dependency of pandas, which would increase the installation size by ~120MB. https://github.com/pandas-dev/pandas/issues/54466
The costs to this project would be additional code complexity to protect import statements with `try-except`, as well as potentially some internal refactoring (e.g. `as_df` options).
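For illustration, guarding an optional import with `try-except` might look roughly like the sketch below. This is not how `pins` is implemented; the helper names `optional_import` and `require` are hypothetical and exist only for this example.

```python
import importlib
from types import ModuleType
from typing import Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Return the named module, or None if it is not installed.

    Hypothetical helper for this sketch; not part of pins.
    """
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def require(name: str) -> ModuleType:
    """Return the named module, or raise a readable error if it is missing.

    Hypothetical helper for this sketch; not part of pins.
    """
    module = optional_import(name)
    if module is None:
        raise ImportError(
            f"{name} is required for this operation but is not installed."
        )
    return module


# Evaluated once at import time; None when pandas is not installed.
pd = optional_import("pandas")
```

Code paths that return a data frame would then call something like `require("pandas")` only when they actually need it, so users of other dataframe libraries never trigger the import.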