JakobGM / patito

A data modelling layer built on top of polars and pydantic
MIT License

[FEAT] Allow creation of Model.DataFrame from polars.DataFrame/LazyFrame #27

Open jjfantini opened 11 months ago

jjfantini commented 11 months ago

Right now, you can only create a model-aware DataFrame from either a dict representation or from pandas. It would help my workflow to be able to do this directly from polars objects as well.

from openbb_terminal.stocks import stocks_helper as stocks
import polars as pl

df = stocks.load(
    symbol="AAPL",
    start_date="1950-01-01",
    end_date="2023-10-16",
)

df = pl.from_pandas(df, include_index=True).lazy()

# This doesn't work: it raises the TypeError shown below
df = StocksBaseModel.LazyFrame(df)

TypeError: DataFrame constructor called with unsupported type 'LazyFrame' for the `data` parameter

Fizzizist commented 10 months ago

@jjfantini I agree that this should be added, but in the meantime, wouldn't instantiating via pyarrow be better, since the data is already stored as Arrow arrays on the backend?

StocksBaseModel.DataFrame._from_arrow(df.to_arrow())

jjfantini commented 10 months ago

Yup, that is what I currently do, although I would suggest df.to_dict(). It is a smidge faster.

That said, I have stopped using patito because my project requires Pydantic v2. I have published a package, humblpatito, that is compatible with Pydantic v2, but I don't have time right now to flesh out all the edge cases. Plus, I want to use AliasChoices in v2.