kaiko-ai / typedspark

Column-wise type annotations for pyspark DataFrames
Apache License 2.0
65 stars, 4 forks

Add a monkeypatch to allow for df.to_typedspark method #224

Closed nanne-aben closed 11 months ago

nanne-aben commented 11 months ago

Adds a to_typedspark() method to DataFrames via a monkeypatch, which allows for the following notation:

pivot, Pivot = (
    vaccinations.groupby(Vaccinations.pet_id)
    .pivot(Vaccinations.vaccine_name.str)
    .agg(first(Vaccinations.next_due_date))
    .to_typedspark()
)

Instead of (or in addition to):

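from pyspark.sql.functions import first
from typedspark import create_schema

pivot = (
    vaccinations.groupby(Vaccinations.pet_id)
    .pivot(Vaccinations.vaccine_name.str)
    .agg(first(Vaccinations.next_due_date))
)

# create_schema returns the typed DataSet together with its generated schema class
pivot, Pivot = create_schema(pivot)
pivot.show()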
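
For readers unfamiliar with the pattern, here is a minimal sketch of how a monkeypatch of this kind could work. It is not the code from this PR: the `DataFrame` class and the `create_schema` function below are simplified, hypothetical stand-ins (so the snippet runs without Spark), and the only point is to show a method being attached to an existing class after the fact.

```python
class DataFrame:
    """Hypothetical stand-in for pyspark.sql.DataFrame (assumption, not the real class)."""

    def __init__(self, columns):
        self.columns = list(columns)


def create_schema(df):
    """Hypothetical stand-in for typedspark.create_schema.

    Returns the DataFrame together with a dynamically built schema class
    that has one attribute per column.
    """
    schema = type("DynamicSchema", (), {name: name for name in df.columns})
    return df, schema


def to_typedspark(self):
    """Delegate to create_schema so the call can sit at the end of a method chain."""
    return create_schema(self)


# The monkeypatch itself: attach the function to the existing class.
# Every instance, including ones created earlier, now has the method.
DataFrame.to_typedspark = to_typedspark

pivot, Pivot = DataFrame(["pet_id", "rabies", "flu"]).to_typedspark()
```

Both notations stay available in this sketch: `create_schema(df)` and `df.to_typedspark()` return the same pair, which matches the "instead of (or in addition to)" wording above.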