akgold / do4ds

A book on DevOps for Data Scientists with CRC Press.
https://do4ds.com

Lab 2 - Add penguins dataframe to duckdb #253

Open durraniu opened 3 months ago

durraniu commented 3 months ago

Lab 2 has this code to read the penguins data from DuckDB:

import duckdb
con = duckdb.connect('my-db.duckdb')
df = con.execute("SELECT * FROM penguins").fetchdf().dropna()
con.close()

But at that point the penguins table does not exist in the DuckDB database yet, so the query fails. I suggest adding this after creating con:

from palmerpenguins import penguins
penguins_df = penguins.load_penguins().dropna()
con.sql("CREATE TABLE penguins AS SELECT * FROM penguins_df")