catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

Apply correct data types to columns in output objects #453

Closed by zaneselvans 2 years ago

zaneselvans commented 4 years ago

Currently, pandas infers data types during read_sql() instead of using the data types stored in the database. In the output routines, we need to enforce the correct data types from our data type dictionary. This can be done either directly in the read_sql() call or after the fact. See pudl.extract.eia860.get_eia860_page() for an example.
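A minimal sketch of the "after the fact" approach, assuming a plain column-to-dtype mapping. The names COLUMN_DTYPES and apply_dtypes, the plants table, and its columns are all hypothetical and are not taken from the PUDL codebase. The example uses an in-memory SQLite database so it runs on its own, and it includes a NULL in an integer column, which is the kind of case where inferred types tend to differ from the stored ones.

```python
import sqlite3

import pandas as pd

# Hypothetical mapping standing in for the project's data type dictionary.
COLUMN_DTYPES = {
    "plant_id": "Int64",       # nullable integer, so NULLs do not force floats
    "plant_name": "string",
    "capacity_mw": "float64",
}


def apply_dtypes(df, dtypes):
    """Cast only the columns that appear in both the frame and the mapping."""
    present = {col: dtype for col, dtype in dtypes.items() if col in df.columns}
    return df.astype(present)


# Build a small throwaway table with a NULL in the integer column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE plants (plant_id INTEGER, plant_name TEXT, capacity_mw REAL)"
)
conn.executemany(
    "INSERT INTO plants VALUES (?, ?, ?)",
    [(1, "Alpha", 10.5), (None, "Beta", 20.0)],
)

raw = pd.read_sql("SELECT * FROM plants", conn)  # dtypes are inferred here
typed = apply_dtypes(raw, COLUMN_DTYPES)         # dtypes are enforced here

print(raw.dtypes)
print(typed.dtypes)
```

Filtering the mapping to the columns actually present means one shared dictionary can serve many output tables, at the cost of silently skipping a misspelled column name.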

zaneselvans commented 4 years ago

It turns out this is a longstanding issue awaiting a PR since 2014. See: https://github.com/pandas-dev/pandas/issues/6798

zaneselvans commented 2 years ago

Closing as a duplicate of #818.