MichaelTiemannOSC opened this issue 2 years ago
Answering my own question: I needed to change the YEAR series to pint[dimensionless] and then write co2_df.pint.dequantify() to get a SQL table that works.
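For reference, here is a minimal sketch of that write path. It uses an in-memory SQLite engine, generic mass units, and a made-up table name (`co2_data`) as stand-ins for the notebook's Trino connection and custom CO2 units:

```python
import pandas as pd
import pint_pandas
import sqlalchemy as sa

# stand-in for the notebook's co2_df (the real one uses custom CO2-weighted units)
co2_df = pd.DataFrame({
    "YEAR": pd.Series([2019, 2020], dtype="pint[dimensionless]"),
    "emissions": pd.Series([1.2, 1.1], dtype="pint[metric_ton]"),
})

engine = sa.create_engine("sqlite:///:memory:")  # the notebook targets Trino via SQLAlchemy

# dequantify() moves each column's unit into a second column-index level, leaving plain
# numeric data; flatten that (name, unit) MultiIndex so to_sql sees simple string column names
flat = co2_df.pint.dequantify()
flat.columns = [str(c) for c in flat.columns.to_flat_index()]
flat.to_sql("co2_data", engine, index=False)
```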
I have updated the demo notebook to show the hoops one must jump through to unpack the results that come back from pd.read_sql. The relevant snippet is:
```python
import ast
import pandas as pd

# Columns come back from pd.read_sql as stringified (name, unit) tuples: parse each one,
# restore the 'co2' -> 'CO2' casing, and re-apply the pint dtype column by column.
new_co2_df = pd.DataFrame({
    col: series.astype(f"pint[{unit.replace('co2', 'CO2')}]")
    for (col, unit), (_, series) in zip(map(ast.literal_eval, new_df.columns), new_df.items())
})
```
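One candidate for a less manual route: if the stringified tuples are first rebuilt into a proper two-level column index, pint-pandas' own quantify() (the documented inverse of dequantify()) may be able to re-attach the units in one step. A sketch, assuming new_df is the frame returned by pd.read_sql and that the unit strings otherwise survive the round-trip intact:

```python
import ast
import pandas as pd

# rebuild the (name, unit) MultiIndex from the stringified tuple column names,
# fixing the 'co2' -> 'CO2' casing, then let pint-pandas re-attach the units
new_df.columns = pd.MultiIndex.from_tuples(
    [(col, unit.replace('co2', 'CO2')) for col, unit in map(ast.literal_eval, new_df.columns)]
)
new_co2_df = new_df.pint.quantify(level=-1)
```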
Re-opening in case there's an obviously better way to do this.
I have updated the notebook to do a full round-trip of a dataframe that includes non-quantified data (strings) alongside columns with homogeneous units and columns with heterogeneous units.
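For the heterogeneous case, one representation that survives SQL without any special column types is a magnitude column paired with a unit-string column, rebuilding Quantity objects after the read. This is only a sketch of that idea; the CO2 definition and column names here are assumptions, not taken from the notebook:

```python
import pandas as pd
import pint

ureg = pint.get_application_registry()
ureg.define("CO2 = [co2]")  # assumed stand-in for the notebook's custom CO2 unit definitions

# a column whose rows carry different units, split into plain magnitude + unit-string columns
mixed = pd.DataFrame({
    "emissions_value": [1.2, 3400.0],
    "emissions_unit": ["CO2 * metric_ton", "CO2 * kilogram"],
})

# both columns are ordinary SQL types, so to_sql()/read_sql() need no special handling;
# after reading back, rebuild object-dtype Quantities row by row
mixed["emissions"] = [
    ureg.Quantity(v, u) for v, u in zip(mixed["emissions_value"], mixed["emissions_unit"])
]
```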
What is the recommended way to write and read pint-pandas dataframes to and from SQL databases?
Is it immediately obvious that the answer should be to convert to/from JSON? In my case, I'm interested in writing to/from a Trino database via SQLAlchemy, but I'm happy to see answers that work for PostgreSQL, SQLite, etc. Here's a test case:
https://github.com/os-climate/data-platform-demo/blob/master/notebooks/pint-demo.ipynb
Relevant snippet: