hgrecco / pint-pandas

Pandas support for pint

Plan #2

Closed andrewgsavage closed 5 years ago

andrewgsavage commented 5 years ago

I saw a nice example of a duck-typed extension array and thought it would be applicable to pint. A PeriodArray stores its frequency in its PeriodDtype:

pd.core.dtypes.dtypes.PeriodDtype(freq = "M")
period[M]

So I've made a branch where PintArray stores its unit in PintType, which works nicely:

import pandas as pd
import pintpandas

df = pd.DataFrame({
    "length": pd.Series([1, 2], dtype="pint[m]"),
    "area": pd.Series([1, 4], dtype="pint[m^2]"),
})
print(df)
  length area
0      1    1
1      2    4

df.dtypes
length         pint[meter]
area      pint[meter ** 2]
dtype: object

df.length
0    1
1    2
Name: length, dtype: pint[meter]

Previously, a quantity containing a 1d array was stored in PintArray._data. Now that the unit lives in the extension array's dtype, ._data can store just the magnitudes, like most other extension arrays do, which makes the implementation easier to relate to. I'd prefer to use this duck-typed version in the future.
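To make the split described above concrete, here is a minimal pure-Python sketch of the idea: the dtype object carries the unit and the array's `_data` holds only magnitudes. `UnitDtype` and `UnitArray` are hypothetical names invented for illustration. They are not the pint-pandas classes and do not implement the pandas extension array interface.

```python
class UnitDtype:
    """Hypothetical dtype that carries the unit, much as PeriodDtype carries freq."""

    def __init__(self, units):
        self.units = units

    @property
    def name(self):
        # String form mirrors the "pint[meter]" style shown in the example above.
        return f"pint[{self.units}]"

    def __eq__(self, other):
        # Allow comparison against the string alias as well as another dtype.
        if isinstance(other, str):
            return other == self.name
        return isinstance(other, UnitDtype) and other.units == self.units

    def __hash__(self):
        return hash(self.name)

    def __repr__(self):
        return self.name


class UnitArray:
    """Hypothetical array: _data holds bare magnitudes, the unit lives on the dtype."""

    def __init__(self, magnitudes, dtype):
        self._data = list(magnitudes)
        self.dtype = dtype

    def __len__(self):
        return len(self._data)

    def __getitem__(self, index):
        # Re-attach the unit only when an element is taken out.
        return (self._data[index], self.dtype.units)


lengths = UnitArray([1, 2], UnitDtype("meter"))
print(lengths.dtype)  # the unit is read from the dtype
print(lengths._data)  # only magnitudes are stored
```

The point of the design is that every element shares one unit, so storing it once on the dtype avoids wrapping the whole array in a quantity.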

That leaves three versions whose history would be good to keep (although that could just live in my repo?):

  1. pint/master, failing tests
  2. andrewgsavage/pint/uprev_pandas, passing all but one test (test needs redefining)
  3. andrewgsavage/pint-pandas/duck_typed, passing all but same test

Should we push each of those versions to this repo in that order to maintain the history/review changes?

hgrecco commented 5 years ago

My first suggestions would be:

  1. Remove pandas support from the main pint repo.
  2. Put the import of your package inside a try/except (pint should work with or without your package).
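The try/except suggestion above is the usual optional-dependency pattern. A minimal sketch follows, assuming the package is importable as `pintpandas` (the name used in the example earlier in this thread). The helper `optional_import` and the flag `HAS_PINTPANDAS` are hypothetical names for illustration, not part of pint.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# Assumption: the optional package is importable as "pintpandas".
pintpandas = optional_import("pintpandas")
HAS_PINTPANDAS = pintpandas is not None

if HAS_PINTPANDAS:
    print("pandas integration available")
else:
    print("running without pandas integration")
```

Either branch leaves the host library usable. Only the pandas-specific features depend on the flag.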

I will look at the implementation and come back to you.

hgrecco commented 5 years ago

Regarding the tests, I think pintpandas should be tested only in the pintpandas repo.

znicholls commented 5 years ago

Should we push each of those versions to this repo in that order to maintain the history/review changes?

If you want. I'd be tempted to just do a clean 'new start' with your duck-typed version. I think the messiness of our previous implementations can just sit in our own repos and doesn't necessarily need to be here (or have I misunderstood your comment?).