pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

Weird Datetime Behaviour #3002

Closed justinvf-zz closed 11 years ago

justinvf-zz commented 11 years ago

I am seeing odd behavior of dates being converted to datetime64[ns] times when that is inappropriate.

    In [272]: file = StringIO("xxyyzz20100101PIE\nxxyyzz20100101GUM\nxxyyww20090101EGG\nfoofoo20080909PIE")

    In [273]: df = pd.read_fwf(file, widths=[6,8,3], names=["person_id", "dt", "food"], parse_dates=["dt"])

    In [274]: df
    Out[274]: 
      person_id                  dt food
    0    xxyyzz 2010-01-01 00:00:00  PIE
    1    xxyyzz 2010-01-01 00:00:00  GUM
    2    xxyyww 2009-01-01 00:00:00  EGG
    3    foofoo 2008-09-09 00:00:00  PIE

Everything looks good. However:

    In [275]: df.dt.value_counts()
    Out[275]: 
    1970-01-17 00:00:00    2
    1970-01-18 00:00:00    1
    1970-01-25 08:00:00    1

    In [276]: df.dt.dtype
    Out[276]: dtype('datetime64[ns]')

The type is stored as epoch nanoseconds internally, and it usually displays as the dates I want. But everything gets messed up when we drop down to numpy land for things like value_counts. What is going on? Is this a bug or am I making mistakes?
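For reference, here is a minimal sketch of what I would expect, assuming a recent pandas release (not the 0.10.1 / numpy 1.6.1 combination from this report). The variable names are illustrative only, and the dates are built directly instead of going through read_fwf. It shows the epoch-nanosecond integers behind a datetime64[ns] column, and that counting values should keep the original dates rather than produce 1970 timestamps.

```python
import pandas as pd

# Same four dates as in the report, built directly so no file is needed.
# The explicit astype pins the resolution to nanoseconds.
dates = pd.Series(
    pd.to_datetime(["2010-01-01", "2010-01-01", "2009-01-01", "2008-09-09"])
).astype("datetime64[ns]")

# The raw storage: nanoseconds since 1970-01-01 as int64.
as_int = dates.to_numpy().astype("int64")

# Reading the integers back with the correct unit recovers the dates.
round_trip = pd.to_datetime(as_int, unit="ns")

# value_counts should be keyed by the original dates.
counts = dates.value_counts()

print(as_int[0])
print(counts)
```

If the integers are read back with the correct unit, the dates survive the round trip, which is why the 1970 values above look like a unit or dtype mix-up somewhere below the Series level.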

    In [280]: pd.__version__
    Out[280]: '0.10.1'

    In [281]: np.__version__
    Out[281]: '1.6.1'

wesm commented 11 years ago

This is a bug which still exists in master. Marking for 0.11

jreback commented 11 years ago

@wesm see my comments on the PR; if there's enough interest, we should prob create a TimeDelta64Index (just a fancy version of Int64)

wesm commented 11 years ago

Or just TimedeltaIndex I guess. Should have a scalar type analogous to Timestamp also. Not urgent but at some point...
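For readers on later pandas releases: both pieces mentioned here exist today as `pd.TimedeltaIndex` and the scalar `pd.Timedelta`. A minimal sketch, assuming a recent pandas; the sample dates are borrowed from the report above and the variable names are illustrative only.

```python
import pandas as pd

# Three of the dates from the report.
dates = pd.Series(pd.to_datetime(["2010-01-01", "2009-01-01", "2008-09-09"]))

# Subtracting datetimes yields a timedelta64 Series.
deltas = dates - dates.min()

# An index type dedicated to timedeltas.
idx = pd.TimedeltaIndex(deltas)

# Reductions such as max return a Timedelta scalar, the analogue of Timestamp.
largest = deltas.max()

print(idx)
print(largest)
```

Here `largest` is the gap between 2008-09-09 and 2010-01-01, returned as a scalar rather than a raw integer.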

jreback commented 11 years ago

yep, i had that problem in #2990, ideally return the scalar for min/max (at the end)

jreback commented 11 years ago

@justinvf this is merged in. pls update to master and try it out. thanks for the report

justinvf-zz commented 11 years ago

You guys are quick! Thanks so much for the bugfix.