openeemeter / eemeter

An open source python package for implementing and developing standard methods for calculating normalized metered energy consumption and avoided energy use.
http://eemeter.openee.io/
Apache License 2.0

"OutOfBoundsDatetime: Out of bounds nanosecond timestamp" error. #433

Open Viktoriya-An opened 2 years ago

Viktoriya-An commented 2 years ago

An error is produced when using eemeter with the most recent version of pandas.

>>> meter_data_daily, temperature_data_daily, metadata_daily = eemeter.load_sample('il-electricity-cdd-hdd-daily')
>>> meter_data_billing, temperature_data_billing, metadata_billing = eemeter.load_sample('il-electricity-cdd-hdd-billing_monthly')
>>> baseline_end_date = metadata_billing['blackout_start_date']
>>> baseline_meter_data_daily, baseline_warnings_daily = eemeter.get_baseline_data(meter_data_daily, end=baseline_end_date, max_days=365)

After calling eemeter.get_baseline_data(), the error is: OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1677-09-21 00:12:43.

Package versions

Python==3.9.1
eemeter==3.1.0
pandas==1.3.2

Reverting pandas to version 1.2.1 fixed the issue.

Costo commented 2 years ago

I got the same issue: OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1677-09-21 00:12:43. The error originates from this call in get_baseline_data():

pytz.UTC.localize(pd.Timestamp.min)

The workaround is to call get_baseline_data() with both a start and an end date:

import datetime

# Pass an explicit start so get_baseline_data() never falls back to pd.Timestamp.min
baseline_start_date = datetime.date(2021, 1, 1)
baseline_end_date = datetime.date(2021, 12, 31)
baseline_meter_data_hourly, baseline_warnings_hourly = eemeter.get_baseline_data(
    meter_data,
    start=baseline_start_date,
    end=baseline_end_date,
    max_days=None)
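
Rather than hardcoding the two dates, the explicit bounds could also be taken from the meter data itself. The following is only a sketch under assumptions: the DataFrame built here is a hypothetical stand-in for meter_data with a timezone-aware DatetimeIndex, and the eemeter call is left as a comment because it is copied from the workaround above rather than tested here.

```python
import pandas as pd

# Hypothetical stand-in for meter_data: one value per day on a UTC index.
index = pd.date_range("2021-01-01", periods=365, freq="D", tz="UTC")
meter_data = pd.DataFrame({"value": 1.0}, index=index)

# Use the first and last timestamps in the data as explicit bounds,
# so that no default based on pd.Timestamp.min is ever needed.
baseline_start_date = meter_data.index.min()
baseline_end_date = meter_data.index.max()

print(baseline_start_date, baseline_end_date)

# The bounds would then be passed as in the workaround above:
# eemeter.get_baseline_data(meter_data, start=baseline_start_date,
#                           end=baseline_end_date, max_days=None)
```

Because the bounds come from a timezone-aware index, they are timezone-aware timestamps rather than plain dates.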

Lisandro79 commented 2 years ago

In transform.py, the starting date is computed as

if start is None:
    # py datetime min/max are out of range of pd.Timestamp min/max
    start_target = pytz.UTC.localize(pd.Timestamp.min) + timedelta(days=1)
    start_inf = True
else:

If I change it so that the day is added before localizing, it works for me:

start_target = pytz.UTC.localize(pd.Timestamp.min + timedelta(days=1))

Should we open a pull request?
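
For anyone checking the reordering outside eemeter, here is a small standalone sketch. The timestamp in the error message matches pd.Timestamp.min, the earliest value a nanosecond-resolution Timestamp can hold, and the reports in this thread suggest that localizing that boundary value directly is what raises on pandas 1.3.2. The sketch moves one day away from the boundary first and only then attaches UTC. Note that it uses pandas' own Timestamp.tz_localize instead of pytz.UTC.localize, which is a substitution made for this example and not what transform.py does.

```python
from datetime import timedelta

import pandas as pd

# Earliest instant representable by a nanosecond-resolution pd.Timestamp.
print(pd.Timestamp.min)

# Step away from the boundary while the value is still timezone-naive ...
naive_start = pd.Timestamp.min + timedelta(days=1)

# ... and only then attach UTC (tz_localize stands in for pytz.UTC.localize here).
start_target = naive_start.tz_localize("UTC")
print(start_target)
```

The result is a timezone-aware timestamp one day after the lower bound, which leaves room for the comparisons that follow.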