Open kohlerjl opened 10 months ago
Coming in with another example and traceback from Pandas 2.2.2.
Exception:
In [50]: pd.Timestamp(dt.datetime(2022, 11, 6, 1, 6, 58), tz='America/New_York', fold=0)
---------------------------------------------------------------------------
AmbiguousTimeError Traceback (most recent call last)
Cell In[50], line 1
----> 1 pd.Timestamp(dt.datetime(2022, 11, 6, 1, 6, 58), tz='America/New_York', fold=0)
File timestamps.pyx:1865, in pandas._libs.tslibs.timestamps.Timestamp.__new__()
File conversion.pyx:412, in pandas._libs.tslibs.conversion.convert_to_tsobject()
File conversion.pyx:483, in pandas._libs.tslibs.conversion.convert_datetime_to_tsobject()
File conversion.pyx:748, in pandas._libs.tslibs.conversion._localize_pydatetime()
File ~/mambaforge/envs/populus-env/lib/python3.10/site-packages/pytz/tzinfo.py:366, in DstTzInfo.localize(self, dt, is_dst)
360 # If we get this far, we have multiple possible timezones - this
361 # is an ambiguous case occurring during the end-of-DST transition.
362
363 # If told to be strict, raise an exception since we have an
364 # ambiguous case
365 if is_dst is None:
--> 366 raise AmbiguousTimeError(dt)
368 # Filter out the possiblilities that don't match the requested
369 # is_dst
370 filtered_possible_loc_dt = [
371 p for p in possible_loc_dt if bool(p.tzinfo._dst) == is_dst
372 ]
AmbiguousTimeError: 2022-11-06 01:06:58
Works:
In [49]: pd.Timestamp(dt.datetime(2022, 11, 6, 1, 6, 58), tz=dateutil.tz.gettz('America/New_York'), fold=0)
Out[49]: Timestamp('2022-11-06 01:06:58-0400', tz='dateutil//usr/share/zoneinfo/America/New_York')
Looks like we may be running into a known issue: https://github.com/pandas-dev/pandas/blob/a5e812d86deb62872f8d514d894a22931fc84217/pandas/_libs/tslibs/conversion.pyx#L747-L748
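For anyone who wants to apply the same idea generically, here is a minimal sketch of the workaround: convert a timezone name to a tz object before handing it to the Timestamp constructor, so pandas never resolves the string itself. The helper names `as_tz_object` and `ambiguous_timestamp` are hypothetical and not part of pandas, and the sketch uses the standard-library zoneinfo module rather than dateutil, on the assumption that it behaves like the dateutil case above.

```python
from zoneinfo import ZoneInfo


def as_tz_object(tz):
    """Return a tz object, converting IANA name strings via zoneinfo.

    Hypothetical helper, not a pandas API. Non-string inputs
    (zoneinfo or dateutil tz objects) are passed through unchanged.
    """
    if isinstance(tz, str):
        return ZoneInfo(tz)
    return tz


def ambiguous_timestamp(naive_dt, tz, fold):
    """Build a Timestamp from a naive datetime, a tz name or object, and fold.

    Hypothetical wrapper: it only ensures pd.Timestamp receives a tz
    object instead of a string.
    """
    import pandas as pd  # imported lazily so as_tz_object stays stdlib-only

    return pd.Timestamp(naive_dt, tz=as_tz_object(tz), fold=fold)
```

With this in place, `ambiguous_timestamp(dt.datetime(2022, 11, 6, 1, 6, 58), 'America/New_York', fold=0)` should take the same code path as the working dateutil call, though I have only reasoned about that and not checked it on every pandas version.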
Thanks @kohlerjl for the workaround!
Pandas version checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[X] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
The fold argument to the Timestamp constructor appears to be ignored when tz is provided as a string, but works as expected for the corresponding dateutil.tz or zoneinfo objects.
On the current development branch, I get an AmbiguousTimeError on the last two asserts.
This behavior is at least better than the current release (2.1.2), which fails with an AssertionError because
pd.Timestamp(year=2023, month=11, day=5, hour=1, minute=30, fold=0, tz=tz)
returns the incorrect timestamp Timestamp('2023-11-05 01:30:00-0800', tz='US/Pacific').
Expected Behavior
I would expect the behavior of interpreting ambiguous timestamps with 'fold' provided to be the same when the timezone is defined as a string (e.g. tz='US/Pacific') as when using the equivalent zoneinfo or dateutil.tz timezone. I noticed that the 'fold' argument is not permitted when using a pytz timezone, but at least in that case a descriptive error is provided.
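As a reference point for the expected behavior, this is how the standard library resolves the same ambiguous wall time with fold. It is a small sketch using only datetime and zoneinfo, with no pandas involved. It uses 'America/Los_Angeles', the canonical IANA name that 'US/Pacific' links to, because some systems ship the legacy alias in a separate package.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

tz = ZoneInfo("America/Los_Angeles")

# 01:30 on 2023-11-05 occurs twice: once before the clocks go back
# (fold=0, still on daylight time) and once after (fold=1, standard time).
first = datetime(2023, 11, 5, 1, 30, fold=0, tzinfo=tz)
second = datetime(2023, 11, 5, 1, 30, fold=1, tzinfo=tz)

print(first.utcoffset() == timedelta(hours=-7))   # fold=0 -> UTC-07:00 (PDT)
print(second.utcoffset() == timedelta(hours=-8))  # fold=1 -> UTC-08:00 (PST)
```

By this reading, fold=0 should give the -0700 offset, which is why the -0800 result from the 2.1.2 release looks wrong.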
Installed Versions