Closed mgmarino closed 3 years ago
See the relevant documentation for the Java Duration class.
The docs there note that negative values are not part of the ISO 8601 standard.
My suspicion, however, is that many users need to parse "ISO8601-like" strings that include these extensions. This is my case as well. As such, I would propose supporting the negative sign the way Java Duration does, e.g.:
"PT-6H3M" -- parses as "-6 hours and +3 minutes"
"-PT6H3M" -- parses as "-6 hours and -3 minutes"
"-PT-6H+3M" -- parses as "+6 hours and -3 minutes"
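To make the Java-style sign rules proposed above concrete, here is a minimal sketch in plain Python. It is not the pandas implementation: `parse_duration` and `_DURATION_RE` are hypothetical names, it returns a standard-library `datetime.timedelta` rather than a `Timedelta`, and it only handles the day, hour, minute and second components. The assumption is that a leading sign negates the whole duration, while a sign on a component applies to that component only.

```python
import re
from datetime import timedelta

# Hypothetical helper, not part of pandas. Each component may carry its own
# sign, and an optional sign before "P" negates the whole duration.
_DURATION_RE = re.compile(
    r"^(?P<sign>[+-])?P"
    r"(?:(?P<days>[+-]?\d+)D)?"
    r"(?:T(?=[+-]?\d)"  # "T" must be followed by at least one time component
    r"(?:(?P<hours>[+-]?\d+)H)?"
    r"(?:(?P<minutes>[+-]?\d+)M)?"
    r"(?:(?P<seconds>[+-]?\d+(?:\.\d+)?)S)?"
    r")?$"
)


def parse_duration(text: str) -> timedelta:
    """Parse an ISO 8601-like duration using Java-Duration-style signs."""
    match = _DURATION_RE.match(text)
    if match is None:
        raise ValueError(f"not a supported duration: {text!r}")
    parts = match.groupdict()
    fields = ("days", "hours", "minutes", "seconds")
    if all(parts[name] is None for name in fields):
        raise ValueError(f"duration has no components: {text!r}")
    total = timedelta(
        days=int(parts["days"] or 0),
        hours=int(parts["hours"] or 0),
        minutes=int(parts["minutes"] or 0),
        seconds=float(parts["seconds"] or 0),
    )
    # The leading sign flips every component at once.
    return -total if parts["sign"] == "-" else total
```

Under these assumptions, `parse_duration("PT-6H3M")` equals `timedelta(hours=-6, minutes=3)` and `parse_duration("-PT-6H+3M")` equals `timedelta(hours=6, minutes=-3)`. Fractional seconds go through `float` here, so nanosecond precision is not preserved.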
While investigating this, I found a thread on the PostgreSQL mailing list discussing the same issues, which references an extension to ISO 8601: https://www.postgresql.org/message-id/9q0ftb37dv7.fsf%40gmx.us
Related to #37159, #29773, and #36204. This issue splits out only the behavior of the negative sign when parsing ISO 8601 durations.
The current behavior is somewhat counterintuitive:
"P-6DT0H50M3.010010012S"
parses as Timedelta(days=-6, minutes=50, seconds=3, milliseconds=10, microseconds=10, nanoseconds=12), and the negative sign is only allowed right after the P descriptor. A negative sign in any other position will raise an error.
This comment notes that the original ISO 8601 spec doesn't mention negative values at all, but that some "extensions" (e.g. Java's Duration) do support them. I have been unable to find the detailed ISO 8601 spec.
As far as I can tell, there are a few possibilities to deal with this here:
"-P6DT1H" = Timedelta('-7 days +23:00:00')
and/or
"P7DT-1H3M" = Timedelta('6 days 23:03:00')
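The two results above follow from plain duration arithmetic, which can be checked with the standard-library `datetime.timedelta` (used here only as a stand-in for `Timedelta`; the variable names are illustrative). A leading sign negates the sum of all components, while a sign on a single component affects only that component.

```python
from datetime import timedelta

# "-P6DT1H": the leading sign negates the whole duration, -(6 days + 1 hour).
whole_negated = -(timedelta(days=6) + timedelta(hours=1))

# "P7DT-1H3M": only the hour component is negative, 7 days - 1 hour + 3 minutes.
component_negated = timedelta(days=7) + timedelta(hours=-1) + timedelta(minutes=3)

print(whole_negated)      # -7 days, 23:00:00
print(component_negated)  # 6 days, 23:03:00
```

The standard library normalizes a negative duration to a negative day count plus a positive time of day, which is why -6 days and -1 hour is displayed as -7 days and 23 hours; pandas shows the same value as '-7 days +23:00:00'.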
_Originally posted by @mgmarino in https://github.com/pandas-dev/pandas/pull/37159#discussion_r506726762_