Open talebbits opened 7 years ago
What fails, how and under which circumstances ?
when timestr = '2017-08-29T09:16:45.0631274-05:00'
this function does no handle this format https://github.com/rohe/pysaml2/blob/fd7a4f694b137a92f2a8b7f502d51fc21e3528c7/src/saml2/time_util.py#L232
Where does this timestr comes from ?Is it an IssueInstant
, a NotBeforeOrAfter
? Can we get the stacktrace ?
I think this is a valid bug and we probably need to come up with a different regexp to validate dateTime even if the one we use now matches the most commonly used format
It comes as <samlp:Response
and it is in every datetime attribute including IssueInstant
I think that this could be helpfull
import datetime
import dateutil.parser
def getDateTimeFromISO8601String(s):
d = dateutil.parser.parse(s)
return d
s = '2017-08-29T09:16:45.0631274-05:00'
getDateTimeFromISO8601String(s)
datetime.datetime(2017, 8, 29, 9, 16, 45, 63127, tzinfo=tzoffset(None, -18000))
Tested on Python 2.7.12. It also worked with the following combinations
s = '2017-08-29T09:16:45.0631274-05:00'
s = '2017-08-29T09:16:45.0631274'
s = '2017-08-29T09:16:45.0631'
s = '2017-08-29T09'
s = '2017-08-29 03:01:05'
# italian format
s = '29/08/2017 03:01:05'
# also if string seems truncated
s = '2017-08-29T'
In other words we could subsitute the actual str_to_time function with dateutil.parser.parse or, more in the deep, refactor the entire time_util.py file with standard python libraries.
Looking at the SAML2 xsd, I see that
<attribute name="IssueInstant" type="dateTime" use="required"/>
IssueInstant
is of type xsd:dateTime
. The XML Schema datatypes spec mandates
The ·lexical space· of dateTime consists of finite-length sequences of characters of the form:
'-'? yyyy '-' mm '-' dd 'T' hh ':' mm ':' ss ('.' s+)? (zzzzzz)?
This means that 2017-08-29T09:16:45.0631274-05:00
is a valid value which makes this issue a proper bug report, and it should be fixed 👍
According to the dateTime datatype spec, it is mandatory to include the year, the month, the day, the hour, the minutes and the seconds. One can omit only the second fragments and the timezone info. This means that some of the results that dateutil.parser.parse
returns (as proposed by @peppelinux) are invalid.
I feel it is OK to relax this rule and accept datetime values such as 29/08/2017 03:01:05
and 2017-08-29T
which would otherwise be invalid by the spec. If people think this should be strict we can do that by fixing the TIME_FORMAT_WITH_FRAGMENT
regex and make consistent use of that.
I also agree that time_utils.py
should be refactored to use standard built-in methods to do time calculations.
I went through the code and it seems pysaml2
uses many different libs:
time
builtin moduledatetime
builtin modulecalendar
builtin moduledateutil
libpytz
libAdditionally, not all time-related operations are handled through time_utils.py
.
I also went through the XML Schema spec about dateTime datatype and I feel very weird reading that
The ·value space· of dateTime is closely related to the dates and times described in ISO 8601
meaning that the dateTime
datatype does not represent/support the full ISO 8601 specification, and at the same time there is no clear text about the differences. I cannot understand why someone
found it better to semi-specify a new format instead of taking advantage of a specification about the exact thing they wanted to specify in the first place 😞.
Regardless, I started looking for libraries with ISO 8601 support. To my surprise, there is not much support for a well-defined standard, but there are lots of libs. I looked at the python modules {time
, datetime
, calendar
}, and libraries: pytz
, arrow
, moment
, iso8601
, isodatetime
, ciso8601
, numpy/datetime64
, pandas/timeseries
and finally aniso8601
.
I think that aniso8601
stands out as the most complete implementation that is taking edge cases seriously. Additionally, it is a pure python implementation and works with the standard datetime python type. I think a refactoring should be based on that lib.
@c00kiemon5ter you're right, I didn't want to introduce invalid datetime lenght but only show dateutils usage flexibility. Nothing more. I agree that iso format should be mandatory.
aniso8601 looks good.
I found also: pyiso8601 and iso8601utils It seems also that starting from Python 3.7 there's a native support with colon delimiters in UTC offsets. It also come with datetime.fromisoformat function.
Coming back to this.
The SAML2 core specification specifies:
1.3.3 Time Values
All SAML time values have the type
xs:dateTime
, which is built in to the W3C XML Schema Datatypes specification [Schema2], and MUST be expressed in UTC form, with no time zone component. SAML system entities SHOULD NOT rely on time resolution finer than milliseconds. Implementations MUST NOT generate time instants that specify leap seconds.
This conflicts with the dateTime
datatype specification as it disallows the timezone component and considers all dates to have been specified as UTC beforehand. The dateTime
specification defines two timelines - one for timezoned dateTime
s and one for untimezoned dateTime
s. The SAML2 spec essentially dictates that there is only one timeline, always timezoned as UTC.
This makes things easier for the implementers, as for example, they need not consider how to compare dateTime
s from different timelines (a timezoned dateTime
with an untimezoned datetime
.)
As such, the originally posted date 2017-08-29T09:16:45.0631274-05:00
is invalid, as it contains timezone info -05:00
. The correct representation is 2017-08-29T14:16:45.063127
(converted to UTC/Z/±00:00 and timezone info removed.)
The question now becomes whether implementations actually hold that promise, and if not, how do we handle it..
Personally I would like to implement the most flexible and smart way to handle time, with or without timezone, with the possibility to succesfully handle exceptions on theri occurrence.
It's more like a hopefull whish, not a real TODO
This was further discussed in the mailing list and ..it seems everything is backwards. This is what the technical committee had to say:
Subject: Re: [saml-dev] dateTime "time zone component" Date: Wed, 2 Oct 2013 15:53:00 +0000
The SAML spec has this part about "MUST be expressed in UTC form, with no time zone component." which would literally support the format given above (I think; xmlschema dateTime itself would certainly allow for that) but as Ian (and Olav, I think) point out in this thread
https://groups.google.com/forum/#!topic/simplesamlphp/7m07m9gLzzA
the spec probably intended to say no timezone /offset/?
I would think so.
-- Scott
which means that the timezone information should be there, but it is only allowed to be 'Z'.
The plan is:
All dates are normalised to UTC and can be compared and worked-with under that assumption. Dates we produce should be in ISO-8601 format always in UTC with timezone info present as 'Z'.
ValueError: time data '2017-08-29T09:16:45.0631274-05:00' does not match format '%Y-%m-%dT%H:%M:%SZ'
https://github.com/rohe/pysaml2/blob/fd7a4f694b137a92f2a8b7f502d51fc21e3528c7/src/saml2/time_util.py#L19