plone / guillotina

Python AsyncIO data API to manage billions of resources
https://guillotina.readthedocs.io/en/latest/
Other
187 stars 51 forks source link

[Question] Datetime deserialization format #974

Open masipcat opened 4 years ago

masipcat commented 4 years ago

Hi,

I'm opening this issue to ask you what's the expected format of datetimes. https://github.com/plone/guillotina/blob/c2aba28395f8399680ec0b0ea84de05e94d972de/guillotina/json/deserialize_value.py#L164-L168

If we only expect ISO-8601 I think we could consider using dateutil.parser.isoparse.

Simple benchmark (python 3.8):

$ python -m timeit -s 'from dateutil.parser import parse;'  'parse("2019-01-26T10:00:00.000001+01:00")'

5000 loops, best of 5: 79.7 usec per loop
$ python -m timeit -s 'from dateutil.parser import parse, isoparse;'  'isoparse("2019-01-26T10:00:00.000001+01:00")'

20000 loops, best of 5: 10.6 usec per loop

Also, we could consider using pendulum parser (this function returns <class 'datetime.datetime'>!):

$ python -m timeit -s 'from pendulum.parsing import parse;'  'parse("2019-01-26T10:00:00.000001+01:00")'

200000 loops, best of 5: 1.56 usec per loop

What do you think?

vangheem commented 4 years ago

I guess we could fallback if it was not parsable by the faster parsers.

I would be worried about removing parsing capability though.

Is it worth it if we have fallbacks?

masipcat commented 4 years ago

I guess it depends on which formats we expect to support.

Pendulum, by default (strict=True), supports the RFC 3339 format, most ISO 8601 formats and some other common formats (according to their doc https://pendulum.eustace.io/docs/#parsing).

Examples:

image

In case we want to support other formats too, we can just set strict=False and in that case pendulum calls dateutil.parse as a fallback (https://github.com/sdispater/pendulum/blob/c99f24215faf1fca544858a7a1ec6ff3538aa34f/pendulum/parsing/__init__.py#L107-L137).

PS: I reimplemented the IDatetime deserializer in our application to fit our needs (for a different thing) but we are ok only supporting RFC 3339/ ISO 8601. That's why I'm asking which dates formats we expect and if it's interesting to have it in Guillotina or not :)

vangheem commented 4 years ago

Okay, it seems to me that strict=False is the best option.

I think the design was meant to allow guillotina to be permissive and provide good parsing support for generic formats. I don't actually like that because I think dates should follow common iso formatting rules; however, I imagine we'll break people if we remove it.

masipcat commented 4 years ago

Also, if you want we can do pendulum optional. If it's installed we use Pendulum parser's with strict=False. Otherwise, we use dateutil.parse as it is now. Sound good?

vangheem commented 4 years ago

sounds great!