Fatal1ty / mashumaro

Fast and well tested serialization library
Apache License 2.0

datetime parsing does not handle generic ISO-8601 strings #27

Closed by baodrate 3 years ago

baodrate commented 3 years ago

Currently the generated from_dict code calls datetime.fromisoformat():

https://github.com/Fatal1ty/mashumaro/blob/53cd94c34006aff7f8b4eaa29d7b5d7037ee4bfd/mashumaro/serializer/base/metaprogramming.py#L468-L471

fromisoformat is only designed to invert the strings generated by datetime.isoformat():

This does not support parsing arbitrary ISO 8601 strings - it is only intended as the inverse operation of datetime.isoformat(). A more full-featured ISO 8601 parser, dateutil.parser.isoparse is available in the third-party package dateutil.
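
The limitation in the quoted documentation can be seen with the standard library alone. This is a minimal sketch with illustrative sample values (they are not taken from the issue), and the outcome for the "Z" suffix depends on the interpreter version:

```python
from datetime import datetime, timezone

# Round trip: fromisoformat() inverts what isoformat() produces.
original = datetime(2019, 1, 2, 7, 0, 0, tzinfo=timezone.utc)
text = original.isoformat()  # "2019-01-02T07:00:00+00:00"
assert datetime.fromisoformat(text) == original


def try_parse(value: str):
    """Return the parsed datetime, or None if fromisoformat() rejects the string."""
    try:
        return datetime.fromisoformat(value)
    except ValueError:
        return None


# A valid ISO 8601 string with a "Z" suffix is rejected before Python 3.11.
result = try_parse("2019-01-02T07:00:00Z")
print(result)  # None on Python 3.10 and earlier, a datetime on 3.11+
```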

According to mashumaro's documentation:

use_datetime: False  # False - load datetime oriented objects from ISO 8601 formatted string, True - keep untouched

I believe handling generic ISO-8601 strings should be within mashumaro's scope.

Fatal1ty commented 3 years ago

Yes, this is a real problem. The native fromisoformat doesn't cover all cases, but it's the fastest way to parse most ISO 8601 datetime strings: https://github.com/closeio/ciso8601/issues/66#issuecomment-404848813. But you're right that handling generic ISO-8601 strings is within mashumaro's scope, so I decided to add a way to choose the parser through the field's metadata:

@dataclass
class DataClass(DataClassDictMixin):
    # "ciso8601" selects the third-party ciso8601 parser for this field
    x: datetime = field(metadata={"deserialize": "ciso8601"})

See more examples here: https://github.com/Fatal1ty/mashumaro/tree/datetime-parsers#using-field-metadata

Moreover, I think this is a natural way to make mashumaro more extensible, and I'm going to use it for other things such as https://github.com/Fatal1ty/mashumaro/issues/11. It would also be convenient to configure the default behaviour for all fields in one place, like the Meta class in Django models.

Fatal1ty commented 3 years ago

Starting with version 1.18, the deserialization method can be changed through field metadata: https://github.com/Fatal1ty/mashumaro#using-field-metadata

Fatal1ty commented 3 years ago

To fully cover the original problem, here is an example with dateutil:

from datetime import datetime
from dataclasses import dataclass, field
from mashumaro import DataClassDictMixin
import dateutil.parser

@dataclass
class A(DataClassDictMixin):
    x: datetime = field(metadata={"deserialize": dateutil.parser.isoparse})

print(A.from_dict({"x": "2019-01-02T07:00:00Z"}))
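
Since the "deserialize" value in the example above is a plain callable, a dependency-free fallback could be passed the same way. The sketch below is an assumption, not part of mashumaro or dateutil: parse_iso_z is a hypothetical helper that only normalizes a trailing "Z" before calling datetime.fromisoformat(), so it still does not cover the full ISO 8601 grammar.

```python
from datetime import datetime, timezone


def parse_iso_z(value: str) -> datetime:
    """Hypothetical helper: accept a trailing "Z" (UTC) on Python 3.7 and later."""
    if value.endswith("Z"):
        # Rewrite the UTC designator into an offset that fromisoformat() accepts.
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


parsed = parse_iso_z("2019-01-02T07:00:00Z")
print(parsed.isoformat())  # 2019-01-02T07:00:00+00:00
```

It would be wired in as field(metadata={"deserialize": parse_iso_z}), mirroring the dateutil example.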

Bao-gxg commented 3 years ago

Perfect. I'll test it out soon, thank you!

movermeyer commented 1 year ago

As of Python 3.11, datetime.fromisoformat can handle all the important bits of the ISO 8601 spec.

I'll be backporting its code in https://github.com/movermeyer/backports.datetime_fromisoformat/issues/21, so it'll be available for earlier versions of Python 3.

FWIW, ciso8601 is still faster than any other parser.