konradhalas / dacite

Simple creation of data classes from dictionaries.
MIT License
1.72k stars · 107 forks

Review current scope of the library (aka get rid of unused features) #38

Closed konradhalas closed 5 years ago

konradhalas commented 5 years ago

I'm not sure about some features currently implemented in dacite. The main goal of this project is to build a data class from a plain dictionary. It's not a serialization/deserialization or validation library. There are many such libs, e.g. DRF or marshmallow, and I don't want to create another one.

I'm talking about the following features:

Even from a code point of view, all of those features live in a separate module - config - and can be decoupled from data classes entirely. So maybe this is a good idea for a new library which would transform your dictionary into a different dictionary according to provided rules (remap, flattened, prefixed...), but I don't know whether we should have such features in dacite.

On the other hand, it's easier for users to install one lib instead of two.

So I see the following solutions:

  1. Do not change anything - leave it as it is
  2. Get rid of them
  3. Make it 100% decoupled from data classes, e.g.
dacite.from_dict(
    data_class=X, 
    data=dacite.transform_data(data, config=TransformConfig(...)), 
    config=Config(...),
)
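For illustration, a rough sketch of what option 3's decoupled transform step could look like; `transform_data`, `remap`, and `flatten` are hypothetical names invented here, not part of dacite's API:

```python
# Hypothetical sketch of a standalone dict-to-dict transform, fully
# decoupled from data classes as option 3 suggests. Nothing here depends
# on dacite; the resulting dict could then be fed to dacite.from_dict.

def transform_data(data, remap=None, flatten=None):
    """Return a new dict with keys remapped and selected nested dicts flattened."""
    result = {}
    for key, value in data.items():
        if flatten and key in flatten and isinstance(value, dict):
            result.update(value)  # hoist nested keys to the top level
        else:
            result[remap.get(key, key) if remap else key] = value
    return result

raw = {'user_name': 'alice', 'meta': {'age': 30}}
clean = transform_data(raw, remap={'user_name': 'name'}, flatten={'meta'})
assert clean == {'name': 'alice', 'age': 30}
```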

Nr 2 is my favourite one.

What do you think @rominf @jasisz?

It's a good time for such decisions - I want to release 1.0.0 soon.

rominf commented 5 years ago

I've never used those features. Also, I think their implementation is not the best (it's hard to use them without extensive reading of documentation). I agree about existing libraries. I think before dropping those features it would be nice to provide examples in the documentation of how to use dacite in combination with existing libraries for hard cases. You mentioned DRF and marshmallow, but for me, it's non-obvious how to use them in combination with dacite.

condemil commented 5 years ago

I use cast / transform, but I don't mind if you get rid of all of them, as long as I am able to pass my own method that will do the transformation (no matter what transformation it is - cast, flatten, or something else).

This is how it can be done:

dacite.from_dict(
    data_class=X, 
    data=data, 
    transform=some_method,
)

This way you will leave open the possibility to do the same things I was able to do previously, just in a different way.
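The same effect is already achievable today by applying the callable yourself before constructing the object; the `transform=` parameter above is a proposal, not an existing `from_dict` argument. A minimal sketch (plain dataclass construction is used here to keep the example self-contained):

```python
# Sketch of pre-transforming raw data with an arbitrary callable before
# building the data class. `some_method` is an illustrative user-supplied
# function; any remap/flatten/cast logic could live inside it.
from dataclasses import dataclass

@dataclass
class X:
    name: str

def some_method(data):
    # e.g. remap a legacy key to the data class field name
    return {'name': data['user_name']}

x = X(**some_method({'user_name': 'alice'}))
assert x == X(name='alice')
```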

pgilad commented 5 years ago

I don't use the extra features, but I think they're nice to have. The general notion should be whether this library aims to be top-performant, or feature rich (can't have them both usually).

My use-case was wanting to model json-responses into data classes. I don't really care about the rest of the features, but I can clearly see how in development environment I'd want to do validation, although that could be the speciality of a different library (or plugin).

I wonder what the performance cost is of having all the features present but turned off. You could benchmark a version that parses light JSONs and heavy JSONs, with 2 variations:

If the difference in performance is negligible (1-10%), then I think you should keep them.
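A minimal sketch of the kind of benchmark described above, using only the stdlib so it stays self-contained; to measure dacite itself, you would replace the timed call with `dacite.from_dict` for each variant of the library:

```python
# Time parsing of a light and a heavy payload. json.loads stands in for
# the construction step being benchmarked; payload shapes are illustrative.
import json
import timeit

light = json.dumps({'a': 1})
heavy = json.dumps({f'key{i}': list(range(50)) for i in range(200)})

for name, payload in [('light', light), ('heavy', heavy)]:
    seconds = timeit.timeit(lambda: json.loads(payload), number=1000)
    print(f'{name}: {seconds:.4f}s for 1000 parses')
```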

AngelEzquerra commented 5 years ago

I feel very strongly that the Config functionality should be kept. Even though there is some overlap between those features and what more complex "validation" libraries such as marshmallow offer, I believe that the current dacite design offers a great balance between simplicity and power.

A recent example that I faced: I had a dict with some fields that used strings to represent datetime objects. To import those into a dataclass using dacite I simply had to do:

from dataclasses import dataclass, field
from datetime import datetime

import dacite

@dataclass
class RequestClass:
    time: datetime = field(default_factory=datetime.now)

request_dict = {'time': '2019-04-19T13:20:47.723787'}
request_obj = dacite.from_dict(
    RequestClass,
    request_dict,
    config=dacite.Config(transform={'time': datetime.fromisoformat}),
)

Doing this with marshmallow would have been way more complex. You would need to define a Schema "copy" of the RequestClass data structure. In this case there is a single field, but if we had, let's say, 20 fields in RequestClass, the marshmallow schema would need to define every one of those 20 fields even though we only want to transform a single one of them.

So I would highly encourage you guys to keep the current functionality. It is really useful and easy to use.

konradhalas commented 5 years ago

Guys, thank you very much for your feedback, I really appreciate it.

I think that the final decision is:

1) let's remove everything related to dict shape transformation (flattened, remap, prefixed)
2) we will keep everything based on type information retrieved from the data class (cast, cast_all (not implemented yet))
3) last but not least - I will make transform behave in a different way - instead of "field-name based" ({"my_field": datetime.fromtimestamp}) it will be "type based" ({datetime: datetime.fromtimestamp}) + I will probably rename this param

Thank you once again.

konradhalas commented 5 years ago

FYI - new version of transform is almost ready to merge - please check https://github.com/konradhalas/dacite/pull/55.

New Config will have only "type-based" transform feature, so instead of {field_name: callback} you will have to provide {field_type: callback}. I think that it will cover many cases.

Simple example (based on @AngelEzquerra case):

from dataclasses import dataclass
from datetime import datetime

import dacite

@dataclass
class X:
    date: datetime

data = {
    'date': '2019-04-19T13:20:47.723787',
}

x = dacite.from_dict(
    data_class=X,
    data=data,
    config=dacite.Config(transform={datetime: datetime.fromisoformat}),
)

assert x == X(date=datetime(2019, 4, 19, 13, 20, 47, 723787))