best-doctor / import_me

Python library to simplify importing data from xls/xlsx
MIT License

Add DTO support #52

Closed PashaWNN closed 4 days ago

PashaWNN commented 5 days ago

This PR adds support for defining parsers as Pydantic or dataclass DTOs.

It can be used like this:

    class PersonDto(pydantic.BaseModel, PydanticImportableDtoMixin):
        parser_base = BaseXLSXParser
        row_index: int
        first_name: str = pydantic.Field(**{META_HEADER: 'First Name'})
        last_name: str = pydantic.Field(**{META_HEADER: 'Last Name'})

    xlsx_file = xlsx_file_factory(
        data=[
            ['First Name', 'Last Name'],
            ['Ivan', 'Ivanov'],
            ['Petr', None],
        ],
    )
    persons = PersonDto.parse_from_file(file_contents=xlsx_file)

    assert isinstance(persons, ParsingResult)
    assert persons.parsed_items == [
        PersonDto(
            first_name='Ivan',
            last_name='Ivanov',
            row_index=1,
        ),
    ]
    # Row 2 has no last name, so it is reported in errors instead of parsed_items.
    assert persons.errors == {2: ['row: 2, column: 1, Column Last Name is required.']}

Users can also add support for other DTO backends easily:

    def get_marshmallow_fields(dto: marshmallow.Schema) -> list[DtoField]:
        ...  # the code to extract Column attributes from DTO

    class MarshmallowImportableDtoMixin(BaseImportableDtoMixin):
        fields_getter = get_marshmallow_fields
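
To give a rough idea of what such a fields getter might look like, here is a self-contained sketch for standard-library dataclasses. It is not the code from this PR: `DtoFieldSketch`, `get_dataclass_fields_sketch`, `PersonSketch` and the `'header'` value of `META_HEADER` are hypothetical stand-ins, and the real `DtoField` and `META_HEADER` in import_me may be shaped differently.

```python
import dataclasses
from typing import Optional

# Hypothetical stand-in: the real META_HEADER constant comes from import_me.
META_HEADER = 'header'


@dataclasses.dataclass(frozen=True)
class DtoFieldSketch:
    """Hypothetical stand-in for DtoField; the real class may differ."""
    name: str
    header: Optional[str]
    required: bool


def get_dataclass_fields_sketch(dto: type) -> list[DtoFieldSketch]:
    """Collect the name, column header and required flag of each dataclass field."""
    result = []
    for field in dataclasses.fields(dto):
        # A field counts as optional if it has a default or a default factory.
        has_default = (
            field.default is not dataclasses.MISSING
            or field.default_factory is not dataclasses.MISSING
        )
        result.append(
            DtoFieldSketch(
                name=field.name,
                header=field.metadata.get(META_HEADER),
                required=not has_default,
            )
        )
    return result


@dataclasses.dataclass
class PersonSketch:
    first_name: str = dataclasses.field(metadata={META_HEADER: 'First Name'})
    last_name: str = dataclasses.field(metadata={META_HEADER: 'Last Name'})
    middle_name: Optional[str] = dataclasses.field(
        default=None, metadata={META_HEADER: 'Middle Name'},
    )


sketch_fields = get_dataclass_fields_sketch(PersonSketch)
```

The getter only reads metadata that the DTO backend already exposes, which is presumably why a new backend needs little more than a function like this plus a mixin that points `fields_getter` at it.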