dfurtado / dataclass-csv

Map CSV to Data Classes

Populate dataclass fields in their order with row data #27

Closed mschwerhoff closed 3 years ago

mschwerhoff commented 4 years ago

As a convenient alternative to explicitly mapping column names to field names when they don't match directly, it would be handy if dataclass-csv could populate a dataclass's fields in the order they are declared. This would be particularly useful for CSV files that have many columns with cryptic headers (whose generation is not under the user's control).

E.g. given the following CSV file

firstname, age
Jean, 77
Jim, 22
...

and dataclass

@dataclass
class Person:
  forename: str
  age: int

I would expect the following Person instances to be created:

Person(forename='Jean', age=77)
Person(forename='Jim', age=22)
...

That is, the fields from top to bottom are initialised with the row data from left to right.
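The requested behaviour can be sketched with the standard library alone. This is only an illustration of the idea, not dataclass-csv's API: the helper `read_positional` is a hypothetical name, and the sketch assumes every field is annotated with a simple callable type such as `str` or `int`.

```python
import csv
import io
from dataclasses import dataclass, fields


@dataclass
class Person:
    forename: str
    age: int


def read_positional(csv_file, cls):
    """Yield cls instances, filling fields in declaration order from each row.

    Hypothetical helper; not part of dataclass-csv.
    """
    reader = csv.reader(csv_file, skipinitialspace=True)
    next(reader, None)  # skip the header row; its names are ignored
    field_list = fields(cls)
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) != len(field_list):
            raise ValueError(
                f"expected {len(field_list)} columns, got {len(row)}"
            )
        # Convert each cell with the field's annotated type
        # (assumes simple types such as str and int).
        yield cls(*(f.type(cell) for f, cell in zip(field_list, row)))


sample = io.StringIO("firstname, age\nJean, 77\nJim, 22\n")
people = list(read_positional(sample, Person))
```

The header row is read and discarded, so the mismatch between `firstname` and `forename` never matters; only the column position does.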

dfurtado commented 4 years ago

Hi, thank you so much for creating this issue.

It is already possible to achieve this by using the mapping feature, like so:

reader = DataclassReader(personfile, Person)
reader.map('firstname').to('forename')

It would be a good feature for sure, but the problem I see with automatically assuming the ordering is that data can silently end up in the wrong fields of the dataclass.
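A small sketch of that risk, using plain Python rather than dataclass-csv itself. The `surname` field and the `from_row` helper are hypothetical, introduced only for illustration; the point is that two columns of the same type can be swapped without any error being raised.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    forename: str
    surname: str  # hypothetical second field, for illustration only


def from_row(cls, row):
    # Hypothetical positional fill: the n-th cell goes to the n-th declared field.
    return cls(*(f.type(cell) for f, cell in zip(fields(cls), row)))


# The file's columns are (surname, forename), the opposite of the declaration order.
header = ["surname", "forename"]
row = ["Smith", "Jean"]

# No error is raised, but the values land in the wrong fields.
swapped = from_row(Person, row)

# A name-based fill is unaffected by the column order.
by_name = Person(**dict(zip(header, row)))
```

Here `swapped.forename` holds the surname, while the name-based version assigns both values correctly. When the mismatched columns have different types, a conversion error would at least surface the problem; with same-typed columns nothing does.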

mschwerhoff commented 4 years ago

I agree with you that relying on the order is risky, and that, since map already covers this case, the feature would ultimately not add new functionality. It would be handy for prototyping with CSV files that have many columns, though. If the feature were added, it would probably be a good idea to document its potential unsafety in the API documentation.