open-sdg / sdg-build

Python package to convert SDG-related data and metadata between formats
MIT License

Allow a dtype parameter in CSV data input #271

Closed brockfanning closed 3 years ago

brockfanning commented 3 years ago

Fixes https://github.com/open-sdg/open-sdg/issues/1271

This allows a dtype parameter to be passed into the InputCsvData class, which in turn passes it along to read_csv().

One reason this might be useful is when a column (GeoCode, for example) contains only numeric values but needs to be treated as a string. For example, the value 0227 may need to be kept exactly as 0227 rather than being automatically converted to a number such as 227 or 227.0, as read_csv normally does for numeric columns.
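To illustrate the underlying behavior, here is a minimal sketch that calls pandas read_csv directly rather than going through InputCsvData. The sample data and the read_sample helper are made up for the example and are not part of sdg-build.

```python
import io

import pandas as pd

# Hypothetical sample data; the GeoCode values carry leading zeros.
CSV_TEXT = "Year,GeoCode,Value\n2019,0227,1.5\n2020,0031,2.5\n"


def read_sample(dtype=None):
    """Read the sample CSV, optionally forcing column types via dtype."""
    return pd.read_csv(io.StringIO(CSV_TEXT), dtype=dtype)


# Without dtype, pandas infers a numeric type and the leading zeros are lost.
default_df = read_sample()

# With dtype, GeoCode is kept as text, exactly as written in the file.
string_df = read_sample(dtype={"GeoCode": str})

print(default_df["GeoCode"].tolist())
print(string_df["GeoCode"].tolist())
```

Only the columns named in dtype are affected; Year and Value are still inferred as usual.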

Usage:

  - class: InputCsvData
    path_pattern: data/*.csv
    dtype:
      GeoCode: str
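
Note that in a YAML config the value str is parsed as the plain string "str", not the Python type. pandas accepts dtype names given as strings, so the mapping can be handed to read_csv as-is. The sketch below assumes this is roughly what happens; the input_config dict stands in for the parsed YAML entry, the CSV contents are invented, and how sdg-build forwards the value internally is not shown here.

```python
import io

import pandas as pd

# Stand-in for the parsed YAML entry above; YAML yields the string "str".
input_config = {
    "class": "InputCsvData",
    "path_pattern": "data/*.csv",
    "dtype": {"GeoCode": "str"},
}

# Hypothetical contents of one file that the path pattern might match.
csv_text = "Year,GeoCode,Value\n2020,0227,3.2\n"

# The dtype mapping from the config is passed straight to read_csv.
df = pd.read_csv(io.StringIO(csv_text), dtype=input_config.get("dtype"))
geocodes = df["GeoCode"].tolist()
print(geocodes)
```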