frictionlessdata / tabulator-py

Python library for reading and writing tabular data via streams.
https://frictionlessdata.io
MIT License

Move Stream force_strings behaviour to processor so it applies to sample #233

Closed (alightwing closed 5 years ago)

alightwing commented 6 years ago

Currently, when force_strings is True, the row values returned by stream.read() are converted to strings, but the row values in stream.sample are not. We expect that if force_strings is passed to Stream, all output values will be strings regardless of how they are accessed.

This change moves the existing force_strings behaviour into a processor and appends it to the list of processors that __apply_processors applies to the dataset. This ensures that all rows the stream outputs are converted to strings.
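As a rough illustration of the approach described above, here is a minimal sketch. It assumes processors are generator functions over (row_number, headers, row) tuples, as in tabulator's documented custom post_parse processors. The names stringify_processor and apply_processors are hypothetical stand-ins and are not taken from this PR's diff.

```python
def stringify_processor(extended_rows):
    # Hypothetical processor: convert every cell of every row to a string.
    for row_number, headers, row in extended_rows:
        yield row_number, headers, [str(value) for value in row]


def apply_processors(extended_rows, processors):
    # Hypothetical stand-in for the stream's internal processor chain:
    # each processor wraps the iterator produced by the previous one.
    for processor in processors:
        extended_rows = processor(extended_rows)
    return extended_rows


rows = [
    (1, ['id', 'name'], [1, 'english']),
    (2, ['id', 'name'], [2, 3.5]),
]

# Appending the stringifying processor last means it sees the final rows,
# whether they end up in the sample or are returned by a read.
processed = list(apply_processors(iter(rows), [stringify_processor]))
```

Because both the sample and the read path would draw from the same processed iterator, the conversion only has to live in one place.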

alightwing commented 6 years ago

On further investigation, it turns out that force_strings, which invokes the helpers.stringify_value function, converts None values into the string 'None'. I didn't anticipate this, and it may make this PR un-landable.
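The effect is easy to reproduce in plain Python. The sketch below uses a hypothetical naive_stringify function as a stand-in; it assumes the helper behaves like the built-in str() for None, which matches the behaviour described in this comment but is not copied from the library source.

```python
def naive_stringify(value):
    # Stand-in only: assumes the helper falls back to str() for every value.
    return str(value)


row = [1, None, 2.5]
converted = [naive_stringify(value) for value in row]

# The missing value is no longer distinguishable from the literal text "None".
none_as_text = converted[1]
```

Once the row has been converted this way, a consumer can no longer tell a genuinely missing cell from a cell that contained the text None.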

Ideally, string conversion would be done by an inverse of the cast_value method in tableschema.field.Field. That method casts string values to the appropriate Python types; here we want the reverse, casting Python values to appropriate strings.
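A minimal sketch of what such an inverse might look like follows. The function name uncast_value and the specific mappings are assumptions made for illustration, not an existing tableschema API. In particular, mapping None to the empty string relies on the Table Schema default of treating "" as a missing value, and the lowercase boolean strings are one of several accepted spellings.

```python
import datetime
import decimal


def uncast_value(value):
    # Hypothetical inverse of a cast: Python value -> string form.
    if value is None:
        # Assumption: empty string represents a missing value.
        return ''
    if isinstance(value, bool):
        # Checked before the generic fallback so True does not become 'True'.
        return 'true' if value else 'false'
    if isinstance(value, (datetime.datetime, datetime.date, datetime.time)):
        # ISO 8601 text for temporal values.
        return value.isoformat()
    if isinstance(value, decimal.Decimal):
        # Fixed-point notation, avoiding exponent forms such as '1E+3'.
        return format(value, 'f')
    return str(value)


examples = [None, True, datetime.date(2018, 1, 2), decimal.Decimal('1.50'), 3]
strings = [uncast_value(value) for value in examples]
```

A real implementation would presumably be driven by the field's declared type and format rather than by isinstance checks, so that the output round-trips through cast_value.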