Open rumpuslabs opened 2 months ago
take
@alamb shouldn't the csv reader also throw an error because "bob" is not a valid dictionary?
I agree the discrepancy between Utf8 and Dictionary looks like a bug.
I think "bob" is a valid value for a DictionaryArray (whose values are Strings).
Describe the bug
Related to #7797
Empty strings in CSV files aren't being interpreted as null when the column type is Dictionary(_, Utf8).
To Reproduce
Create a simple input.csv file like this:
Run the following code:
Expected behavior
I was expecting the output to look like this:
But the full dataset is returned instead:
Additional context
Tested on v41.0.0
Replace DataType::Dictionary(Box::new(DataType::UInt8), Box::new(DataType::Utf8)) with DataType::Utf8 and it works.
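To make the expected behavior concrete, here is a minimal std-only sketch of dictionary-encoding a string column where an empty CSV field becomes null instead of a dictionary value. It is not the arrow CSV reader's implementation: `DictColumn`, `encode_fields`, and `decode` are hypothetical names, and the sample rows are made up for illustration (only "bob" comes from the discussion above).

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a dictionary-encoded string column.
/// `keys[i]` indexes into `values`; `None` marks a null row.
#[derive(Debug, Default)]
struct DictColumn {
    keys: Vec<Option<u8>>,
    values: Vec<String>,
}

/// Dictionary-encode raw CSV fields, treating an empty field as null
/// (the behavior this issue expects, assumed to match the Utf8 path).
fn encode_fields(fields: &[&str]) -> DictColumn {
    let mut col = DictColumn::default();
    let mut seen: HashMap<String, u8> = HashMap::new();

    for field in fields {
        if field.is_empty() {
            // Empty string: record a null key, add nothing to the dictionary.
            col.keys.push(None);
            continue;
        }
        let key = match seen.get(*field) {
            Some(&k) => k,
            None => {
                // First occurrence: append the value and remember its key.
                let k = u8::try_from(col.values.len())
                    .expect("more distinct values than a u8 key can index");
                col.values.push(field.to_string());
                seen.insert(field.to_string(), k);
                k
            }
        };
        col.keys.push(Some(key));
    }
    col
}

/// Resolve each key back to its string, keeping nulls as `None`.
fn decode(col: &DictColumn) -> Vec<Option<&str>> {
    col.keys
        .iter()
        .map(|k| k.map(|i| col.values[i as usize].as_str()))
        .collect()
}

fn main() {
    let col = encode_fields(&["alice", "", "bob", "alice"]);
    // The empty field is a null row, and "" never enters the dictionary.
    println!("keys   = {:?}", col.keys);
    println!("values = {:?}", col.values);
    println!("rows   = {:?}", decode(&col));
}
```

With this encoding, a null check on the column would skip the empty row, which is the result the Utf8 path already gives.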