skybrud / Skybrud.Umbraco.Redirects.Import

Import and export addon for Skybrud.Umbraco.Redirects.

CsvRedirectsProvider: BOM (ï»¿) at beginning of CSV file #4

Closed hfloyd closed 2 years ago

hfloyd commented 3 years ago

I've come across an issue with the CsvRedirectsProvider, when reading a locally-stored CSV file...

"MapCsvColumns()" was failing because it was reading the first column name with "ï»¿" in front when using the "Encoding.Auto" option.

Some research turned up the following: https://stackoverflow.com/questions/6260911/how-remove-the-bom%C3%AF-characters-from-a-utf-8-encoded-csv

I un-commented the bit of encoding-checking code you had present: [image]

When debugging, I see that it correctly determines the encoding as UTF-8, but then it throws a "Stream was not readable." exception when using that encoding.

If I pass in "CsvImportEncoding.Utf8" explicitly as the option, the stream reads without error, but now it reads the first column name as starting with "\ufeff", which is the BOM character itself.

I think "CsvFile.Load()", called here: [image], might need to be checked.

This CSV file was saved from MS Excel, which I recall seems to ALWAYS have encoding issues for some reason... but considering how common it is to save CSVs from Excel, I think the code needs to handle this.

I was able to strip it out using some code I found in https://stackoverflow.com/questions/1317700/strip-byte-order-mark-from-string-in-c-sharp
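For reference, here is a minimal sketch of the kind of workaround described above. It is not the code from the linked answer and not part of the package: `StripBom` is a hypothetical helper, and the semicolon-separated sample data is made up for illustration. It shows that decoding the raw bytes keeps the BOM as the first character of the first column name, and two ways to get rid of it.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

// Hypothetical helper (not part of Skybrud.Csv): drop a leading U+FEFF if present.
string StripBom(string value)
{
    return !string.IsNullOrEmpty(value) && value[0] == '\uFEFF'
        ? value.Substring(1)
        : value;
}

// Simulate a CSV saved as UTF-8 with a BOM: the bytes EF BB BF, then the header row.
// The column names and separator are assumptions for this example.
byte[] preamble = Encoding.UTF8.GetPreamble();
byte[] body = Encoding.UTF8.GetBytes("Url;Destination\n/old;/new\n");
byte[] fileBytes = preamble.Concat(body).ToArray();

// Encoding.GetString does not remove the BOM, so it ends up in the first column name.
string raw = Encoding.UTF8.GetString(fileBytes);
string firstColumnRaw = raw.Split(';')[0];

// Option 1: strip the BOM from the decoded string.
string firstColumnClean = StripBom(raw).Split(';')[0];

// Option 2: let StreamReader consume the BOM while reading.
using var reader = new StreamReader(
    new MemoryStream(fileBytes), Encoding.UTF8, detectEncodingFromByteOrderMarks: true);
string firstColumnFromReader = (reader.ReadLine() ?? "").Split(';')[0];

Console.WriteLine(firstColumnRaw.Length);       // one character longer than "Url"
Console.WriteLine(firstColumnClean);
Console.WriteLine(firstColumnFromReader);
```

Option 2 is usually the less fragile choice when the data comes from a file or stream, since the reader handles the BOM before any column mapping happens.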

abjerner commented 2 years ago

Hi @hfloyd

The package uses my Skybrud.Csv package for working with CSV files. I recently added some logic to that package for auto-detecting the encoding of the file to be imported, including functionality for detecting BOM headers for various Unicode encodings.
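As an illustration of what BOM-based detection can look like in general, here is a rough sketch. It is not the actual Skybrud.Csv implementation: `DetectEncodingFromBom` and `StartsWith` are hypothetical names, and the fallback encoding is an assumption made for the example.

```csharp
using System;
using System.Text;

// Hypothetical helper: true if 'data' begins with the given bytes.
bool StartsWith(byte[] data, params byte[] prefix)
{
    if (data.Length < prefix.Length) return false;
    for (int i = 0; i < prefix.Length; i++)
    {
        if (data[i] != prefix[i]) return false;
    }
    return true;
}

// Hypothetical helper: pick an encoding from the BOM, or return the fallback if none is found.
Encoding DetectEncodingFromBom(byte[] bytes, Encoding fallback)
{
    // UTF-32 LE must be tested before UTF-16 LE because both start with FF FE.
    if (StartsWith(bytes, 0xFF, 0xFE, 0x00, 0x00)) return new UTF32Encoding(bigEndian: false, byteOrderMark: true);
    if (StartsWith(bytes, 0x00, 0x00, 0xFE, 0xFF)) return new UTF32Encoding(bigEndian: true, byteOrderMark: true);
    if (StartsWith(bytes, 0xEF, 0xBB, 0xBF)) return Encoding.UTF8;
    if (StartsWith(bytes, 0xFF, 0xFE)) return Encoding.Unicode;
    if (StartsWith(bytes, 0xFE, 0xFF)) return Encoding.BigEndianUnicode;
    return fallback;
}

// Sample input: a UTF-8 BOM followed by the ASCII text "Url".
byte[] excelCsv = { 0xEF, 0xBB, 0xBF, (byte)'U', (byte)'r', (byte)'l' };
Encoding detected = DetectEncodingFromBom(excelCsv, Encoding.ASCII);

// Skip the BOM bytes when decoding so they do not leak into the first column name.
int skip = detected.GetPreamble().Length;
string text = detected.GetString(excelCsv, skip, excelCsv.Length - skip);

Console.WriteLine(detected.WebName);
Console.WriteLine(text);
```

A file without a BOM cannot be identified this way, so the fallback (or a further heuristic) still matters for real-life CSV files.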

To the extent that I've tested this, I haven't encountered any issues (although that might happen when the logic is exposed to real life CSV files).

Anyways, as this should now be handled, I'm closing the issue.