VertNet / bels

Biodiversity Enhanced Location Services
Apache License 2.0

Excel adds BOM to UTF-8 on save as CSV causing column header rename to fail #39

Open debpaul opened 2 years ago

debpaul commented 2 years ago

Issue: When Excel saves a CSV as UTF-8, it defaults to UTF-8-BOM.

Result: The BELS web app column rename fails, resulting in the following error message: ErrorMsg_Odd_IdoHaveCountryCode

Recreate this error using this file: testBELS1.csv

Details

Ideas: Others will use Excel, so what to do?

  1. Can the BOM '\ufeff' be stripped in the script that checks/maps/renames column headers?
  2. Would this affect user's ability to re-open results?
  3. Or do the instructions need to have people start in a text editor (rather than in Excel)?
  4. Need to try from Google Sheets (guessing all will be well)
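
As a rough illustration of idea 1, a header check could drop a leading `'\ufeff'` before any mapping or renaming takes place. This is only a sketch: `strip_bom` and the sample column names are hypothetical and are not taken from the BELS code.

```python
def strip_bom(header):
    """Return a copy of the header list with a leading BOM removed.

    Hypothetical helper, not part of BELS. Only the first column name
    can carry the BOM, because the BOM sits at the very start of the file.
    """
    if header and header[0].startswith('\ufeff'):
        return [header[0][1:]] + list(header[1:])
    return list(header)


# Example header as it would look after reading a UTF-8-BOM file as plain UTF-8.
raw_header = ['\ufeffcountry', 'countrycode', 'stateprovince']
clean_header = strip_bom(raw_header)
print(clean_header)
```

Headers without a BOM pass through unchanged, so the same call would work for files saved from a text editor.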

tucotuco commented 2 years ago

From @JasonBest via Slack.

@John Wieczorek John, I have only glanced at some of the code handling CSV at https://github.com/VertNet/bels/blob/main/bels/ and haven't wrapped my head around everything you're doing to determine the CSV encoding. But it looks like if you could find a way to sniff the presence of a BOM, you could use encoding='utf-8-sig' rather than just 'utf-8'. This would allow the CSV reader to handle the BOM.
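
A minimal sketch of that suggestion, assuming the file is already known to be some flavor of UTF-8: read the first three bytes, compare them with `codecs.BOM_UTF8`, and pick the encoding accordingly. The function names `detect_utf8_encoding` and `read_header` are hypothetical and do not come from the BELS code, which does its own encoding detection.

```python
import codecs
import csv
import os
import tempfile


def detect_utf8_encoding(path):
    """Return 'utf-8-sig' if the file starts with a UTF-8 BOM, else 'utf-8'."""
    with open(path, 'rb') as f:
        return 'utf-8-sig' if f.read(3) == codecs.BOM_UTF8 else 'utf-8'


def read_header(path):
    """Read the first CSV row using the sniffed encoding (hypothetical helper)."""
    with open(path, newline='', encoding=detect_utf8_encoding(path)) as f:
        return next(csv.reader(f))


# Demo: write a small CSV with a BOM, the way Excel's UTF-8 export does.
tmp = tempfile.NamedTemporaryFile(delete=False, suffix='.csv')
tmp.write(codecs.BOM_UTF8 + b'country,countrycode\r\nArgentina,AR\r\n')
tmp.close()

header = read_header(tmp.name)
os.remove(tmp.name)
print(header)
```

Python's `utf-8-sig` codec skips the BOM only if one is present when decoding, so for input known to be UTF-8 it could also be used unconditionally, without the sniffing step.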