cds-snc / notification-planning

Project planning for GC Notify Team

Automatically handle notification upload files not encoded in UTF-8 #796

Open jimleroyer opened 2 years ago

jimleroyer commented 2 years ago

Description

As a user preparing a CSV upload, I would like to upload my files with minimal friction, especially when they come out of different programs with different character encodings, so that I can send notifications as easily as possible, with no blockers and in a transparent manner.

WHY are we building?

Users have had a bad experience uploading files encoded with a Windows code page such as CP-1252. Notify rejects accented characters encoded in that character set without much explanation.

WHAT are we building?

An automatic conversion to UTF-8 when another code page is detected. Also, a better validation message for the user when such automatic conversion is not possible.
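To make the intended behaviour concrete, here is a minimal sketch using only the Python standard library. It assumes the upload is available as raw bytes and that CP-1252 is the only fallback worth trying; the function name `convert_upload_to_utf8` and the error wording are hypothetical, not existing Notify code.

```python
def convert_upload_to_utf8(raw: bytes) -> bytes:
    """Return the upload re-encoded as UTF-8, or raise ValueError.

    Hypothetical helper: tries UTF-8 first, then falls back to CP-1252.
    """
    try:
        # "utf-8-sig" also strips a leading byte order mark, if present.
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        try:
            # Assumption: CP-1252 is the most likely non-UTF-8 source.
            text = raw.decode("cp1252")
        except UnicodeDecodeError as exc:
            # Neither encoding fits: surface a message the user can act on.
            raise ValueError(
                "We could not read this file. Save it as CSV UTF-8 and "
                "upload it again."
            ) from exc
    return text.encode("utf-8")
```

Trying UTF-8 first matters: almost any byte sequence decodes as CP-1252, so reversing the order would silently corrupt files that were already valid UTF-8.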

VALUE created by our solution

A better user experience and fewer support requests.

Acceptance Criteria (Definition of done)

QA Steps

Additional info

The CSV upload error might be triggered from this point in the code:

A few Stack Overflow entries on potential solutions:

The chardet Python module is mentioned many times. Its detection result also includes a confidence score, which could be handy.
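As a rough sketch of how the confidence score could be used: `chardet.detect` returns a dict with `encoding` and `confidence` keys. The function name `to_utf8_with_detection`, the `detect` parameter, and the 0.8 threshold are assumptions for illustration only, not a tested recommendation.

```python
from typing import Callable, Optional


def to_utf8_with_detection(
    raw: bytes,
    min_confidence: float = 0.8,  # assumed threshold, to be tuned
    detect: Optional[Callable[[bytes], dict]] = None,
) -> bytes:
    """Re-encode raw bytes as UTF-8 based on a detected encoding.

    Hypothetical helper. `detect` defaults to chardet.detect and can be
    swapped for another detector that returns the same dict shape.
    """
    if detect is None:
        import chardet  # third-party: pip install chardet

        detect = chardet.detect

    guess = detect(raw)
    encoding = guess.get("encoding")
    confidence = guess.get("confidence") or 0.0

    # Refuse to convert when the detector is unsure, rather than guess.
    if encoding is None or confidence < min_confidence:
        raise ValueError(
            f"Could not determine the file encoding (confidence {confidence:.2f})."
        )
    try:
        text = raw.decode(encoding)
    except (UnicodeDecodeError, LookupError) as exc:
        raise ValueError(f"Could not decode the file as {encoding}.") from exc
    return text.encode("utf-8")
```

Detection on very short files tends to be unreliable, so rejecting low-confidence guesses with a clear validation message is probably safer than converting anyway.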

yaelberger-commits commented 2 years ago

Can we bring this forward for the refinement/estimation discussion this Wednesday, @jimleroyer, to help decide on a path forward (tech solution or documentation)? We can discuss the two cards together: this one and #792.