Fix: characters with diacritics are secretly two characters

When importing pretty much all of Jean's recordings, any character with a circumflex or a macron gets stored as two characters: the base letter plus a separate combining mark (e + ˆ) instead of the single character ê. There's a script in progress that's supposed to find the combining circumflex (U+0302, which is `\xcc\x82` in UTF-8) and replace each two-character sequence with the correct single character.
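To make the problem concrete, here is a small sketch of how the two forms differ. It is not taken from the import code; `describe` is a hypothetical helper, and the sample strings are made up for illustration.

```python
import unicodedata

# Two ways to write what looks like the same letter.
decomposed = "e\u0302"   # "e" followed by COMBINING CIRCUMFLEX ACCENT
precomposed = "\u00ea"   # single code point: LATIN SMALL LETTER E WITH CIRCUMFLEX


def describe(text):
    """Hypothetical helper: list the code points and Unicode names in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


print(describe(decomposed))
print(describe(precomposed))
print(decomposed.encode("utf-8"))  # the combining mark is the bytes \xcc\x82
```

The two strings usually render identically but compare as unequal, which is why the stored entries look right and still don't match.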

This script isn't working yet.

So far, the only way I've fixed these successfully is by correcting them manually.
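One possible approach, sketched below, is to let Unicode normalization do the replacement instead of searching for the bytes by hand. This is only an assumption about what might work here, not what the linked script does; `has_combining_marks` and `to_precomposed` are hypothetical names, and the sketch uses only Python's standard `unicodedata` module.

```python
import unicodedata


def has_combining_marks(text):
    """Return True if text contains any combining character (e.g. U+0302, U+0304)."""
    return any(unicodedata.combining(ch) for ch in text)


def to_precomposed(text):
    """Compose base letter + combining mark into one code point where one exists (NFC)."""
    return unicodedata.normalize("NFC", text)


# Made-up sample values, not real database rows.
samples = ["ne\u0302hiyawe\u0302win", "a\u0304"]
for sample in samples:
    if has_combining_marks(sample):
        fixed = to_precomposed(sample)
        print(repr(sample), "->", repr(fixed))
```

Note that NFC only composes true combining marks such as U+0302. If the stored data actually contains the spacing character ˆ (U+02C6), normalization will leave it alone and it would need an explicit replacement.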

The script is here: https://github.com/UAlbertaALTLab/recording-validation-interface/blob/production/validation/management/commands/lookforcombinedcharacters.py
