Closed mikegerber closed 1 year ago
Tested on Python 3.11.3, Windows WSL Debian
Reproducer:
❯ cat reproduce
#!/bin/bash
# Must be run in bash for "echo -ne" to work. (not sh!)
set -x
echo -ne "This is a test." > gt.txt
echo -ne "\xEF\xBB\xBFThis is a test." > ocr-just-bom.txt
ls -l *.txt
dinglehopper gt.txt ocr-just-bom.txt
grep cer report.json # should be 0
If I create an empty file gt.txt (0 bytes) and a file ocr.txt that only contains a BOM (3 bytes), dinglehopper computes a CER of Infinity. It should ignore the BOM. See also #79.
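
For illustration, here is a minimal Python sketch of what I assume is going on. It is not dinglehopper's actual code: `strip_bom`, `levenshtein` and `toy_cer` are hypothetical helpers written only for this example. Decoded as plain UTF-8, the BOM survives as one U+FEFF character, so the edit distance to the empty ground truth is 1 and dividing by a reference length of 0 presumably yields Infinity. Decoding with Python's `utf-8-sig` codec drops a leading BOM, which gives the expected result.

```python
import codecs


def strip_bom(data: bytes) -> str:
    # "utf-8-sig" decodes UTF-8 and drops one leading BOM if present.
    return data.decode("utf-8-sig")


def levenshtein(a: str, b: str) -> int:
    # Plain dynamic-programming edit distance (hypothetical helper).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def toy_cer(gt: str, ocr: str) -> float:
    # Toy CER: edit distance divided by reference length.
    # Assumption: an empty reference with a nonzero distance gives infinity.
    distance = levenshtein(gt, ocr)
    if len(gt) == 0:
        return 0.0 if distance == 0 else float("inf")
    return distance / len(gt)


gt_bytes = b""                 # empty gt.txt (0 bytes)
ocr_bytes = codecs.BOM_UTF8    # ocr.txt containing only the BOM (3 bytes)

# Plain UTF-8 decoding keeps the BOM as a character.
print(toy_cer(gt_bytes.decode("utf-8"), ocr_bytes.decode("utf-8")))

# Decoding with "utf-8-sig" ignores the BOM.
print(toy_cer(strip_bom(gt_bytes), strip_bom(ocr_bytes)))
```

The first call reproduces the Infinity from this report, the second gives 0. Whether the fix belongs in the file-reading step or in the text normalization is a design choice I leave open here.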