w3c / epubcheck

The conformance checker for EPUB publications
https://www.w3.org/publishing/epubcheck/
BSD 3-Clause "New" or "Revised" License

Unjustified error message "Invalid byte 2 of 4-byte UTF-8 sequence" #1548

Open Matt-1 opened 10 months ago

Matt-1 commented 10 months ago

I've encountered this error and created a stripped-down version of the original EPUB that still exhibits the error: invalidUtf8Sequence.epub

For this file, EPUBCheck v5.1.0 reports:

Validating using EPUB version 3.3 rules.
FATAL(RSC-016): invalidUtf8Sequence.epub/OEBPS/html/Chapter_6.xhtml(88,792): Fatal Error while parsing file: Invalid byte 2 of 4-byte UTF-8 sequence.
[...]

I believe this is a false positive. At least I can't find anything wrong with the HTML file.
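
One way to check the claim independently of EPUBCheck is to decode the raw bytes of the reported file strictly as UTF-8 and see whether any decoder complains. The following is a minimal sketch in Python, not part of EPUBCheck; the function names `find_invalid_utf8` and `check_epub_entry` are hypothetical, and the entry path is taken from the error message above. If the strict decode succeeds, that would suggest the bytes themselves are well-formed and the problem lies in how the parser reads them.

```python
import zipfile


def find_invalid_utf8(data: bytes):
    """Return (byte_offset, reason) for the first invalid UTF-8 sequence,
    or None if the whole buffer decodes cleanly."""
    try:
        # Strict decoding rejects truncated sequences, bad continuation
        # bytes, overlong encodings, and encoded surrogates.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        return exc.start, exc.reason
    return None


def check_epub_entry(epub_path: str, entry_name: str):
    """Read one entry from an EPUB (a ZIP archive) and check its bytes.

    Both arguments are supplied by the caller; nothing here is specific
    to EPUBCheck.
    """
    with zipfile.ZipFile(epub_path) as archive:
        return find_invalid_utf8(archive.read(entry_name))


# Example call (assumes the attached test file is in the current directory):
# check_epub_entry("invalidUtf8Sequence.epub", "OEBPS/html/Chapter_6.xhtml")
```

A result of `None` means Python found no invalid sequence; otherwise the returned byte offset can be compared against the line and column (88,792) in the EPUBCheck message.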

rdeltour commented 10 months ago

Thanks for the report and the test file! I will have a look.