readium / swift-toolkit

A toolkit for ebooks, audiobooks and comics written in Swift
https://readium.org/mobile/
BSD 3-Clause "New" or "Revised" License

HTML injection fails if the resource contains a commented-out `<html>` tag #346

Open mbmoris opened 8 months ago

mbmoris commented 8 months ago

Bug Report

The book is not browsable in any way

What happened?

The book opens and the first page with the cover is displayed, but it is not possible to change page or navigate any further.

Expected behavior

Change page or jump to a specific chapter.

How to reproduce?

Open the attached sample book

Environment

Development environment

Testing device

mickael-menu commented 8 months ago

This is caused by the same issue I described on Slack for the Kotlin toolkit: https://readium.slack.com/archives/C703MSTQU/p1698229085820339?thread_ts=1698225175.649879&cid=C703MSTQU

The issue is that the book contains commented-out tags:

<?xml version="1.0" encoding="utf-8"?>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<!--
<!DOCTYPE html>
-->

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="it" >

<!--
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="it" dir="ltr">
-->

When we inject our own styles and scripts, we use regexes to locate the `<html>` tag, so the commented-out copies produce false positives in this case. Ideally we would use a proper HTML parser, but we decided to stay as close to the original source as possible when we modify it.

It is unlikely that the core maintainers will tackle this fringe issue, but we would welcome a contribution to fix it. A more complex regex might be enough to handle it.
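As a rough illustration of the "more complex regex" idea, here is a minimal sketch, not the toolkit's actual code. It assumes the goal is simply to find the first real `<html ...>` start tag while ignoring anything inside `<!-- ... -->`. The function name `rangeOfHTMLStartTag(in:)` is hypothetical and does not exist in Readium. The pattern uses an alternation: comments are matched first and skipped, and only a genuine start tag fills the capture group.

```swift
import Foundation

/// Hypothetical helper (not part of the Readium API): returns the range of the
/// first `<html ...>` start tag that is not inside a comment, or nil if none.
func rangeOfHTMLStartTag(in content: String) -> Range<String.Index>? {
    // First alternative consumes whole comments so their content is never
    // scanned for tags; second alternative captures a real start tag in group 1.
    let pattern = "<!--[\\s\\S]*?-->|(<html\\b[^>]*>)"
    guard let regex = try? NSRegularExpression(pattern: pattern, options: [.caseInsensitive]) else {
        return nil
    }
    let fullRange = NSRange(content.startIndex..., in: content)
    for match in regex.matches(in: content, options: [], range: fullRange) {
        // Group 1 is unset (NSNotFound) when the match was a comment.
        let tagRange = match.range(at: 1)
        if tagRange.location != NSNotFound, let range = Range(tagRange, in: content) {
            return range
        }
    }
    return nil
}

// Variation on the excerpt above, with the commented-out tag placed first.
let sample = """
<?xml version="1.0" encoding="utf-8"?>
<!--
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="it" dir="ltr">
-->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="it" >
<head></head>
</html>
"""

if let range = rangeOfHTMLStartTag(in: sample) {
    // Prints the tag that the injection should target.
    print(sample[range])
}
```

This keeps the original source untouched, in line with the decision not to reparse the document. It still has blind spots: an unterminated comment, a CDATA section, or a `>` inside an attribute value could fool it.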

It's located here, if you want to give it a go: https://github.com/readium/kotlin-toolkit/blob/482ab0c2d759b4484762b0b823a953cc66661259/readium/navigator/src/main/java/org/readium/r2/navigator/epub/css/ReadiumCss.kt#L239

With the Swift toolkit, the regex is located here: https://github.com/readium/swift-toolkit/blob/7c04a9892b951bb13f7099b4ab74064b43557e00/Sources/Navigator/Toolkit/HTMLInjection.swift#L65

I'll keep the issue open but, like I said, it's unlikely I will tackle this anytime soon. So a contribution is most welcome!