oliveryh / pandora


Store HTML in cache / model #43

Closed oliveryh closed 1 year ago

oliveryh commented 2 years ago

Requesting the HTML only once highlights are extracted causes three immediate issues:

  1. The page might no longer be available by the time it's finally read
  2. The contents of the page may have changed slightly, making it harder to extract highlight formatting
  3. We're having to put the HTML payload multiple times

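If we cache the HTML when the page is first saved, or extract it again from the kepub file, highlight extraction would work against the same content the reader actually saw, which should improve the quality of the extracted highlights and their formatting.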
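
A rough sketch of what the caching option could look like, assuming Python and a small SQLite table. `HtmlCache`, `get_or_fetch`, the `html_cache` table, and the `fetch` callable are all hypothetical names for illustration, not existing pandora code. The idea is to store one HTML snapshot per URL and have highlight extraction read from that snapshot rather than hitting the live page again.

```python
import hashlib
import sqlite3
from typing import Callable, Optional


class HtmlCache:
    """Stores one HTML snapshot per URL (hypothetical helper, not pandora's API)."""

    def __init__(self, path: str = ":memory:") -> None:
        # ":memory:" keeps the sketch self-contained; a real cache would use a file path.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS html_cache ("
            "url_hash TEXT PRIMARY KEY, url TEXT NOT NULL, html TEXT NOT NULL)"
        )

    @staticmethod
    def _key(url: str) -> str:
        # Hash the URL so the primary key has a fixed length.
        return hashlib.sha256(url.encode("utf-8")).hexdigest()

    def get(self, url: str) -> Optional[str]:
        """Return the cached HTML for url, or None if nothing is stored."""
        row = self.conn.execute(
            "SELECT html FROM html_cache WHERE url_hash = ?", (self._key(url),)
        ).fetchone()
        return row[0] if row else None

    def get_or_fetch(self, url: str, fetch: Callable[[str], str]) -> str:
        """Return the cached snapshot, calling fetch(url) only on a cache miss."""
        cached = self.get(url)
        if cached is not None:
            return cached
        html = fetch(url)  # assumption: the caller supplies the actual HTTP request
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT OR REPLACE INTO html_cache (url_hash, url, html) "
                "VALUES (?, ?, ?)",
                (self._key(url), url, html),
            )
        return html
```

Calling `get_or_fetch` when the article is first saved would mean later highlight extraction reuses the stored snapshot, even if the live page has changed or disappeared in the meantime. Storing the HTML on the article model instead (as the title suggests) would work the same way, just without the separate table.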