rschroll / beru

The Basic Epub Reader for Ubuntu
http://rschroll.github.io/beru/
GNU General Public License v3.0

Add a page slider to the contents menu... #77

Open stuartlangridge opened 9 years ago

stuartlangridge commented 9 years ago

...which allows someone to jump to a place in the book directly rather than having to skip through page by page. I find myself missing this a lot.

rschroll commented 9 years ago

Thanks for the patch. If I'm understanding it correctly, the slider allows you to navigate within the current HTML document in the Epub.
Unfortunately, there is no consistency about how to split content up into documents within Epub files. Sometimes there's a single HTML file. Sometimes there's one per chapter. Sometimes it's something else. This means that the action of the slider depends on the internals of the document, and that's not something we should be exposing the user to.

Pull request #40 offers another approach for calculating page numbers, but it turns out this is a hard problem. At the moment, the approach I like best would be to give up on trying to calculate real page numbers and just go for a rough fraction of the way through the book. A simple approach would be to say that a point a fraction f of the way through the nth of N documents is a fraction F = (n - 1 + f)/N of the way through the whole book. It's not entirely accurate, since different documents could be different lengths, but it'd be good enough to give you a rough idea where you are. A better approach might weight these documents by their file size.
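
A minimal sketch of the simple estimate F = (n - 1 + f)/N, written in Python for illustration only. The function name `book_fraction` and its argument names are made up here; they are not part of Beru or Monocle.

```python
def book_fraction(n: int, f: float, total: int) -> float:
    """Estimate progress through the book as F = (n - 1 + f) / N.

    n     -- 1-based index of the current document
    f     -- fraction of the way through that document, 0.0 to 1.0
    total -- N, the number of documents in the book
    """
    if total < 1 or not 1 <= n <= total:
        raise ValueError("need 1 <= n <= total")
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    return (n - 1 + f) / total


# Halfway through the 3rd of 4 documents:
print(book_fraction(3, 0.5, 4))  # 0.625
```

Every document counts equally here, so a two-page cover document moves the estimate as much as a long chapter does.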

Also, I should let you know that I've been toying with a redesign of the toolbar and table of contents [1]. This would make the ToC into a full-sized page, which probably works in favor of a navigation slider.
Don't worry about that now, but I wanted to let you know that it's out there and may change how the slider would get displayed. (I'd be tempted to put it at the bottom of the ToC page, where it'd be easier to reach.)

[1] https://plus.google.com/108747901910183509998/posts/J3aipVzr3vX

stuartlangridge commented 9 years ago

Nope. Based on my, admittedly limited, testing it works in the whole book, not just this section. Give it a try; I may of course be wrong, but I wouldn't have sent it were it only for this html file :)

rschroll commented 9 years ago

For Project Gutenberg books, it gives the full book. But on some commercial ebooks, I get chapter-long sliders.

It's terribly formatted, admittedly, but check out this edition of Huck Finn [1]. You get a bunch of two-page sliders before you actually get to the contents. Then the contents are grouped into 64-page groups (at least on my device). I haven't cracked it open to check, but I'd bet cash money that that reflects the internal structure.

[1] https://openlibrary.org/works/OL15739198W/The_adventures_of_Huckleberry_Finn_%28Tom_Sawyer%27s_comrade%29

stuartlangridge commented 9 years ago

Huh. That's a Monocle bug, then, not that that's an excuse. I'll take a look at that. I use the Monocle page data; I'm not calculating it myself or anything :)

rschroll commented 9 years ago

I suspect that Monocle is working as designed, though that may not be so helpful. Monocle is designed for use on the web, so it only delivers the components as necessary. That means that Monocle can't work out how many total pages there are until you happen to have triggered all the components to load.

I don't know if you've seen it, but Monocle has a built-in scrubber [1]. From what I can tell at a quick glance, it uses the algorithm I proposed to get an estimate of the global fraction. We don't want to use it directly, but we can probably steal some code from there to use.

Alternatively, I noticed that it was actually rather convenient to have a per-chapter scrubber in books that were so arranged. (Otherwise, it'd be rather touchy to try to get a specific page.) We could compare the number of components to the number of entries in the table of contents. If they're about equal, we could put the scrubber near the chapter title and just have it control that chapter. If there's only one component, we could use a global scrubber. And if they're off, we could just disable the scrubber.
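
The decision described above could be sketched roughly like this, in Python for illustration. The function name, the mode strings, and especially the 20% tolerance used for "about equal" are assumptions; nothing in this thread fixes a threshold.

```python
def choose_scrubber_mode(num_components: int, num_toc_entries: int,
                         tolerance: float = 0.2) -> str:
    """Pick a scrubber mode from the book's structure.

    tolerance is a hypothetical cutoff for "about equal": the component
    count may differ from the ToC entry count by this fraction of the
    ToC entry count.
    """
    # A single component covers the whole book, so a global scrubber works.
    if num_components == 1:
        return "global"
    # Components roughly match chapters: scrub within the current chapter.
    if (num_toc_entries > 0
            and abs(num_components - num_toc_entries)
            <= tolerance * num_toc_entries):
        return "per-chapter"
    # Components do not line up with anything the reader can see.
    return "disabled"


print(choose_scrubber_mode(1, 12))   # global
print(choose_scrubber_mode(20, 19))  # per-chapter
print(choose_scrubber_mode(64, 10))  # disabled
```

The single-component check comes first so that a one-file book always gets the global scrubber, whatever its table of contents looks like.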

stuartlangridge commented 9 years ago

OK. I have tested the Twain book, and I agree; Monocle is reporting misleading page data. Bah, and again bah. It'd be worth looking at the Monocle built-in scrubber to get page data, although "find whoever made it report page data as though it applied to the whole book even though it is html-file specific and then punch them in the throat" is a roughly equal approach.

Seriously, Monocle reports page data. It should not report that incorrectly, and if that's even remotely fixable, we should fix it.

rschroll commented 9 years ago

Forgot the link to the scrubber code: https://github.com/joseph/Monocle/blob/master/src/controls/scrubber.js

I think the problem is that the only way to get the total page count is to lay out the entire book, which is expensive, both in terms of network, for remote content, and in terms of DOM calculations, which are killers, performance-wise. The Monocle folk made the decision to lay out only as much as they needed (namely, the current component), and thus only report things relative to the component. Though you may find the names misleading, I don't think the documentation is. It describes pageNumber() as "The page number of this point within the component."

Looking through the Place API, I note the percentageOfBook() and percentOfBookToLocus(), which might be useful for a book-length scrubber. They seem to do a weighted calculation, though it looks like that weight has to be provided in the metadata. (We could estimate it from file sizes.)
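
A sketch of what a size-weighted estimate could look like, in Python for illustration. This is not Monocle's implementation of percentageOfBook(); the function and argument names are hypothetical, and the weights are assumed to be component sizes in bytes.

```python
def weighted_book_fraction(sizes, index, f):
    """Estimate progress through the book, weighting components by size.

    sizes -- size of each component in bytes, in reading order
    index -- 0-based index of the current component
    f     -- fraction of the way through that component, 0.0 to 1.0
    """
    total = sum(sizes)
    if total <= 0:
        raise ValueError("total size must be positive")
    if not 0 <= index < len(sizes):
        raise ValueError("index out of range")
    # Everything before the current component counts in full,
    # the current component counts in proportion to f.
    before = sum(sizes[:index])
    return (before + f * sizes[index]) / total


# Halfway through the second of three components:
print(weighted_book_fraction([1000, 3000, 4000], 1, 0.5))  # 0.3125
```

With equal sizes this reduces to the unweighted (n - 1 + f)/N estimate.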

stuartlangridge commented 9 years ago

Ya; I'm using the percentage, as per https://github.com/stuartlangridge/beru/commit/6be320f3f00d3c243288f3aa13c284be37ff1bd0#diff-26aeb1b8786cc0911ac0c0bd03e55d41R660 to calculate the total number of pages; we get a pageSizePercentage and a percentageThrough and get the total number of pages as 1/pageSizePercentage. I ask Monocle for the page number too but in theory I don't have to. This all feels doable to me, although perhaps it needs more thought than I've thought :)
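
The arithmetic described here, as a small Python sketch. It assumes both values are fractions between 0 and 1 (which is what makes 1/pageSizePercentage a page count); the Python names are hypothetical stand-ins for pageSizePercentage and percentageThrough, and the rounding is a guess at how to turn the ratios into whole pages.

```python
def estimate_pages(page_size_fraction: float, fraction_through: float):
    """Return (total_pages, current_page) from two fractions of the book.

    page_size_fraction -- share of the book taken up by one page
    fraction_through   -- how far through the book the reader is
    """
    if not 0.0 < page_size_fraction <= 1.0:
        raise ValueError("page_size_fraction must be in (0, 1]")
    total_pages = max(1, round(1 / page_size_fraction))
    # Clamp so rounding never yields page 0 or a page past the end.
    current = round(fraction_through / page_size_fraction)
    current_page = min(total_pages, max(1, current))
    return total_pages, current_page


# One page is 1/256 of the book and the reader is halfway through:
print(estimate_pages(1 / 256, 0.5))  # (256, 128)
```

The result is only as good as the two inputs: if they describe a single component rather than the whole book, so does the page count.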

stuartlangridge commented 9 years ago

I now see what you're getting at. The weights thing will need calculating, indeed, I think, from file sizes. I'll take a look at how we might do that.

rschroll commented 9 years ago

QuaZipFileInfo has an uncompressedSize attribute that will probably be what we want.
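
To illustrate the idea without the Qt side, here is a sketch using Python's standard `zipfile` module, where `ZipInfo.file_size` is the uncompressed size of an archive member. This is a stand-in for reading uncompressedSize through QuaZip, not Beru code; `component_weights` and the file names are hypothetical.

```python
import io
import zipfile


def component_weights(epub_file, component_names):
    """Return each component's share of the total uncompressed size.

    epub_file       -- path or file-like object for the Epub (a zip archive)
    component_names -- archive paths of the components, in reading order
    """
    with zipfile.ZipFile(epub_file) as archive:
        # file_size is the uncompressed size, so compression ratio
        # differences between components do not skew the weights.
        sizes = [archive.getinfo(name).file_size for name in component_names]
    total = sum(sizes)
    if total == 0:
        raise ValueError("components are empty")
    return [size / total for size in sizes]


# Build a tiny in-memory archive to try it out.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("cover.html", "x" * 100)
    archive.writestr("chapter1.html", "y" * 300)

print(component_weights(buffer, ["cover.html", "chapter1.html"]))  # [0.25, 0.75]
```

Byte counts include markup, so a short but markup-heavy component will still be weighted more than its page count suggests.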

stuartlangridge commented 9 years ago

I have pushed an updated branch which uses uncompressedSize, but don't merge it. It doesn't work all that reliably. Specifically, short chapters which nonetheless contain a few pages (things such as cover pages, introductions, and the like) sod up the calculations something dramatic. I'm not sure how best to resolve this, other than to have the C stuff contain a copy of the paging algorithm, calculate a (reasonably) accurate number of pages, and then calculate componentWeights based on that. This is exceedingly frustrating.

Perhaps the slider should just work on percentages or something, rather than trying to be clever with page counts? The goal here is to be able to jump roughly to a place in the book, rather than having to hit the page next button four hundred times to do so; moving to "page 411" is rarely what's actually wanted, especially since page numbers are close to meaningless in an ebook anyway...?
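
A percentage-based slider needs the reverse mapping: from a fraction of the whole book to a component and a position inside it. A minimal sketch in Python, assuming non-negative per-component weights (for example file sizes); the function name `locate` is made up for illustration.

```python
def locate(book_fraction, weights):
    """Map a fraction of the whole book to (component index, fraction within it).

    book_fraction -- slider position, clamped to the range 0.0 to 1.0
    weights       -- non-negative weight of each component, in reading order
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("total weight must be positive")
    target = min(max(book_fraction, 0.0), 1.0) * total
    consumed = 0.0
    for index, weight in enumerate(weights):
        # Skip empty components; stop at the one containing the target.
        if weight > 0 and target <= consumed + weight:
            return index, (target - consumed) / weight
        consumed += weight
    # Only reachable through float rounding: end of the last non-empty component.
    last = max(i for i, w in enumerate(weights) if w > 0)
    return last, 1.0


print(locate(0.3125, [1, 3, 4]))  # (1, 0.5)
```

Since the slider only promises "roughly here", errors in the weights shift the landing point a little but never make the control behave differently from book to book.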

rschroll commented 9 years ago

I think this is essentially the same problem we were having over in #40.

A few disorganized thoughts follow: