Closed anjackson closed 2 years ago
I need to specify this for every collection, right?
https://github.com/ukwa/ukwa-pywb/blob/ebbb52207d71ceb2a8596652a5e75740e3e99e84/config.yaml#L14-L36
Hmm, hitting a lot of weird errors. Not sure how best to make them reproducible. Not sure the Docker build is building everything as some things (like ../build/pdf.js
) seem to be missing.
ePub viewing is also failing, something weird happening as it seems to try to open up a container.xml
URL rather than get it from the ZIP. It creates a frame like:
/static/viewers/epubviewer/index.html?bookPath=http%3A//dev1.n45.wa.bl.uk%3A8209/doc/ir///staffaccess.dl.bl.uk/ark%3A/81055/vdc_100031420983.0x000001
Which looks about right, but then in that frame, rather than downloading the ePub, it requests:
http://dev1.n45.wa.bl.uk:8209/doc/ir_/staffaccess.dl.bl.uk/ark:/81055/META-INF/container.xml
which does not exist....
Possibly the ePub thing is this issue: https://github.com/futurepress/epub.js/issues/479
Ah, I see, pdf.js is in the token-auth branch. Unless you tell me otherwise, I'm going to assume it's okay to merge that branch in with this one....
Yes, sorry, that branch was the more up-to-date one! I should have opened a new PR for that
Ah well, I’ve added the Docker action to this branch so lets keep things here now.
My current problem seems to be that the URL to download the raw item to the viewer is also redirecting to the viewer. Any idea what I’m doing wrong?
EDIT: I noticed the form of the direct access URL is a bit different for each, eg ir_
vs id_
, if that matters?
Yeah, so I'm still struggling. the ePub viewer is refusing to download the ZIP, and the 'contained' parameter does not appear to be part of the ePub.js codebase any more, so I'm not sure what's going on there. This comment indicates the need for some ZIP libraries,but I've not managed to find anything else. I fear it may be necessary to mock-up a .epub extension on the URL!
And the PDF viewer is having a different problem. The viewer page attempts to load the content from:
http://dev1.n45.wa.bl.uk:8209/doc/ir_///staffaccess.dl.bl.uk/ark:/81055/vdc_100047810264.0x000001
But this is not getting the PDF. It's getting a redirect to:
/static/viewers/pdfviewer/web/viewer.html?file=http://dev1.n45.wa.bl.uk:8209/doc/ir///staffaccess.dl.bl.uk/ark:/81055/vdc_100047810264.0x000001
i.e. we're caught in a loop. Perhaps I messed up the config? (see it here).
Ah, seems like enable prefer = false
might be a problem?
Yay! That fixed PDF playback! ePubs still wonky, apparently due to lack of .epub file extension (!).
I can verify that if you hack it all around so that there's an .epub extension then it works fine. Sheesh. Behold, the HACK
https://github.com/ukwa/ukwa-services/commit/609bb73e53c9d74fb9e9160ac9cc0a9b82df1ab2
Although that's a horrible hack, it can probably stay, as I think long-term the best approach is going to be dynamically unpacking the ZIPs via a server-side helper (see ukwa/ukwa-services#93).
However, the current approach is fine for this first release.
The only actual problem I've found is, I think, a CSS issue, in that when in the playback iframe
and when printing, the ePub does not show at all (all you can see is a v long scrollbar). If you view the iframe alone, you can print the current page(s) fine.
Thanks for taking a closer look at this! Looks like there's a few things to discuss?
yes, I think disabling allow to application/
should be fine - that may have been a typo to leave that in
is the Prefer handling messing up the ir_
prefix? If so, disabling it should work, can also try to fix, if can find a repro. (It is enabled in the test collection)
If understand correctly, the epub viewer doesn't like files that don't have an .epub
extension so you're manually adding and removing it? Hmm, i guess the epubs I've tested with all had an .epub ext. The hack you have might be the best solution then!
There's still possible issue with printing and epubs (in the browser)? Yeah, perhaps its due to some security setting on the iframe that needs to be adjusted or something else..
Let me know if I missed anything!
The Prefer system is needed because this is how the server knows to pass the unmodified file back to the viewer:
It works fine now. The only issue I was wondering was why the PDF viewer appeared to use id_
whereas the ePub viewer used ir_
, but maybe I was wrong about that.
I think the iframe printing (which affects all playback) is the only outstanding problem.
In fact, I think I'll close this as done, and have a separate ticket about printing.
It works fine now. The only issue I was wondering was why the PDF viewer appeared to use
id_
whereas the ePub viewer usedir_
, but maybe I was wrong about that.
Hm, it should also be using ir_
, just checked. Should be getting set here: https://github.com/ukwa/ukwa-pywb/blob/custom-viewers/ukwa_pywb/ukwa_app.py#L251
The ir_
is just like id_
except the Location:
headers are rewritten, so makes it easier to link to (Perhaps that should have been how id_ worked initially but don't want to break backwards compatibility).
Minor config issue I think, but access an ePub from the live system got a download rather than a render. This is because the item has no
.epub
extension, but rather aapplication/epub+zip
content type. So that's just a config thing.But also, it got downloaded because everything
application/*
is allowed to download. Is that needed? Seems like it might let too much leak?