orblivion / KiwixSandstorm

GNU General Public License v3.0

Make it deal with new OPDS catalog #2

Open kelson42 opened 5 years ago

kelson42 commented 5 years ago

Kiwix now publishes its catalog in OPDS format; see https://wiki.kiwix.org/wiki/OPDS

orblivion commented 5 years ago

Cool! The only problem is that Sandstorm still can't do outbound connections, last I checked. I'll keep this in mind, though.

orblivion commented 4 years ago

At the very least, I can try to call this OPDS feed when I generate the package, so that the content downloading step of the onboarding process has more up-to-date content to choose from. This means the catalog would be hard-coded into the package, so I would have to release a new version to update the catalog, but it's at least an improvement over the current situation (where I sort of manually picked out stuff).
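As a rough illustration of that build-time step, here is a minimal sketch that turns an OPDS (Atom) feed into a JSON snapshot that could be bundled with the package. It is not code from this repository. `SAMPLE_FEED` is a hand-written stand-in rather than real catalog data, the function names are hypothetical, and the assumption that each entry carries an acquisition link with `href` and `length` attributes should be checked against the actual feed.

```python
import json
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hand-written sample standing in for the real feed (an assumption, not catalog data).
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Sample catalog</title>
  <entry>
    <title>Example Wiki</title>
    <summary>A made-up entry</summary>
    <link rel="http://opds-spec.org/acquisition/open-access"
          type="application/x-zim"
          href="https://example.org/example_wiki.zim"
          length="1048576"/>
  </entry>
  <entry>
    <title>No download here</title>
    <summary>Entry without an acquisition link</summary>
  </entry>
</feed>
"""


def parse_catalog(feed_xml):
    """Return one dict per entry that has an OPDS acquisition link."""
    root = ET.fromstring(feed_xml)
    books = []
    for entry in root.iter(ATOM + "entry"):
        download = None
        for link in entry.findall(ATOM + "link"):
            if link.get("rel", "").startswith("http://opds-spec.org/acquisition"):
                download = {
                    "href": link.get("href"),
                    "length": int(link.get("length", 0)),
                }
                break
        if download is None:
            continue  # nothing to offer in the onboarding step
        books.append({
            "title": entry.findtext(ATOM + "title", default=""),
            "summary": entry.findtext(ATOM + "summary", default=""),
            "download": download,
        })
    return books


def snapshot_json(feed_xml):
    """Serialize the parsed catalog so it can be shipped inside the package."""
    return json.dumps(parse_catalog(feed_xml), indent=2, sort_keys=True)
```

At packaging time the feed text would come from wherever the catalog is fetched on the build machine, which has normal network access; the grain itself would only ever read the bundled JSON.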

I'll look more closely at this when I get around to it, but if this has some sort of web page generating facility that makes it look like library.kiwix.org, that would be even better. Assuming I could put some chrome around it to continue the onboarding process, I'd probably want to put that in the package as well.

kelson42 commented 4 years ago

We have started to push library.kiwix.org, so far mostly as a demonstration of the catalog, but we would like to improve the welcome page with better filtering/search capabilities and also provide a download link and widgets... all kinds of things that would help you easily build what you are aiming for.

kelson42 commented 1 year ago

library.kiwix.org and its OPDS stream are both stable by now.

orblivion commented 1 year ago

Hi @kelson42, I was just thinking about this. I'd love to update Kiwix Serve, especially now that it's back in Debian and way easier to package. And I'd love to have your library downloading interface and get rid of my verbose homegrown setup interface.

The problem that (I assume) will stop me in my tracks is that Sandstorm has security features that proxy all network connections. It proxies outgoing connections to stop malicious apps from "phoning home", but it gives the user a popup to explicitly allow it.

The catch is that the proxy is rather conservative about what it allows through. This may be for extra security in some way, but I'm actually not sure. The specific problem I've run into is that it has a maximum request size and doesn't support range requests, which means it can't download large files. The proxy for incoming requests (from the browser) has the same restriction. Part of my homegrown setup interface for Kiwix on Sandstorm, where the user uploads zim files to their Kiwix grain, works around this by using POST variables instead of the usual header. That won't work for the Kiwix backend downloader (unless you want to hack library.kiwix.org just for this, in which case we could do it, though a simpler method may be to split the download into small chunks and have the downloader assemble them; I did this with my WIP OpenStreetMap app).
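To make the chunking idea concrete, here is a minimal sketch of splitting a file into pieces small enough for the proxy and reassembling them with a checksum per piece. It is an illustration only, not the code from the OpenStreetMap app. The function names and the manifest layout are hypothetical, and the chunk size would have to be chosen below whatever limit the proxy actually enforces.

```python
import hashlib


def split_into_chunks(data, chunk_size):
    """Cut data into consecutive pieces of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def build_manifest(chunks):
    """Describe the chunks so the downloader can verify what it received."""
    return {
        "chunk_sha256": [hashlib.sha256(c).hexdigest() for c in chunks],
        "total_size": sum(len(c) for c in chunks),
    }


def assemble(chunks, manifest):
    """Verify each chunk against the manifest, then join them in order."""
    if len(chunks) != len(manifest["chunk_sha256"]):
        raise ValueError("wrong number of chunks")
    out = bytearray()
    for index, (chunk, expected) in enumerate(zip(chunks, manifest["chunk_sha256"])):
        if hashlib.sha256(chunk).hexdigest() != expected:
            raise ValueError(f"chunk {index} failed verification")
        out += chunk
    if len(out) != manifest["total_size"]:
        raise ValueError("size mismatch")
    return bytes(out)
```

In practice each chunk would be fetched with its own small request and written to disk rather than held in memory; the in-memory version just keeps the sketch short.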

But as I understand it, range requests are only an issue because they haven't gotten around to whitelisting them yet. At least that's what I remember about the incoming requests. @zenhack I don't suppose outgoing range requests are anywhere in the near priorities?

orblivion commented 1 year ago

(you may want to pull up the GitHub page rather than just reading the email; I made a few clarifying edits)

kelson42 commented 1 year ago

@orblivion Thank you for the update. I see here this is mostly a thing on your side (with Sandstorm) for both the catalogue and the ZIM file download itself. For the catalogue part, you can now:

Let me know if something is unclear.

orblivion commented 1 year ago

Sandstorm wouldn't let me embed an external page (again, security hardening). But I bet all of the API endpoints other than zim downloading would work just fine from the back end, since the responses would be small.

So, I could do a medium term fix. Right now my onboarding interface links to a few specific zim files as examples, and to the old catalog page for everything else. I could replace all of that with an interface that lets you search the live library, and read descriptions, and see thumbnails and all of that. But from that point it would have to work the same as now: the user would get presented with a download link they'd have to click, download the zim to their browser, and upload back to their Sandstorm grain.

Though, maybe I could let users download small zim files through the API, fwiw.
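A minimal sketch of that split, assuming each catalog entry has already been reduced to a dict with a `length` in bytes. The `max_bytes` limit is a placeholder, since the proxy's real maximum isn't stated here, and the function name is hypothetical.

```python
def split_by_size(books, max_bytes):
    """Separate entries small enough to fetch from the back end from the rest.

    Entries with an unknown size (length 0) are treated as too large, so they
    fall back to the manual download-and-upload flow.
    """
    direct, manual = [], []
    for book in books:
        if 0 < book["length"] <= max_bytes:
            direct.append(book)
        else:
            manual.append(book)
    return direct, manual
```

The onboarding page could then show a one-click install for the first list and the existing download link for the second.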

The long-term fix would be downloading via the API once the range request thing is figured out.

zenhack commented 1 year ago

@orblivion, range requests aren't really on my list, no -- I'd be willing to advise if someone else wanted to add them. Step 1 would be to extend the schema in web-session.capnp with the necessary fields. The actual implementation for outgoing is the ExternalWebSession class in shell/imports/server/drivers/external-ui-view.js. We might also need to make changes to sandstorm-http-bridge to plumb the necessary fields through, unless you want to speak capnp directly.

Also a heads up if you haven't seen it re my own activity: https://zenhack.net/2023/01/06/introducing-tempest.html

orblivion commented 1 year ago

Ah okay, thanks. Given the low priority, I'll tentatively plan to implement it for Tempest then.