dps / remarkable-wikipedia

MIT License

Implement a zim file set up and download UI #19

Open dps opened 3 years ago

cowpod commented 3 years ago

If it helps, I can set up an API that periodically parses the kiwix zim downloads section. All that would be necessary is fetching a page, which gives a JSON reply. Otherwise, they probably have an API already; the source for their kiwix android app is here: https://github.com/kiwix/kiwix-android

dps commented 3 years ago

Thanks for the offer... That would be really great. There's a bit more to it - they unfortunately don't have a very stable format for the markup inside the zim files so I think I need something that chops the kiwix boilerplate (which changes) off the front of each page...

cowpod commented 3 years ago

I've completed an API in PHP; it scrapes the table values from the kiwix wiki page and returns them as JSON, organized as an array of objects. It refreshes its data once per day.

A downside of using this is the risk of the server going down, but the upside is offloading quite a bit of parsing work from the reMarkable; parsing a massive table takes a while. Ironically, my server is currently down with 500 and 503 errors, but I'll upload it when I can, and then you are welcome to use the hosted API directly.

API use: https://rmwk.api.snorfi.us/?sort=category, where category can also be language/size/date/type/download
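
A minimal sketch of how a client might build the request URL, assuming only the endpoint and the sort keys listed above. The function name `build_url` is hypothetical, and the response format is not shown because the thread does not specify its fields.

```python
from urllib.parse import urlencode

# Endpoint and sort keys as given in the comment above.
BASE_URL = "https://rmwk.api.snorfi.us/"
SORT_KEYS = ("category", "language", "size", "date", "type", "download")


def build_url(sort: str = "category") -> str:
    """Return the request URL for a sort key, rejecting unknown keys."""
    if sort not in SORT_KEYS:
        raise ValueError(f"unknown sort key: {sort!r}")
    return BASE_URL + "?" + urlencode({"sort": sort})
```

Checking the key against a fixed list before building the URL keeps typos from turning into silent server-side defaults.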

dps commented 3 years ago

Thanks - I appreciate the work on this. BTW, HTTPS won't work with snorfi.us as it's presenting a certificate for jaserhamdan.online. I'd probably prefer hosting this myself (I can) to ensure good uptime and given the certificate issues.

cowpod commented 3 years ago

It's got a weird certificate due to the hosting issues - normally it's fine; I've got my own certificate.

I've attached the latest file below. It needs PHP modules php-curl and php-dom. It also needs file_put_contents and file_get_contents enabled. I can also probably get it working without php-curl, and with sqlite/mysql instead of flat files - let me know if you want that.

I've put it under an MIT license as well; it seems to be the freest. I did my best to sanitize the $_GET values; however, there's obviously no warranty or guarantee with it.

Edit:

Fixed the inconsistent formatting. Sorting by size is still a bit wonky - it doesn't understand the K/M/G suffixes.

index_v2.zip
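
The size-sorting problem comes down to comparing values such as "90M" and "1.2G" as plain text. One possible fix, sketched here in Python rather than the API's PHP, is to convert each size to bytes before sorting. The `size` field name, the sample entries, and the 1024-based multipliers are assumptions for illustration, not taken from the actual API output.

```python
import re

# Multipliers for the K/M/G suffixes (1024-based is an assumption).
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3}
# A number, an optional K/M/G suffix, and an optional trailing "B".
_SIZE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([KMG]?)B?\s*$", re.IGNORECASE)


def size_to_bytes(text: str) -> int:
    """Convert a size string such as '90M' or '1.2G' to a byte count."""
    match = _SIZE_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, suffix = match.groups()
    return int(float(number) * _UNITS[suffix.upper()])


# Hypothetical entries; sorting on the numeric key gives the expected order.
entries = [{"size": "1.2G"}, {"size": "90M"}, {"size": "512K"}]
entries_sorted = sorted(entries, key=lambda e: size_to_bytes(e["size"]))
```

Sorting on the converted value puts 512K before 90M before 1.2G, which a plain string comparison does not.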

cowpod commented 3 years ago

I've switched hosts to one that's hopefully more reliable. If you still need it, the API is available at https://api.snorfi.us/rmwk/?sort=download. It's also cached with Cloudflare, so in the event of downtime it should be okay.

dps commented 3 years ago

Thanks - it is producing download links with extra backslashes in them e.g.:

"https:\/\/download.kiwix.org\/zim\/wikipedia_en_simple_all_maxi.zim"

rather than

"https://download.kiwix.org/zim/wikipedia_en_simple_all_maxi.zim"
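
For context, `\/` is a legal escape for `/` in JSON, and PHP's `json_encode` escapes slashes by default unless the `JSON_UNESCAPED_SLASHES` flag is passed. A JSON parser should therefore decode both forms to the same URL; the backslashes only show up if the response body is read as raw text. A quick check in Python, using the link above as the sample value:

```python
import json

# Raw JSON text in both forms (sample value from the comment above).
escaped = r'"https:\/\/download.kiwix.org\/zim\/wikipedia_en_simple_all_maxi.zim"'
plain = '"https://download.kiwix.org/zim/wikipedia_en_simple_all_maxi.zim"'

# json.loads treats "\/" as an escaped "/", so both decode identically.
decoded_escaped = json.loads(escaped)
decoded_plain = json.loads(plain)
```

If the client does not use a JSON parser, the alternative is to change the encoder's output on the server side.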

cowpod commented 3 years ago

Reposting my edit, as I see that you use email:

Fixed! Uploaded to my site and attached here: index.zip