kronusaturn / lw2-viewer

An alternative frontend for LessWrong 2.0
https://www.greaterwrong.com/
MIT License

Add export to epub #23

Open NightMachinery opened 5 years ago

NightMachinery commented 5 years ago

There are some efforts that scrape the site into an ebook (e.g., https://github.com/AABoyles/LessWrong-Portable, or general webpage-to-ebook converters), but they don’t export the comments. (Unless we save the whole site as HTML and convert it to EPUB, which will have a lot of unwanted cruft.) They are also rather brittle.

It’d be so much better if the site natively supported exporting to epub.

Another idea is to have a stripped down, minimal html view (something like the print version of Wikipedia). I find this the better option, since it is suitable for printing and can be easily converted to epub by just saving the whole html.

achmizs commented 5 years ago

Is there something wrong with the existing print view?

NightMachinery commented 5 years ago

I can’t find the existing print view😅

On Mon, Sep 16, 2019 at 10:54 AM Said Achmiz <notifications@github.com> wrote:

> Is there something wrong with the existing print view?


kronusaturn commented 5 years ago

Our HTML is already pretty minimal -- I tried saving a post and converting it to EPUB using Calibre and it seems to work pretty well. There is a bit of cruft at the beginning but it's only a few lines.
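For anyone who wants to script the conversion described above rather than use the Calibre GUI, here is a minimal sketch. It assumes Calibre's `ebook-convert` command-line tool is installed and on the PATH; the file names and the helper names `build_convert_command` and `convert` are hypothetical and not part of this project.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(html_path, epub_path, title=None):
    """Build the argument list for Calibre's ebook-convert tool."""
    cmd = ["ebook-convert", str(html_path), str(epub_path)]
    if title is not None:
        # --title sets the book title in the output metadata.
        cmd += ["--title", title]
    return cmd


def convert(html_path, epub_path, title=None):
    """Run the conversion; raises if ebook-convert is missing or fails."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; is Calibre installed?")
    subprocess.run(build_convert_command(html_path, epub_path, title), check=True)
    return Path(epub_path)
```

Calling `convert("post.html", "post.epub", title="A Post")` on a saved page should produce the EPUB next to it. The few lines of cruft mentioned above would still be present unless they are removed from the HTML first.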

achmizs commented 5 years ago

> I can’t find the existing print view😅

I am simply referring to the fact that using the Print feature of your browser uses the CSS specified for the print media type, i.e., it uses a special layout/styling/etc. for printing. (This is quite common.)

Once you’ve got that Print dialog open, you can save the file as a PDF, with the print layout, using your operating system’s print-to-PDF feature. (At least, the Mac OS has this, and, I think, Windows also?)

And a PDF can easily be converted into an epub…

NightMachinery commented 5 years ago

Hmm. PDFs generally don’t get converted to EPUBs nicely. Using browsers is also a no-no; I need the solution to be scriptable. I currently just run a bash function named tl and give it a list of URLs to create a book and send it to my Kindle.
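To illustrate what "scriptable" means here, the following is a rough sketch of the first step of such a pipeline: taking a list of URLs and saving each page to disk in order. It is not the `tl` function mentioned above, whose contents are not shown in this thread; the names `filename_for` and `download_all` are hypothetical, and only the Python standard library is used.

```python
import re
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def filename_for(url, index):
    """Derive an ordered file name such as '01-some-post.html' from a URL."""
    path = urlparse(url).path.rstrip("/")
    slug = path.rsplit("/", 1)[-1] or "index"
    # Replace anything that is not filename-safe with a hyphen.
    slug = re.sub(r"[^A-Za-z0-9_-]+", "-", slug).strip("-") or "page"
    return f"{index:02d}-{slug}.html"


def download_all(urls, out_dir):
    """Fetch every URL and save it under out_dir, preserving list order."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, url in enumerate(urls, start=1):
        target = out_dir / filename_for(url, index)
        with urllib.request.urlopen(url) as response:
            target.write_bytes(response.read())
        saved.append(target)
    return saved
```

The numbered prefix keeps the chapters in the order the URLs were given, so the saved files can be handed to a converter as a single book.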

On Tue, Sep 17, 2019 at 10:36 AM Said Achmiz <notifications@github.com> wrote:

> I can’t find the existing print view😅
>
> I am simply referring to the fact that using the Print feature of your browser uses the CSS specified for the print media type, i.e., it uses a special layout/styling/etc. for printing. (This is quite common.)
>
> Once you’ve got that Print dialog open, you can save the file as a PDF, with the print layout, using your operating system’s print-to-PDF feature. (At least, the Mac OS has this, and, I think, Windows also?)
>
> And a PDF can easily be converted into an epub…


jimrandomh commented 5 years ago

This is on the medium-term roadmap for LessWrong. The main issue is that the print view of a single post isn't what you want; you generally want a sequence or a list of posts. So this will probably follow after a reading-list feature.

NightMachinery commented 5 years ago

@jimrandomh There are generalized tools that can do the sequence-of-pages part, e.g., https://epub.press/. What we need is just a nice, clean view of the content to feed into these.
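As a rough illustration of what a "clean view" could mean on the client side, here is a small sketch that drops navigation and scripting elements from a saved page before it is passed to a converter. This is a simple tag filter built on the standard library's `html.parser`, not a readability algorithm. The set of tags to skip is an assumption and is not taken from this site's markup; `CruftStripper` and `strip_cruft` are hypothetical names.

```python
from html.parser import HTMLParser

# Assumption: these tags hold navigation or scripting rather than content.
SKIPPED_TAGS = {"script", "style", "nav", "header", "footer", "form"}


class CruftStripper(HTMLParser):
    """Re-emit the document while omitting everything inside SKIPPED_TAGS."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.parts = []
        self.skip_depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0:
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are emitted once, unchanged.
        if tag not in SKIPPED_TAGS and self.skip_depth == 0:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.skip_depth == 0:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.skip_depth == 0:
            self.parts.append(f"&#{name};")


def strip_cruft(html):
    """Return html with skipped elements, comments, and doctype removed."""
    parser = CruftStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

A filter like this is brittle in the same way scrapers are: an unclosed skipped tag would swallow the rest of the page, which is part of why a server-side clean view would be preferable.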

NightMachinery commented 2 years ago

I don’t remember what my issue was at the time, but I have been converting GW pages to EPUB using pandoc/calibre (and Mozilla Readability, though it’s not essential) without a hitch for many months. So I am closing the issue, thanks.
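For readers looking for a starting point, a pandoc-based conversion along these lines might look like the sketch below. This is not the exact workflow described above, which is not shown in the thread. It assumes `pandoc` is installed and that the pages have already been saved as local HTML files; `build_pandoc_command` and `make_epub` are hypothetical names.

```python
import subprocess


def build_pandoc_command(html_files, epub_path, title):
    """Build a pandoc invocation that turns HTML files into one EPUB."""
    cmd = [
        "pandoc",
        "-f", "html",          # input format
        "-t", "epub",          # output format
        "-o", str(epub_path),  # output file
        "--metadata", f"title={title}",
    ]
    # Input files are passed in reading order.
    cmd += [str(path) for path in html_files]
    return cmd


def make_epub(html_files, epub_path, title):
    """Run pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_pandoc_command(html_files, epub_path, title), check=True)
```

How well several complete HTML documents combine into one book depends on the input, so it is worth checking the result on a couple of pages before converting a long list.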

kronusaturn commented 2 years ago

This still seems like it would be nice to have, along with an OPDS catalog to make it easy to browse directly from an ereader.
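To give a sense of what an OPDS catalog involves: it is an Atom feed whose entries link to downloadable books. The sketch below builds a bare-bones acquisition feed with the standard library. It is only an illustration, not a design for this project; the IDs, titles, and hrefs are made up, and a real catalog would likely need more (authors, self and start links, navigation feeds).

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ACQUISITION_REL = "http://opds-spec.org/acquisition"


def build_opds_feed(feed_id, feed_title, updated, books):
    """Return a minimal OPDS acquisition feed as an XML string.

    books is a list of dicts with 'id', 'title', 'updated', and 'href' keys.
    """
    # Serialize Atom elements without a prefix (xmlns="...").
    ET.register_namespace("", ATOM_NS)
    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    ET.SubElement(feed, f"{{{ATOM_NS}}}id").text = feed_id
    ET.SubElement(feed, f"{{{ATOM_NS}}}title").text = feed_title
    ET.SubElement(feed, f"{{{ATOM_NS}}}updated").text = updated
    for book in books:
        entry = ET.SubElement(feed, f"{{{ATOM_NS}}}entry")
        ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = book["id"]
        ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = book["title"]
        ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = book["updated"]
        # The acquisition link tells the ereader where to fetch the EPUB.
        ET.SubElement(
            entry,
            f"{{{ATOM_NS}}}link",
            {
                "rel": ACQUISITION_REL,
                "href": book["href"],
                "type": "application/epub+zip",
            },
        )
    return ET.tostring(feed, encoding="unicode")
```

An ereader app that supports OPDS would fetch a feed like this and offer each entry as a download, which is what would make browsing directly from the device possible.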