alphapapa / org-web-tools

View, capture, and archive Web pages in Org-mode
GNU General Public License v3.0

Thank you for writing this package #29

Open mpereira opened 4 years ago

mpereira commented 4 years ago

Hey @alphapapa,

Thanks a lot for putting your time into this package. It's crazy useful. org-web-tools-insert-web-page-as-entry is a staple in my workflow.

Feel free to close this. 🙂

alphapapa commented 4 years ago

Hi Murilo,

Thanks for the kind words. It's always encouraging to hear that it's useful to someone. Would you mind sharing a bit about your workflow?

I used to use that function in a capture template to capture Web pages to read later, but after my articles.org file grew to several megabytes in size, I decided to start saving pages' original HTML as attachments using the org-web-tools-archive commands instead. It is less searchable, of course, although I wrote some simple code in a branch to search the archives, and I've also found that Recoll can work to index and search them quickly.

mpereira commented 4 years ago

> I used to use that function in a capture template to capture Web pages to read later

My workflow is exactly that. Maybe I don't capture as many pages as you have and my articles.org isn't in the megabytes yet!

> I decided to start saving pages' original HTML as attachments using the org-web-tools-archive commands instead

"Archival" is a use case I have in mind as well. For example, I'm currently apartment hunting. The owners might take down the webpages for the apartments, making them unavailable, while I'd still like to go over the pictures, details, etc.

For that use case I've been using monolith from the shell. If I get some free time soon, I'll work on an Emacs Lisp function to archive webpages as attachments to Org headings using monolith. It's also handy for caching webpages locally for reading without internet access, on flights for example.
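
A minimal sketch of what such a function might look like, offered as an illustration rather than a tested implementation. It assumes the `monolith` executable is on the PATH and accepts `monolith URL -o FILE`, and that point is on a heading in an Org buffer. The name `my/org-attach-url-with-monolith` and the timestamp-based filename are hypothetical.

```elisp
(require 'org)
(require 'org-attach)

(defun my/org-attach-url-with-monolith (url)
  "Save URL with monolith and attach it to the Org heading at point.
Hypothetical helper: assumes the `monolith' executable is on PATH."
  (interactive "sURL: ")
  (unless (executable-find "monolith")
    (user-error "The monolith executable was not found"))
  (let* (;; Get the heading's attachment directory, creating it if needed.
         (attach-dir (org-attach-dir 'create))
         ;; Assumed naming scheme: a timestamp, so repeated saves don't collide.
         (file (expand-file-name
                (format-time-string "%Y%m%dT%H%M%S.html")
                attach-dir))
         ;; Run monolith synchronously; it writes a single HTML file to FILE.
         (status (call-process "monolith" nil nil nil url "-o" file)))
    (unless (and (integerp status) (zerop status))
      (error "Running monolith failed with status %s" status))
    ;; Update the heading's attachment state to reflect the new file.
    (org-attach-sync)
    (message "Archived %s to %s" url file)))
```

With point on a heading, `M-x my/org-attach-url-with-monolith` would prompt for a URL and store the saved page alongside the heading's other attachments.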

alphapapa commented 4 years ago

Cool. Tools like Monolith are interesting, but storing all assets base64-encoded into the HTML has some serious drawbacks to me, so I'm sticking with zip/tar archives for now. Thanks for the feedback!

xvrdm commented 4 years ago

Thanks for the great library!

Did you consider using something like SingleFile?

matiya commented 4 years ago

I also want to say thank you for this! It has been a dramatic change to the way I consume websites.

Whenever I see a longish article on the web, I read it as an Org file to free myself from the distractions of a full browser. I never save the articles, as I know I won't read them later; instead, I start condensing the information by deleting the parts that don't interest me. Finally, I review what is left and possibly extract Anki cards from the content (via the amazing anki-editor) if I think it's something that might be useful in the future.

xvrdm commented 4 years ago

That’s a very interesting alternative to copy-pasting parts of interest! Start from the whole thing and keep only what matters. Thanks for sharing!

alphapapa commented 4 years ago

> Thanks for the great library!

Thanks.

> Did you consider using something like SingleFile?

I think I've seen it before. It appears to be only a browser extension, not something I could use from Emacs. If there were a way to run a shell command that caused SingleFileZ to be used to save a page to an archive, that might be useful. Of course, doing so within a browser (rather than using Wget or archive.today) raises issues of unwanted page content, ads, scripts, etc.

xuchunyang commented 4 years ago

> a shell command that caused SingleFileZ to be used to save a page to an archive

The README mentions that it has a command-line tool as well: https://github.com/gildas-lormeau/SingleFile#command-line-interface

alphapapa commented 4 years ago

> > a shell command that caused SingleFileZ to be used to save a page to an archive
>
> The README mentions it has a command line tool as well https://github.com/gildas-lormeau/SingleFile#command-line-interface

Thanks, that's interesting. The setup is a bit much, and it uses Node.js, so I think I'll pass. But it may be useful to others.

xvrdm commented 4 years ago

Yes, I only suggested it because it does have a CLI, which I used with good results. But I understand if you aren’t too keen on the required stack.

I also pointed it out to the author of org-board, who might look into it.

elsatch commented 4 years ago

Thank you for all the effort you've put into this package!

I've been following your steps, trying to use org-web-tools from org-capture. So far, I have not been successful. Every time I call any of the functions from org-capture, I get a (wrong-number-of-arguments) error.

I've tried using %c as a parameter and calling the functions interactively, but so far I haven't managed to get it working.

Do you have any org-capture template example available somewhere?

Thanks in advance!

alphapapa commented 4 years ago

@elsatch Here's one I use:

;; An entry for the `org-capture-templates' list.
("cl" "Link to web page" entry
 (file+datetree "~/org/cpb.org")
 "* %(org-web-tools--org-link-for-url) :website:

%U %?"
 :clock-in t :clock-resume t :empty-lines 1)

elsatch commented 4 years ago

Thank you so much @alphapapa !

Eason0210 commented 4 years ago

Thank you very much for this package; it's very useful to me for converting websites to Org.