Open ag91 opened 7 years ago
Hi there,
Thanks very much for the great package: I use it with org-feed and finally getting feeds content is much more reliable!
That's very interesting! I wasn't aware of org-feed. So you use it as a feed reader, instead of elfeed or something else? I'd never thought of that.
I have the feeling that this can be abstracted in a function format article-contents which defaults to your template, but that can be configured by the user.
Yeah, that makes sense. The code that manipulates the contents after insertion can be moved into a function and called with a hook.
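To make the idea concrete, here is a rough sketch of what a user-configurable formatter could look like. Everything in it is hypothetical: `my/article-format-function`, `my/format-article-default`, `my/format-article-body-only`, and `my/format-article` are illustrative names that do not exist in org-web-tools, and the default template is a placeholder rather than the package's real one. It uses a function-valued variable instead of a hook, since the formatter has to return the text.

```elisp
;; Hypothetical sketch: none of these names exist in org-web-tools.
(defvar my/article-format-function #'my/format-article-default
  "Function called with TITLE, URL and CONTENT (all strings).
It should return the Org text to insert.")

(defun my/format-article-default (title url content)
  "Return CONTENT under an Org heading that links TITLE to URL.
This is a placeholder template, not the one org-web-tools uses."
  (format "* [[%s][%s]]\n\n%s" url title content))

(defun my/format-article-body-only (_title _url content)
  "Return CONTENT unchanged, with no heading at all."
  content)

(defun my/format-article (title url content)
  "Format an article by calling `my/article-format-function'."
  (funcall my/article-format-function title url content))
```

With something like this, a user who does not want the heading could set `my/article-format-function` to `#'my/format-article-body-only` in their init, while everyone else keeps the default.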
Thanks very much for the time spent in this!
Thanks for your feedback! I will try to get to this soon. :)
Hi, yes, I do. I like to read on my ereader, and with a bit of setup you can convert the org-feed file into an epub (or whatever you like). I very much appreciate elfeed, but I found it easier to hack/extend org-feed with what I needed.
Thanks again for the work on this!
P.S:
The bit of my init that sets up org-feed (it is hacky: I changed the guid to be the article weblink):
(defun my/org-feed-parse-rss-feed (buffer)
  "Parse BUFFER for RSS feed entries.
Returns a list of entries, with each entry a property list,
containing the properties `:guid' and `:item-full-text'."
  (require 'xml)
  (let ((case-fold-search t)
        entries beg end item guid entry)
    (with-current-buffer buffer
      (widen)
      (goto-char (point-min))
      (while (re-search-forward "<item\\>.*?>" nil t)
        (setq beg (point)
              end (and (re-search-forward "</item>" nil t)
                       (match-beginning 0)))
        (setq item (buffer-substring beg end)
              ;; Use the article link as the guid.
              guid (if (string-match "<link\\>.*?>\\(.*?\\)</link>" item)
                       (xml-substitute-special (match-string-no-properties 1 item))))
        (message "%s" (concat "the guid-link is:" guid))
        (setq entry (list :guid guid :item-full-text item))
        (push entry entries)
        (widen)
        (goto-char end))
      (nreverse entries))))
(defun my/org-feed-parse-rss-entry (entry)
  "Parse the `:item-full-text' field for xml tags and create new properties."
  (require 'xml)
  (let ((guid (plist-get entry :guid)))
    (with-temp-buffer
      (insert (plist-get entry :item-full-text))
      (goto-char (point-min))
      (while (re-search-forward "<\\([a-zA-Z]+\\>\\).*?>\\([^\000]*?\\)</\\1>"
                                nil t)
        (setq entry (plist-put entry
                               (intern (concat ":" (match-string 1)))
                               (xml-substitute-special (match-string 2))))
        ;; Restore the link-based guid in case a <guid> tag overwrote it.
        (setq entry (plist-put entry :guid guid)))))
  entry)
(defun my/org-get-content-html-as-org (url)
  "Return the contents of URL as Org text, without the heading."
  ;; Skip PDFs: we do not need to download them.
  (if (not (string-equal (file-name-extension url) "pdf"))
      (condition-case err
          ;; Drop the first two lines of the converted output.
          (s-join "\n" (cdr (cdr (s-lines (org-web-tools--url-as-readable-org url)))))
        (error (concat "Org-web-tools failed with: " (error-message-string err))))
    "This was not an HTML page."))
(defun my/get-feed-content (new)
  "Add the contents of each article to the description of its feed entry.
The article's HTML page is fetched and converted to Org."
  (let ((new-formatted
         (mapcar
          (lambda (e)
            (let* ((article-contents
                    (my/org-get-content-html-as-org (plist-get e :link)))
                   (e1 (plist-put e :description article-contents)))
              (org-feed-format-entry e1 my-for-org-feed/tag-template nil)))
          new)))
    (org-feed-add-items (point) new-formatted)))
(setq org-feed-alist
      `(("Hacker News"
         "https://news.ycombinator.com/rss"
         "/tmp/Feeds.org" "Hacker News"
         :parse-feed my/org-feed-parse-rss-feed
         :parse-entry my/org-feed-parse-rss-entry
         :new-handler my/get-feed-content)))
I don't plan to work on this myself, but if someone else is interested in contributing it, I'll be glad to consider merging it.
Hello,
Thanks very much for the great package: I use it with org-feed and finally getting feeds content is much more reliable!
About the issue: currently downloading a link results in something like
For my use case I do not need the ** Article heading. This is enforced in org-web-tools--url-as-readable-org in this bit here:
I have the feeling that this can be abstracted in a function format article-contents which defaults to your template, but that can be configured by the user. Something along the lines of:
Would that make sense? For now I am using a modified version of org-web-tools--url-as-readable-org, but I really would like to not miss any future enhancement of this nice package :) Thanks very much for the time spent in this!