fasheng / elfeed-protocol

Provide extra protocols to make RSS readers like Fever, NewsBlur, Nextcloud/ownCloud News and Tiny Tiny RSS work with elfeed
GNU General Public License v3.0

Handling duplicate entries #60

Closed ArtemSmaznov closed 1 year ago

ArtemSmaznov commented 1 year ago

Hi, based on the info in other issues that I've read through, this is more of a question than a bug report.

Like many others, I was using elfeed-org to manage my RSS feeds in Emacs and found this package while searching for a way to sync them to my phone. It works great for the most part, but I can't figure out how others solve the issue of duplicate entries when using elfeed-protocol.

I have exported my elfeed.org to my Nextcloud server so that it has the same folder structure.

1. When I just enable elfeed-protocol on top of my existing setup I get duplicate entries for each article. One from elfeed-org and one from Nextcloud

2. I have added :ignore: tag to the head of my elfeed.org file to stop getting feeds from it and only see the ones from Nextcloud and it worked, but no feeds have any tags (I don't think the ones from Nextcloud had any tags even before I added :ignore:)

Either option is making it quite unpleasant to use elfeed on the desktop. Is this a known limitation or am I missing something in my config? Here is my section for elfeed in my Doom Emacs config:

(use-package! elfeed
  :defer t
  :init
  (map! :leader
        :prefix "o"
        :desc "RSS News" :e "n" #'elfeed)
  :hook
  (elfeed-search-mode . elfeed-protocol-enable)
  (elfeed-search-mode . elfeed-update)
  :config
  (map! :mode elfeed-search-mode
        :localleader
        :desc "Toggle logs" :n "l" #'elfeed-goodies/toggle-logs
        :desc "Update"      :n "u" #'elfeed-update)
  (elfeed-set-timeout 36000)
  (setq
   elfeed-use-curl t
   elfeed-search-date-format '("%d-%m-%Y" 10 :left)
   elfeed-search-filter "@1-month-ago +unread"))

(use-package! elfeed-org
  :after elfeed
  :config
  (setq
   rmh-elfeed-org-files (list (expand-file-name "elfeed.org" org-directory))
   rmh-elfeed-org-tree-id "elfeed"
   rmh-elfeed-org-ignore-tag "ignore"))

(use-package! elfeed-protocol
  :after elfeed
  :config
  (setq elfeed-protocol-tags nil)
  (defadvice elfeed (after configure-elfeed-feeds activate)
    ;; "Make elfeed-org autotags rules works with elfeed-protocol."
    (setq
     elfeed-protocol-tags elfeed-feeds
     elfeed-feeds (list
                        (list (password-store-get-field "tools/elfeed-protocol" "url")
                              :password (password-store-get "tools/elfeed-protocol")
                              :autotags elfeed-protocol-tags))))
  (setq
   elfeed-protocol-enabled-protocols '(owncloud)
   elfeed-protocol-owncloud-star-tag 'star)
  (elfeed-protocol-enable)
  )

Would appreciate any feedback on this if there is a solution to dupes/tags problem.

PS

I don't understand quite a bit about elfeed so maybe I just need to delete the database and load it from scratch to solve one of these problems? Really didn't want to do that without good reason as I don't want to lose the starred articles I already have.

fasheng commented 1 year ago

Hi, sorry for the late reply. I don't understand why you would fetch articles from both elfeed-org and Nextcloud. Did you set up elfeed-feeds with two sources? If you follow the README's example code for defadvice elfeed, you should only fetch from Nextcloud. You could check the value of elfeed-feeds after elfeed has loaded.
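A minimal sketch of that check, assuming only that elfeed is loaded. The helper name `my/elfeed-show-feeds` is made up for illustration and is not part of elfeed or elfeed-protocol. Note that the output will include the `:password` value if one is stored in the feed entry.

```elisp
;; Hypothetical helper (illustrative name, not provided by any package):
;; print the current value of `elfeed-feeds' to the *Messages* buffer.
(defun my/elfeed-show-feeds ()
  "Echo the current value of `elfeed-feeds' and return it."
  (interactive)
  ;; `bound-and-true-p' avoids a void-variable error if elfeed
  ;; has not been loaded yet.
  (let ((feeds (bound-and-true-p elfeed-feeds)))
    (message "elfeed-feeds: %S" feeds)
    feeds))
```

Run it with M-x my/elfeed-show-feeds after opening elfeed. With the README setup working, the list should contain a single entry whose URL starts with a protocol prefix such as owncloud+https.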

ArtemSmaznov commented 1 year ago

Hi @fasheng, I have checked the README once again and updated my config (as far as I can see it was just a matter of renaming variables, so it seems to me it was already set up as instructed).

I have left 1 feed unignored in my elfeed.org and I still get duplicate entries for that feed. Both are tagged this time, though. I marked one as read in Nextcloud and now Emacs shows 1 read and 1 unread entry, so it still fetches from 2 sources and I don't quite get why.

Inspecting the elfeed-feeds variable after loading elfeed, I see this:

(("owncloud+https://user@mydomain.com/nextcloud" :password "password" :autotags
  (("https://www.gamingonlinux.com/article_rss.php" linux tech game))))

I guess that is how it is supposed to look? Any idea what else in my config might be pulling in duplicate entries? Please refer to the full config above. Or could it be related to how I am loading the 3 packages, so that elfeed-protocol is only loaded after the sources have been pulled from elfeed.org?

Here is the part that I updated with the rest of the config remaining the same:

(use-package! elfeed-protocol
  :after elfeed
  :config
  (defvar elfeed-protocol-orig-feeds nil
    "Store original content of `elfeed-feeds'.")
  (defadvice elfeed (after configure-elfeed-feeds activate)
    "Make elfeed-org autotags rules works with elfeed-protocol."
    (setq
     elfeed-protocol-orig-feeds elfeed-feeds
     elfeed-feeds (list
                   (list (password-store-get-field "tools/elfeed-protocol" "url")
                         :password (password-store-get "tools/elfeed-protocol")
                         :autotags  elfeed-protocol-orig-feeds))))
  (setq
   elfeed-protocol-enabled-protocols '(owncloud)
   elfeed-protocol-owncloud-star-tag 'star)
  )

fasheng commented 1 year ago

If you set up defadvice elfeed correctly like the example, your elfeed-feeds will contain only one feed (with a prefix like owncloud+https) and all the feeds in elfeed-org will be ignored; the defadvice just overrides them and uses elfeed-protocol-orig-feeds as :autotags.

I'm not sure what is going wrong on your side. Maybe you could check the elfeed-log buffer to see whether any feeds besides owncloud+https are being fetched.

Besides, even if your setup is correct, the old entries will stay in the database; you must delete them manually. So you should fetch the latest entries to check whether articles are still being duplicated.

ArtemSmaznov commented 1 year ago

That's pretty much what I am trying to figure out: what I have set up wrong. The elfeed log doesn't really show anything useful:

[2023-09-09 12:10:32] [info]: elfeed-org loaded 1 feeds, 0 rules
[2023-09-09 12:10:32] [info]: Elfeed update: September  9 2023 12:10:32 EDT
[2023-09-09 12:10:46] [info]: Elfeed update: September  9 2023 12:10:46 EDT

Would you be able to advise me on how I can delete entries manually? Do I just need to find and remove entries in the database file, or is there a better way?

ArtemSmaznov commented 1 year ago

I have just traced the value of elfeed-feeds through the loading of the packages, and I can see that just before the defadvice runs its value is

elfeed-protocol pre-defadice:((https://www.gamingonlinux.com/article_rss.php linux tech game))

and then it updates to the one I mentioned before. Does this look like the cause of the duplicates? It actually resets every time I load elfeed. I added some messages to the *Messages* buffer to trace it:

elfeed config:
elfeed-org config:
elfeed-protocol pre-config:
elfeed-protocol post-config:
elfeed-protocol pre-defadice:((https://www.gamingonlinux.com/article_rss.php linux tech game))
elfeed-protocol post-defadice:((owncloud+https://user@mydomain.com/nextcloud :password password :autotags ((https://www.gamingonlinux.com/article_rss.php linux tech game))))
elfeed-protocol pre-defadice:((https://www.gamingonlinux.com/article_rss.php linux tech game))
elfeed-protocol post-defadice:((owncloud+https://user@mydomain.com/nextcloud :password password :autotags ((https://www.gamingonlinux.com/article_rss.php linux tech game))))

Does that make sense to you? The last 2 lines are from when I relaunched the elfeed buffer.
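For anyone who wants to reproduce this kind of trace without editing the config in several places, here is a hedged alternative (not what I actually used): a variable watcher that logs every assignment to elfeed-feeds. It assumes Emacs 26.1 or later, where `add-variable-watcher` is available; the function name `my/elfeed-feeds-watcher` is made up for illustration.

```elisp
;; Hypothetical watcher (illustrative name): log every change to
;; `elfeed-feeds' in the *Messages* buffer.
;; Requires Emacs 26.1+ for `add-variable-watcher'.
(defun my/elfeed-feeds-watcher (symbol newval operation _where)
  "Log OPERATION on SYMBOL together with NEWVAL.
Emacs calls this just before the variable is actually changed."
  (message "%s %s -> %S" symbol operation newval))

(add-variable-watcher 'elfeed-feeds #'my/elfeed-feeds-watcher)

;; To stop tracing again:
;; (remove-variable-watcher 'elfeed-feeds #'my/elfeed-feeds-watcher)
```

Each `setq` of elfeed-feeds, whether from elfeed-org or from the defadvice, then shows up as one line in *Messages*, in the order in which it happened.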

fasheng commented 1 year ago

ISSUE_TEMPLATE shows how to change elfeed-log-level:

(setq elfeed-log-level 'debug)

https://github.com/fasheng/elfeed-protocol/blob/master/.github/ISSUE_TEMPLATE.md

It looks like the defadvice works well. Your duplicate entries should only be the old entries. Why not just ignore them, since it will not happen again from now on?

I have no idea how to delete the old duplicated entries manually; maybe you could ask in the upstream elfeed repo.

BTW: I wrote a helper function to toggle showing entries for a specific feed, hope it helps: https://github.com/skeeto/elfeed/issues/216

fasheng commented 1 year ago

It looks like someone has opened a similar issue: https://github.com/skeeto/elfeed/issues/392

ArtemSmaznov commented 1 year ago

Thanks for the resources. I have enabled debug logging and removed last 2 entries from my feeds. And marked that entry as unread on nextcloud.

When I launch elfeed I get 1 entry for that article with the following in the log:

[2023-09-09 14:05:13] [info]: elfeed-org is set up to handle elfeed configuration
[2023-09-09 14:05:16] [info]: elfeed-org loaded 1 feeds, 0 rules
[2023-09-09 14:05:16] [info]: Elfeed update: September  9 2023 14:05:16 EDT
[2023-09-09 14:05:16] [debug]: retrieve (https://www.gamingonlinux.com/article_rss.php)

if I do elfeed-update then a second entry for it appears with the following in the log:

[2023-09-09 14:07:36] [info]: Elfeed update: September  9 2023 14:07:36 EDT
[2023-09-09 14:07:36] [debug]: retrieve (https://mydomain.com/nextcloud/index.php/apps/news/api/v1-2/feeds)
[2023-09-09 14:07:36] [debug]: elfeed-protocol-owncloud: found 14 feeds
[2023-09-09 14:07:36] [debug]: elfeed-protocol-owncloud: update entries with action update-since-time, arg 1694282477
[2023-09-09 14:07:36] [debug]: retrieve (https://mydomain.com/nextcloud/index.php/apps/news/api/v1-2/items/updated?type=3&lastModified=1694282477)
[2023-09-09 14:07:36] [debug]: elfeed-protocol-owncloud: parsing entries, first-entry-id: 752 last-entry-id: 752 last-modified: 1694282477
[2023-09-09 14:07:36] [debug]: elfeed-protocol-owncloud: parsed 1 entries(1 unread, 0 starred) with 0.000545s, first-entry-id: 752 last-entry-id: 752 last-modified: 1694282477

So I still get duplicate entries. Could it be that this part is not working? I am not sure whether having to call elfeed-update manually is expected.

  :hook
  ((elfeed-search-mode . elfeed-protocol-enable)
   (elfeed-search-mode . elfeed-update))

ArtemSmaznov commented 1 year ago

Looks like I am getting somewhere: removing the hook and adding (elfeed-protocol-enable) at the end, as in the documentation, seems to be working. I just need to figure out where I can add (elfeed-update) now. Let me experiment a bit more with my config and I will close the issue afterwards. Thanks for all the info you provided, @fasheng, that was very helpful.

ArtemSmaznov commented 1 year ago

OK, I can confirm that the core of the issue lay in me trying to move (elfeed-protocol-enable) and (elfeed-update) into hooks. As soon as I moved them to match the instructions more closely, it started looking good, and I am now only getting single entries from Nextcloud, tagged as per my elfeed.org.
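My best guess at why the hooks caused this, offered as an assumption rather than something confirmed in this thread: a mode hook runs inside the call to elfeed, while an "after" advice only runs once that call has returned. So an elfeed-update triggered from the hook would still see the elfeed-org feeds, before the advice replaces them. The self-contained sketch below shows just that ordering with made-up names (everything prefixed `my/demo-` is hypothetical), and it uses `advice-add` instead of `defadvice` only to keep it short.

```elisp
;; Stand-alone demo of hook vs. after-advice ordering.
;; All `my/demo-' names are hypothetical and unrelated to elfeed.
(defvar my/demo-log nil "Events, most recent first.")
(defvar my/demo-mode-hook nil "Stand-in for a mode hook.")

(defun my/demo-open ()
  "Stand-in for the command that enables the mode and runs its hook."
  (run-hooks 'my/demo-mode-hook))

(defun my/demo-update ()
  "Stand-in for an update started from the mode hook."
  (push 'hook-update my/demo-log))

(defun my/demo-rewrite-feeds (&rest _args)
  "Stand-in for the advice that rewrites the feed list."
  (push 'advice-rewrite my/demo-log))

(add-hook 'my/demo-mode-hook #'my/demo-update)
(advice-add 'my/demo-open :after #'my/demo-rewrite-feeds)

(my/demo-open)
(reverse my/demo-log) ; => (hook-update advice-rewrite)
```

The update fires first and the rewrite second, which would match the log above where the first retrieve went straight to the original feed URL.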

If anyone stumbles upon this in the future I know it is hard to find full examples so here is my full elfeed setup for doom emacs:

Elfeed

(use-package! elfeed
  :defer t
  :init
  (map! :leader
        :prefix "o"
        :desc "RSS News" :e "n" #'elfeed)
  :config
  (map! :mode elfeed-search-mode
        :desc "Remove Selected" :n "D" #'my/elfeed-search-remove-selected

        :mode (elfeed-search-mode elfeed-show-mode)
        :localleader
        :desc "Toggle logs" :n "l" #'elfeed-goodies/toggle-logs
        :desc "Update"      :n "u" #'elfeed-update)

  (elfeed-set-timeout 36000)
  (setq
   elfeed-log-level 'info
   elfeed-goodies/log-window-position 'left
   elfeed-goodies/wide-threshold 0.3
   elfeed-goodies/show-mode-padding 1
   elfeed-goodies/entry-pane-size 0.5
   elfeed-goodies/feed-source-column-width 20
   elfeed-use-curl t
   elfeed-search-date-format '("%d-%m-%Y" 10 :left)
   elfeed-search-filter "@1-month-ago +unread")

  (defun my/elfeed-db-remove-entry (id)
    "Removes elfeed entry for given ID"
    (avl-tree-delete elfeed-db-index id)
    (remhash id elfeed-db-entries))

  (defun my/elfeed-search-remove-selected ()
    "Remove selected entries from elfeed database"
    (interactive)
    (let* ((entries (elfeed-search-selected))
           (count (length entries)))
      (when (y-or-n-p (format "Delete %d entries?" count))
        (cl-loop for entry in entries
                 do (my/elfeed-db-remove-entry (elfeed-entry-id entry)))))
    (elfeed-search-update--force)))

Elfeed Org

(use-package! elfeed-org
  :after elfeed
  :config
  (setq
   rmh-elfeed-org-files (list (expand-file-name "elfeed.org" org-directory))
   rmh-elfeed-org-tree-id "elfeed"
   rmh-elfeed-org-ignore-tag "ignore"))

Elfeed Protocol

(use-package! elfeed-protocol
  :after elfeed
  :config
  (defvar elfeed-protocol-orig-feeds nil
    "Store original content of `elfeed-feeds'.")
  (defadvice elfeed (after configure-elfeed-feeds activate)
    "Make elfeed-org autotags rules works with elfeed-protocol."
    (setq
     elfeed-protocol-orig-feeds elfeed-feeds
     elfeed-feeds (list
                   (list (password-store-get-field "tools/elfeed-protocol" "url")
                         :password (password-store-get "tools/elfeed-protocol")
                         :autotags  elfeed-protocol-orig-feeds)))
    (elfeed-update))
  (setq
   elfeed-protocol-enabled-protocols '(owncloud)
   elfeed-protocol-owncloud-star-tag 'star)
  (elfeed-protocol-enable))

Thanks once again @fasheng, I would not have figured this out without your info. And as you can see, I have incorporated the resources that you shared.