martinrotter / rssguard

Feed reader (and podcast player) which supports RSS/ATOM/JSON and many web-based feed services.
GNU General Public License v3.0

old feed as new feed #272

Closed danrobi11 closed 4 years ago

danrobi11 commented 4 years ago

Hello, I often see old, already-seen feed items show up as new. RSS Guard did this with older versions as well. I'm now testing the new AppImage 3.7.2 and get the same result. There's nothing new in these feeds, so they shouldn't pop up as new. The most obvious example would be my personal Reddit feeds (comments, messages): I keep getting older items in these all the time.

martinrotter commented 4 years ago

@danrobi11 I am not sure I completely understand what you are trying to say.

Are you saying that some old (and possibly already read) messages get marked as new/unread again by a feed update? Can you post a feed URL that shows this behavior, plus some screenshots perhaps?

danrobi11 commented 4 years ago

To reproduce: add your personal Reddit comments, messages and inbox feeds. I'm not posting my personal feeds here. reddit comments

martinrotter commented 4 years ago

@danrobi11 How did you manage to add your personal inbox? Doesn't it require some kind of authentication?

martinrotter commented 4 years ago

BTW, I see the error, investigating...

martinrotter commented 4 years ago

BTW, does the error happen for all Reddit feeds? If not, please help me set up my "personal inbox" or whatever.

martinrotter commented 4 years ago

BTW, it seems that those messages are NOT "duplicates", because they have different titles (and probably contents). Therefore RSS Guard cannot "merge" them. So far this does not look like a bug in RSS Guard; it is normal RSS/ATOM behavior. You would need to write a Reddit-specific RSS Guard plugin to support the advanced nuances of Reddit's infrastructure.

Let me know if I misunderstood, but at this point I can close this.

guihkx commented 4 years ago

Perhaps an option could be added to ignore entries that have the exact same URL? That would be a good way to improve entry deduplication, I think. For example, this entry was published earlier today in a blog I follow:

image

Then, a few hours later, they updated the title of the article to add new information (while keeping the exact same URL):

image

But RSS Guard created a new entry for this update.

martinrotter commented 4 years ago

Hello @guihkx.

RSS Guard decides whether a message is the "same" via two mechanisms:

  1. A special unique message ID (this is the case for synchronized accounts such as Inoreader, Nextcloud, Gmail and TT-RSS, where the service provides a unique identifier for each message, so it is very easy to determine the "same" message).
  2. For normal feeds, two messages are considered the "same" when they have equal title AND author AND URL, and of course belong to the same feed.

Therefore, if only the URL is the same and the title/author/contents differ, the message is added to RSS Guard's database and you see it in the message list as new, because it is not the same as any existing message. There are a NUMBER of feeds where the URL is the same but the other fields differ so much that it really is a different message, for example many forum/threaded feed subscriptions. So it is not that easy to write universal logic that suits all use cases.
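As a rough illustration of the two rules above, here is a minimal sketch in plain JavaScript. It is not RSS Guard's actual code: the function `isSameMessage` and the field names `customId`, `feedId`, `title`, `author` and `url` are assumptions made up for this example.

```javascript
// Hypothetical sketch of the comparison described above; not RSS Guard's real implementation.
function isSameMessage(a, b) {
  // Rule 1: synchronized accounts supply a unique ID for each message.
  if (a.customId && b.customId) {
    return a.customId === b.customId;
  }
  // Rule 2: normal feeds need the same feed AND equal title, author and URL.
  return a.feedId === b.feedId &&
    a.title === b.title &&
    a.author === b.author &&
    a.url === b.url;
}

const original = { feedId: 1, title: "Release notes", author: "blog", url: "https://example.com/post" };
const retitled = { ...original, title: "Release notes (updated)" };

console.log(isSameMessage(original, { ...original })); // true
console.log(isSameMessage(original, retitled));        // false
```

The second call shows the case from this thread: the URL is unchanged, but the edited title makes the entry count as a different message.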

There is, however, a solution: message filtering. You can write a custom JavaScript filter to ignore all messages whose URL is already in RSS Guard's DB. The script for you might look like this:

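function filterMessage() {
  // 2 = compare by URL: is a message with this URL already stored for this feed?
  if (msg.isDuplicateWithAttribute(2)) {
    return MSG_IGNORE;  // drop the incoming message
  } else {
    return MSG_ACCEPT;  // keep it
  }
}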

Or you can even extend the detection ACROSS all feeds. If you receive a message with URL XXX in one of your feeds, you can reject other messages with the same URL XXX in all your other feeds like this:

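function filterMessage() {
  // 2 = compare by URL, 16 = search all feeds instead of only this one
  if (msg.isDuplicateWithAttribute(2 | 16)) {
    return MSG_IGNORE;  // drop the incoming message
  } else {
    return MSG_ACCEPT;  // keep it
  }
}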
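To make the `2 | 16` argument less opaque, here is a small standalone sketch of how such bit flags could combine. The values 2 and 16 come from the filters above; the constant names `SAME_URL` and `ALL_FEEDS`, the function `isDuplicate` and the in-memory `db` array are hypothetical stand-ins, not RSS Guard's API.

```javascript
// Flag values are taken from the filters above; the names are made up for this sketch.
const SAME_URL = 2;
const ALL_FEEDS = 16;

// Toy stand-in for the duplicate check, run against a plain array instead of a database.
function isDuplicate(msg, db, flags) {
  return db.some((stored) => {
    // The URL must match when the SAME_URL bit is set.
    const urlMatches = (flags & SAME_URL) !== 0 && stored.url === msg.url;
    // Without the ALL_FEEDS bit, only messages from the same feed are considered.
    const feedMatches = (flags & ALL_FEEDS) !== 0 || stored.feedId === msg.feedId;
    return urlMatches && feedMatches;
  });
}

const db = [{ feedId: 1, url: "https://example.com/a" }];
const incoming = { feedId: 2, url: "https://example.com/a" };

console.log(isDuplicate(incoming, db, SAME_URL));             // false
console.log(isDuplicate(incoming, db, SAME_URL | ALL_FEEDS)); // true
```

The bitwise OR simply sets both bits in one number, so a single argument can carry "compare by URL" and "look in every feed" at the same time.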

Make sure to read the documentation carefully.

martinrotter commented 4 years ago

@guihkx Edited.

guihkx commented 4 years ago

Wow, you made such an amazing program! :)

I'm going to bed soon but I'll definitely give it a go tomorrow. Thanks!