Novik / ruTorrent

Yet another web front-end for rTorrent

RSS plugin should use guid for history management #2637

Open de666 opened 9 months ago

de666 commented 9 months ago

Please complete the following tasks.

Tell us about your environment

ruTorrent: v4.2.9

Tell us how you installed ruTorrent

using https://github.com/crazy-max/docker-rtorrent-rutorrent/releases/tag/4.2.9-0.9.8_2-0.13.8_2-r0

Describe the bug

When the RSS plugin is used with a feed that sometimes changes the final link while keeping the same guid (as Jackett does), history is not managed correctly. The plugin keys its history on the final download URL (the link element) instead of the guid. The guid, if present and valid, should be used as the history reference because it is unique and does not change.

Steps to reproduce

  1. Add an RSS feed from Jackett.
  2. Load some torrents. They are loaded and marked as loaded in the UI.
  3. Wait a couple of hours.
  4. Reload the feed.
  5. The previously loaded torrents are no longer marked as loaded, because the link element has changed for those same torrents.
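
The steps above can be simulated with a minimal sketch. This is not the plugin's code: the item dictionaries, the `jackett.example` URLs, and the `is_loaded` helper are all hypothetical, and only illustrate why a history keyed on the link loses track of an item whose guid is stable.

```python
# Hypothetical feed item as seen on two successive fetches:
# the guid is stable, the download link changes (e.g. a rotating token).
first_fetch = {
    "guid": "tracker-12345",
    "link": "https://jackett.example/dl?id=12345&token=aaa",
}
second_fetch = {
    "guid": "tracker-12345",
    "link": "https://jackett.example/dl?id=12345&token=bbb",
}


def is_loaded(history, item, key):
    """Return True if the item's value for `key` was recorded in history."""
    return item[key] in history


# History recorded after the first fetch, once keyed by link and once by guid.
link_history = {first_fetch["link"]}
guid_history = {first_fetch["guid"]}

# After the reload, only the guid-keyed history still recognizes the item.
loaded_by_link = is_loaded(link_history, second_fetch, "link")
loaded_by_guid = is_loaded(guid_history, second_fetch, "guid")
print(loaded_by_link, loaded_by_guid)
```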

Expected behavior

The RSS plugin should correctly recognize torrents that were already loaded. This bug also breaks rssmanager: if a torrent has already been loaded, completed, and removed from the active torrent list, it is loaded again.

Additional context

No response

TrimmingFool commented 9 months ago

@de666 Thanks for this!

Currently, the <link> is used to identify an RSS item. However, according to the spec, using the <guid> as the item identifier instead seems to be correct: https://www.rssboard.org/rss-specification#hrelementsOfLtitemgt
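
A rough sketch of what "guid first, link as fallback" could look like, written in Python rather than the plugin's PHP. The `item_key` helper and the sample feed are assumptions made for illustration and say nothing about how rss.php is actually structured.

```python
import xml.etree.ElementTree as ET


def item_key(item):
    """Pick a history key for an RSS <item>: the guid if present and
    non-empty, otherwise the link (hypothetical helper)."""
    guid = item.findtext("guid")
    if guid and guid.strip():
        return guid.strip()
    link = item.findtext("link")
    return link.strip() if link and link.strip() else None


# Made-up feed: the first item carries a guid, the second only a link.
FEED = """<rss version="2.0"><channel>
<item>
  <title>A</title>
  <link>https://example.invalid/dl?token=aaa</link>
  <guid isPermaLink="false">item-1</guid>
</item>
<item>
  <title>B</title>
  <link>https://example.invalid/b.torrent</link>
</item>
</channel></rss>"""

keys = [item_key(item) for item in ET.fromstring(FEED).iter("item")]
print(keys)
```

Falling back to the link keeps feeds without a guid working as they do today, since the guid element is optional in RSS 2.0.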

I am not sure if only changing rRSSHistory will be enough, though: https://github.com/Novik/ruTorrent/blob/2d67a00b4fde90b4a652c130ed76ac2fc4383fb3/plugins/rss/rss.php#L301

If no one else wants to take this on, I think I would, eventually.

stickz commented 8 months ago

@TrimmingFool Version 4.3 will be released in the near future. Would you like to sneak this important change in?

ranirahn commented 7 months ago

[Screenshot: ss (2024-04-26 at 01 13 18)]

Is this related to the problem where the same torrent gets added over and over again? It only stops when the torrent finally falls out of the RSS feed.

de666 commented 7 months ago

> Is this related to the problem where the same torrent gets added over and over again?

I think so. I had the same issue and came to the conclusion that this was the reason. Indeed, it makes sense when an RSS feed has links that change every time, even though they point to the same torrent hash.

ranirahn commented 7 months ago

> > Is this related to the problem where the same torrent gets added over and over again?
>
> I think so. I had the same issue and came to the conclusion that this was the reason. Indeed, it makes sense when an RSS feed has links that change every time, even though they point to the same torrent hash.

I looked at my RSS feed more closely, and the only thing I saw changing was the time. Your feed had links that changed, but I think that is not the only possibility. Most RSS feeds don't change the time for torrents, but this one does: it is the exact same torrent, but <pubDate> is new on every RSS update, and it gets downloaded again. Of course, it makes sense that multiple torrents with the same name but different dates might not be the same torrent. Here, though, the hash is the same, so I am not sure whether identifying torrents by guid will fix my issue too. I will wait for the fix and see whether it solves my case as well.

ranirahn commented 6 months ago

I have looked into this issue, and I think changing from link to guid will not fix my case. Currently the link, hash, and timestamp are checked, and if any one of them differs, the torrent is downloaded again. Some feeds have a new timestamp on every feed update, so torrents that have already been deleted will be downloaded again because the time has changed, even if torrents are identified by guid.

The hash should be enough to identify the content of a torrent, and I don't think the time or link needs to be checked again, because the data has not changed. The plugin probably downloads torrents again and again even when the torrent and its data have not been deleted, but that does not show because the torrent is already in the torrents table and the duplicate gets merged into it. This should not happen just because the timestamp or link is different either. The most important thing is the content of the torrent, and that is defined by the hash.

So I tried to think of what the downside would be of getting rid of the link and timestamp when checking torrent history, and I could not think of any. The hash should be enough to identify torrents that are already loaded and should not be downloaded again. Am I wrong?
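
To make the comparison concrete, here is a small sketch of the two checks. The strict check only models the behavior as described in this comment (link, hash, and timestamp must all match); it is not taken from the plugin's source. `FeedItem`, `seen_strict`, `seen_by_hash`, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedItem:
    link: str
    info_hash: str
    timestamp: int  # feed-supplied time, e.g. parsed from <pubDate>


def seen_strict(history, item):
    """Model of the described behavior: an item counts as seen only if
    link, hash, and timestamp all match a history record."""
    return any(
        h.link == item.link
        and h.info_hash == item.info_hash
        and h.timestamp == item.timestamp
        for h in history
    )


def seen_by_hash(history, item):
    """Proposed alternative: an item counts as seen if its hash is known.
    Assumes the hash is available when the check runs."""
    return any(h.info_hash == item.info_hash for h in history)


# Same torrent, but the feed reports a new timestamp on the next update.
history = [FeedItem("https://example.invalid/t/1", "abc123", 1714000000)]
refreshed = FeedItem("https://example.invalid/t/1", "abc123", 1714003600)

strict_result = seen_strict(history, refreshed)  # timestamp differs
hash_result = seen_by_hash(history, refreshed)   # hash matches
print(strict_result, hash_result)
```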