autodl-community / autodl-irssi

A community-driven fork of autodl-irssi
https://autodl-community.github.io/autodl-irssi/

download-duplicates=false (default behavior) sometimes doesn't work #106

Closed hardhouse closed 6 years ago

hardhouse commented 8 years ago

When trackers announce the same release at about the same time, autodl sometimes matches all of those releases:

18:08 -!- Irssi: Matched Main_Line_10-Stickem_Up-WEB-2012-ENTiTLED in xxx
18:08 -!- Irssi: Saved torrent Main_Line_10-Stickem_Up-WEB-2012-ENTiTLED, tracker1, total time 0.846 seconds
18:08 -!- Irssi: Matched Main_Line_10-Stickem_Up-WEB-2012-ENTiTLED in xxx, pre'd 45 seconds ago, tracker2
18:08 -!- Irssi: Saved torrent Main_Line_10-Stickem_Up-WEB-2012-ENTiTLED, tracker2, total time 6.480 seconds

OR

18:02 -!- Irssi: Matched Architects-All_Our_Gods_Have_Abandoned_Us-WEB-2016-ENTiTLED in xxx, pre'd 39 seconds ago, tracker1
18:02 -!- Irssi: Saved torrent Architects-All_Our_Gods_Have_Abandoned_Us-WEB-2016-ENTiTLED, tracker1, total time 0.501 seconds
18:03 -!- Irssi: Matched Architects-All_Our_Gods_Have_Abandoned_Us-WEB-2016-ENTiTLED in xxx, tracker2
18:03 -!- Irssi: Saved torrent Architects-All_Our_Gods_Have_Abandoned_Us-WEB-2016-ENTiTLED, tracker2, total time 77.692 seconds

P.S. My settings: max-saved-releases=1000, save-download-history=true

thebigmunch commented 8 years ago

What you've posted doesn't really help at all in debugging this. And any testing I've done in a test channel can't replicate it. This would be pretty hard to figure out if something is actually happening since I don't even use a combination of trackers where I could test under real world conditions.

For those reasons, I'm closing this as a 'won't fix' issue. If you or someone else figures out a cause and solution for this, feel free to reopen this issue or create a new one with the information.

hardhouse commented 8 years ago

The problem is the delay between matching the torrent and downloading it. In MatchedRelease.pm you download the torrent first (which may take some time) and only THEN call "$self->_addDownload()". Pseudo-script:

00:00:00 tracker1 matches torrent1
00:00:01 tracker1 is downloading torrent1
00:00:02 tracker2 matches torrent1; since tracker1 is still downloading torrent1, "_canDownload" returns true for tracker2
00:00:03 tracker2 is downloading torrent1
00:00:03 bug happens

So I think you should call "$self->_addDownload()" IMMEDIATELY after matching the torrent.
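The timeline above can be replayed as a small simulation. This is only a sketch in Python, not autodl-irssi's actual Perl code: `run`, `EVENTS`, and `record_on_match` are hypothetical names, and the "done" events are assumed completion times added for illustration.

```python
# Hypothetical model of the race described above; none of these names
# come from autodl-irssi itself.

RELEASE = "torrent1"

# (time in seconds, tracker, action); the "done" times are assumptions.
EVENTS = [
    (0, "tracker1", "match"),
    (2, "tracker2", "match"),
    (3, "tracker1", "done"),
    (4, "tracker2", "done"),
]


def run(events, record_on_match):
    """Replay the events and return the trackers that downloaded the release."""
    history = set()    # releases recorded as downloaded (what _addDownload would do)
    in_flight = {}     # tracker -> release currently being fetched
    downloaded = []
    for _time, tracker, action in sorted(events, key=lambda e: e[0]):
        if action == "match":
            if RELEASE in history:       # duplicate check (what _canDownload would do)
                continue
            in_flight[tracker] = RELEASE
            if record_on_match:
                history.add(RELEASE)     # record immediately after matching
        elif action == "done" and tracker in in_flight:
            history.add(in_flight.pop(tracker))  # record only after the fetch
            downloaded.append(tracker)
    return downloaded


print(run(EVENTS, record_on_match=False))  # both trackers download
print(run(EVENTS, record_on_match=True))   # only the first tracker downloads
```

With `record_on_match=False` the second match slips through because the history is still empty while the first fetch is in progress; recording at match time closes that window in this simplified model.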

thebigmunch commented 8 years ago

So, most of the delay isn't between matching and downloading, it's in downloading the torrent itself and checking the size. Moving the _addDownload call before this might be possible, but not ideal since we'd have to add another check to reset it if necessary. But it should probably be moved right after it instead of having them in the upload action subroutines. Another possible helper for this is to move (or add another) duplicate check later in the flow.
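The "add it early, reset it if necessary" idea mentioned here could look roughly like the following. It is a minimal Python sketch under stated assumptions: `DownloadHistory`, `try_reserve`, `release_reservation`, and `handle_match` are made-up names for illustration, and `fetch` stands in for downloading the torrent and checking its size.

```python
class DownloadHistory:
    """Hypothetical stand-in for the saved download history."""

    def __init__(self):
        self._seen = set()

    def try_reserve(self, release):
        """Record the release unless it is already known; return True on success."""
        if release in self._seen:
            return False
        self._seen.add(release)
        return True

    def release_reservation(self, release):
        """Undo an earlier reservation so a later announce can match again."""
        self._seen.discard(release)


def handle_match(history, release, fetch):
    """Reserve first, fetch second, and reset the reservation if the fetch fails."""
    if not history.try_reserve(release):
        return "duplicate"
    try:
        fetch(release)  # assumed to raise on a failed download or size check
    except Exception:
        history.release_reservation(release)
        return "failed"
    return "saved"
```

The extra reset step is the cost the comment refers to: without it, one failed fetch would permanently block every later announce of the same release.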

These are mostly just notes for me, but, if anyone feels like experimenting, let me know the results.

BigEd01 commented 8 years ago

I have 3 trackers I monitor for TV shows. It has always been first come, first served. But recently they have been hitting these servers at the same time and I would get 2-3 torrents of the same release. Three copies are rare, but I get two copies of the same release from different trackers for at least 3-4 shows a night. I wouldn't mind, but if one of them had a different NFO or SFV file it would cause the first, legit one to become corrupt, go red, and stop in rutorrent until I got rid of the second one and did a recheck on the first one, losing any seedback ratio I could have gotten. I ended up having to disable all but one tracker. Hopefully this can be corrected soon. Very seldom did the first torrent end up being bad, so I would prefer to see _addDownload called right after the first match and prior to the download of the actual files.

Bakkra commented 7 years ago

I can confirm that this is still happening. I get 2-3 consecutive downloads if the releases are uploaded 2-3 seconds apart on each tracker. I would really appreciate a fix; I'm sure this is annoying for many people.

Thank you!

thebigmunch commented 7 years ago

Whoops, forgot to post this link for testing. Just overwrite the existing file. This moves the _addDownload call to just after checking the size. I'm not sure how much this might help, but that's the point of testing.

This change already breaks some safety as it adds the download before the upload action is successful. But I don't think there's a viable solution without there being some kind of edge case on one end or the other.

Bakkra commented 7 years ago

Thank you, testing right now and will report back.

BigEd01 commented 7 years ago

Testing also. One day so far with two top trackers and no duplicates came through or any other problems seen. Will take a few more heavy days (Saturdays are quiet) but I'll also report back.

Bakkra commented 7 years ago

Everything looks fine so far, but to be fair I didn't have a lot of activity over the weekend.

BigEd01 commented 7 years ago

Ok, after some testing I am still getting some dupes through. I recorded the timestamps of the match and the save, which were the same in each case.

Saturday had 1:
13:54:40 Torrent 1
13:54:41 Torrent 2 (duplicate)

Sunday had 1:
19:59:18 Torrent 1
19:59:18 Torrent 2 (duplicate)

Monday was a quiet night.

Tuesday had 3 dupes come through:
17:52:45 Torrent 1
17:52:46 Torrent 2 (duplicate)
18:58:39 Torrent 1
18:58:40 Torrent 2 (duplicate)
19:59:23 Torrent 1
19:59:31 Torrent 2 (duplicate)

As they come in I repost them elsewhere. Each time they got fubar on my upload, as the second site would have slightly different files (sample folder and sfv file) which overwrote the original ones. 1 second isn't much, but the last pair was 8 seconds apart. Also, I just copied in the test file you provided but didn't restart the system or rtorrent. I will try an rtorrent restart tomorrow and see if there is any difference in the next few days.

Bakkra commented 7 years ago

@BigEd01 I think you need to restart

I've been using it for almost a week and it's looking fine.

BigEd01 commented 7 years ago

Update: After 3 good nights (and after the reboot) it looks like all is good. Had 2 duplicates come through, one at an 11 sec lapse, the other at a 59 sec lapse. Much, much better than before. Will continue to monitor this.

BigEd01 commented 7 years ago

Ok, back to only grabbing from one site at a time. This helped, but not enough. Still getting duplicates coming through. Most seem to be 1-2 seconds apart, but some are up to 1 minute apart, which I'm assuming is due to a slow download. My log file shows "Match so-so site X" followed by another identical "match so-so site Y". Most of the downloads end up erroring out because one may have a different sfv with a sample folder or file while the other site eliminated these, even though the torrent name is exactly the same. It seems it isn't adding to the match file fast enough, or on the actual match. I wouldn't mind an immediate write to the log, dealing with failures afterwards, which don't happen all too often anyway.

Bakkra commented 7 years ago

Still seeing some duplicates, but not as many as before. Is there another solution that would fix this permanently?

Thanks!

thebigmunch commented 7 years ago

I really should have taken the bug label off this, as it's not actually a bug. This is rather just a limitation imposed by reality. The change made is pretty much the extent of what can reasonably be done in autodl-irssi. There may be some larger rewrite that could eke out some more leeway, but it's probably not something I'd spend much time on. And it would never remove the possibility for the use cases that it affects anyway. There will always be some kind of delay between announces and determining if they match, so there will technically always be a chance of this happening. The other option brought up by someone in this thread, adding an announce as matched much earlier, before it's finally matched, would just present the problem in reverse: the first potential match could block an actual match that is announced before the first is fully done processing. And that would broaden the effect to more than just download-duplicates filters.
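The "problem in reverse" can be illustrated with a toy replay. This is a hedged Python sketch, not how autodl-irssi is implemented: `replay` and the event names "announce", "rejected", and "accepted" are hypothetical, and the model assumes an announce is reserved the moment it arrives and dropped if another announce already holds the reservation.

```python
def replay(events):
    """Replay (tracker, action) events under an assumed reserve-on-announce policy.

    Returns the trackers whose torrents ended up saved.
    """
    reserved_by = None  # tracker currently holding the early reservation
    saved = []
    for tracker, action in events:
        if action == "announce":
            if reserved_by is None:
                reserved_by = tracker
            # Otherwise the announce is dropped as a presumed duplicate.
        elif tracker == reserved_by:
            if action == "accepted":
                saved.append(tracker)
            elif action == "rejected":
                reserved_by = None  # reset, but dropped announces are gone
    return saved


# tracker2 announces while tracker1 is still being processed, then tracker1 fails.
overlapping = [
    ("tracker1", "announce"),
    ("tracker2", "announce"),
    ("tracker1", "rejected"),
    ("tracker2", "accepted"),
]

# Same announces, but tracker1 is fully processed before tracker2 arrives.
sequential = [
    ("tracker1", "announce"),
    ("tracker1", "rejected"),
    ("tracker2", "announce"),
    ("tracker2", "accepted"),
]

print(replay(overlapping))  # nothing is saved
print(replay(sequential))   # tracker2 is saved
```

In the overlapping case the release is lost entirely, which is the edge case on the other end: an early reservation trades duplicate downloads for missed ones.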

hardhouse commented 7 years ago

Still exists :(

04:26:44 -!- Irssi: Matched CiscoKid-_Pizzaman-WEB-2016-iDC in Music/MP3 (Music/MP3), xxx
04:26:44 -!- Irssi: Started command (...) CiscoKid-_Pizzaman-WEB-2016-iDC, xxxx, total time 1.238 seconds
04:26:44 -!- Irssi: Matched CiscoKid-_Pizzaman-WEB-2016-iDC in Music (Music), yyy
04:26:44 -!- Irssi: Started command (...) CiscoKid-_Pizzaman-WEB-2016-iDC, yyy, total time 0.901 seconds

BigEd01 commented 7 years ago

And this will continue to happen... It has been adjusted down as much as it can be for the time being...

And as thebigmunch has said, it would take a big rewrite to try to correct this behavior. Until the first one downloads successfully, the others coming in are fair game. In your case, as for the rest of us, matches with the same timestamp, or even a few seconds apart, will remain a problem.