BiglySoftware / BiglyBT

Feature-filled Bittorrent client based on the Azureus open source project
https://www.biglybt.com
GNU General Public License v2.0
1.59k stars 155 forks source link

Problems enabling additional files in large, partially-completed torrents, again #3403

Open ferdnyc opened 1 month ago

ferdnyc commented 1 month ago

I'm once again seeing issues when a large-scale (thousands of files) torrent is Seeding in a partially-completed state, and additional files are then enabled for download in its "Files" tab.

What's happening now (different from #3218) is, when new files are checked in the torrent's Files tab, the torrent goes from Seeding to Stopped, and can't be started again until it's manually Force Re-checked. (Without the Force Re-check, attempts to "Queue" instantly reset to Stopped state; nothing additional gets downloaded.)

I've seen this happen several times now, enough that it's risen from "weirdness" to "reportable pattern".

I'll try to reproduce in a more controlled manner and capture logging, but I have to wait for the torrent in question to finish its post-completion low-resource Checking run first, following the downloads that precipitated the most recent occurrence... which, due to the size of the torrent (1788 files in 37,194 pieces totaling > 40 GB) takes somewhere between 20 minutes and an hour each time.

(While I'm currently seeing this issue with the version below, I think it may have started sliiiiightly before this particular beta build, though "probably" not more than 2 or 3 weeks ago. Rough guesstimate.)

System block

Java 11.0.24 (64 bit) Red Hat, Inc. /usr/lib/jvm/java-11-openjdk-11.0.24.0.8-2.fc40.x86_64

SWT v4956r12, gtk/3.24.43, zoom=100, dpi=96 Linux v6.10.12-200.fc40.x86_64, amd64 (64 bit) B3.7.0.1_B02/4 az3 en

parg commented 1 month ago

Nothing has changed that would affect/cause this in a long time, so it isn't 2-3 weeks I can guarantee

ferdnyc commented 1 month ago

Hm. Entirely possible it's been longer, come to think of it. I've only handled like 2 or 3 of these massive torrents in the past 8 months.

ferdnyc commented 1 month ago

Checking is at 62.5%, my "20 minutes to an hour" estimate was definitely too low. (Especially since it was second in line behind another big job.)

parg commented 1 month ago

For a download to go into a stopped state when started there would almost definitely be an error logged somewhere indicating why

parg commented 1 month ago

Out of interest do you have any of the "file name change" settings active? (Options->Files [File Extensions]: Add suffix to incomplete filesetc)

Also what kind of file allocation are you using? (Options->Files: Allocate and zero new files on creation etc)

ferdnyc commented 1 month ago

So, it happened again, but with an interesting twist that, while the torrent once again "stopped itself" when I started enabling additional files, and refused to start when I tried immediately after (I grabbed logs), I WAS ABLE to start it ~10 minutes later when I tried again. Without having to do a Force Re-check.

So, it's possible this is just a question of timing, and the re-checking is actually a red herring. Could be all it was really doing was delaying my next attempt to start the torrent long enough for it to work again.

Out of interest do you have any of the "file name change" settings active? (Options->Files [File Extensions]: Add suffix to incomplete filesetc)

I do, BiglyBT is configured to add .part to incomplete files. But an interesting twist there is that, for torrents of this size/filecount, that doesn't work — I don't think it ever has. While files in smaller torrents get renamed, the ones in these big torrents don't. (Meaning, they download with their "final" name even when incomplete; the partial files are never named foo.part.) I've noticed that in passing on occasion, but mostly just shrugged and moved on, to be honest.

Also what kind of file allocation are you using? (Options->Files: Allocate and zero new files on creation etc)

image

Here's the sanitized console output from the torrent, covering the point where I started enabling new files, it stopped itself, and then failed to start when I clicked "Queue". It's almost interesting how uninteresting the logs are, everything appears very normal. But there are a few somewhat-cryptic lines that I don't know how to interpret, maybe they're clues to what's going on. (The debug log had nothing whatsoever. Its ONLY contents for that entire period are occasional exceptions due to an unrelated torrent having an unreachable hostname set for one of its trackers.)

BiglyBT-wontstart-TorrentConsoleLog.txt

Possibly-interesting lines (or lines I just don't understand the meaning of):

[08:57:21.064] {peer} Peer connection closed: download stopped (code_out 2); | Download: 'TORRENTNAME'; Peer: L: A.B.C.D:PORT1 [libTorrent 0.13.7]

[08:57:22.435] {tracker} Forcing tracker announce now via stop (TRTrackerBTAnnouncerImpl.java:932), stop (TRTrackerAnnouncerMuxer.java:1280), informStopped (DownloadManagerImpl.java:4749);    | Torrent: 'TORRENTNAME'
[08:57:22.435] {tracker} Next tracker announce (unadjusted) will be in 10s;     | Torrent: 'TORRENTNAME'
[08:57:22.435] {tracker} Next tracker announce (adjusted) will be in 67s;   | Torrent: 'TORRENTNAME'
[08:57:22.435] {tracker} Canceling announce trigger;    | Torrent: 'TORRENTNAME'
[08:57:22.436] {core} Stopped - state=70,error=0//0;    | Download: 'TORRENTNAME'
parg commented 1 month ago

The limit on the .part being added is caused by "Options->Files: Maximum file links per download". The implementation of "link" isn't great which is why it is a bad idea to apply it when a torren has loads of files.

Unfortuantely there's nothing "aha!" in your logs/settings :( Generally when you change the "wanted" setting for a file the download needs to be stopped and restarted. For example, if you delete a file then all peers need to be disconnected as your availability has changed and there is nothing in the peer-protocol to indicate this. A smarter implementation could just disconnect the peers but the simplest solution is to stop and restart the download.

As far as I can see from the code it attempts to pause+resume rather than stop+start though. Pausing is the same as stopping along with a flag that says "I'm paused by the way" - there is no distinct paused state - so logging might well show "stopped". In the UI thought it would show "paused" as this checks the flag.

State 70 is stopped State 75 is queued

Perhaps next time you switch a file state you could keep your eye on the overall download state - if it automatically proceeds to stopped (or paused) then wait for a while without doing anythign to the download to see if it eventually restarts. Avoid changing multiple file states quickly in case there's interference.

I have tried to reproduce it but have so far failed.

ferdnyc commented 1 month ago

The limit on the .part being added is caused by "Options->Files: Maximum file links per download". The implementation of "link" isn't great which is why it is a bad idea to apply it when a torren has loads of files.

Ohhh. That makes sense, I didn't realize it was handled using links. (Funny thing is, THIS particular torrent is only 20 files over the line, though in the past I've had torrents with 4000+ files so obviously that'd be a total non-starter.)

Hmmm. I wonder if that's somehow related? Because, if there were originally only 20 files keeping this torrent from creating .part links, then once it's partially completed it WOULDN'T be over the cutoff anymore.

parg commented 1 month ago

Shoudn't be - as far as I know the renaming is only applied when the download is initially added

ferdnyc commented 1 month ago

OK, so, same torrent. I checked the enabled box next to exactly one previously Do-not-downloaded file, and the torrent instantly flipped to Stopped. (Definitely not Paused; full Stopped.)

It did log this warning, which I hadn't seen before:

[10:51:41.412] {core} Trying to set state to downloading when state is not seeding;

...Then it logged three peer connections closed, the same tracker stuff as my previous logs, "Stopped - state=70", "Stopped - state=75", etc.

Despite the warning message to the contrary, the state WAS previously seeding, though. Until it stopped itself.

ferdnyc commented 1 month ago

Starting it back up again after ~5 minutes was successful, once again.

parg commented 1 month ago

meh :( still no idea