y500 / libtorrent

Automatically exported from code.google.com/p/libtorrent

Torrent remains at 99% complete but continues to download and never finishes #283

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
I have just seen one of my torrents downloading endlessly. Although other forum 
users and I have seen this issue before, it seems quite rare (certainly with 
0.15) and nobody has been able to reproduce it.

I found the torrent in Deluge in the following state: 
Size: 348MB
Progress: 0.5% 
Downloaded: 2.2GB
Active Time: 1d7h

After a force recheck the progress percentage reset to 0%. 

I manually verified the torrent's single video file and it looked to be complete, 
so the percentage returned by force recheck is definitely wrong.

Restarted Deluge, performed another force recheck and the progress was reported 
as 99.34%. After this restart the torrent completes successfully.

This torrent progress percentage seems to match what other users are reporting 
where it manages 99% actual torrent progress (reported progress in Deluge 
varies wildly from this) but then after this point the torrent never completes 
and continues to download. 

libtorrent 0.15.9

Reference topics from Deluge: 
http://forum.deluge-torrent.org/viewtopic.php?f=7&t=37427
http://forum.deluge-torrent.org/viewtopic.php?f=7&t=35849
http://dev.deluge-torrent.org/ticket/1174

Original issue reported on code.google.com by caluml...@gmail.com on 26 Jan 2012 at 2:06

GoogleCodeExporter commented 8 years ago
gentoo linux for both seed and peer.  It is a very large file (18GiB) that I'm 
trying to send (this is a requirement for my project).  Note that this error 
does not always happen... It's when it does happen that it gets "stuck" and 
doesn't complete that's a problem.  

Original comment by cariser...@gmail.com on 10 Sep 2013 at 11:32

GoogleCodeExporter commented 8 years ago
I think the best way to make progress on figuring out what's happening is to 
know exactly which operation fails, and what the actual error code is.

would it be possible for you to strace the seeder? you can filter to just see 
file operations. You may get multiple log-files, one for each thread.

Alternatively, you could set a breakpoint in disk_io_thread.cpp at all places 
where set_error() is called with error::file_too_short argument, and see which 
one triggers.

The error message most likely is caused by one of those, which in turn are 
caused by a read() call returning fewer bytes than what was requested. From the 
man page:

"Upon successful completion, read(), readv(), and pread() return the number of 
bytes actually read and placed in the buffer.  The system guarantees to read 
the number of bytes requested if the descriptor references a normal file that 
has that many bytes left before the end-of-file, but in no other case."

So, I imagine this could be caused by the filesystem not providing that 
guarantee for some reason, or a bug in libtorrent where the last piece of a 
file is thought to be larger than it is.
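
To illustrate the short-read case described above, here is a minimal sketch of a read loop that only reports a short count when end-of-file is actually reached. This is not libtorrent's code: `read_fully` is a hypothetical helper name, and it assumes a POSIX system with `pread()`.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (not part of libtorrent): keeps calling pread() until
// `len` bytes have been read, end-of-file is reached, or a real error occurs.
// Returns the number of bytes read, which is less than `len` only when
// end-of-file was hit, or -1 with errno set on error.
ssize_t read_fully(int fd, void* buf, std::size_t len, off_t offset)
{
    assert(buf != nullptr);
    char* out = static_cast<char*>(buf);
    std::size_t total = 0;
    while (total < len) {
        ssize_t n = ::pread(fd, out + total, len - total,
                            offset + static_cast<off_t>(total));
        if (n < 0) {
            if (errno == EINTR) continue; // interrupted by a signal, retry
            return -1;                    // genuine I/O error
        }
        if (n == 0) break;                // end-of-file: the file really is short
        total += static_cast<std::size_t>(n);
    }
    return static_cast<ssize_t>(total);
}
```

With a loop like this, a result smaller than the requested size means the file ends before the requested range does, rather than that a single read() call happened to return early.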

If you force-recheck the torrent on the seeder, does it complete to 100%?

Original comment by arvid.no...@gmail.com on 11 Sep 2013 at 5:17

GoogleCodeExporter commented 8 years ago
Got the same bug on Linux Mint 16 with Deluge. 
It downloads more than required and I have to force recheck.

Original comment by sunny...@gmail.com on 21 Mar 2014 at 4:33

GoogleCodeExporter commented 8 years ago
@sunny.ej: would you be willing to help fix this? if so, read the previous 
posts and provide logs and/or more information.

Original comment by arvid.no...@gmail.com on 22 Mar 2014 at 1:47

GoogleCodeExporter commented 8 years ago
Oddly just had this happen after an upgrade to an SSD hd and 14.10 from 14.04 
LTS Ubuntu.

Original comment by natecap...@gmail.com on 31 Oct 2014 at 3:56

GoogleCodeExporter commented 8 years ago
I take it you had a lot of wasted / failed data. probably, lots of peers are 
corrupting some piece(s) without their torrent client realizing it, and just 
keep seeding corrupt data. I've seen this be especially common with mp3 files 
where some media players automatically scan directories and "fix" (i.e. 
corrupt) id3 tags.

Original comment by arvid.no...@gmail.com on 31 Oct 2014 at 5:53

GoogleCodeExporter commented 8 years ago
Maybe, it was just a single 50 meg rar file that wouldn't download correctly no 
matter how many times I paused, forced re-checked, restarted deluge.  I even 
thought something might be going on with the SSD so I ran the full smartctl 
scan.

I ended up just remoting to another machine w/utorrent on windows and snagging 
it there with no problems.  I tried it back at the deluge (after utorrent on 
windows worked fine) and it's still screwing up.

Original comment by natecap...@gmail.com on 31 Oct 2014 at 6:13

GoogleCodeExporter commented 8 years ago
I have the same problem with Deluge 1.3.10 (libtorrent 0.16.18).

My OS is: Linux note 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt2-1 (2014-12-08) 
x86_64 GNU/Linux

The client is seeding a 92 GB torrent (540 files not larger than 256 MB) from 
NTFS partition (ntfs-3g 1:2014.2.15AR.2-1) and after about 5 to 15 minutes of 
normal work it just stops with the following error message: "file too short: 
reading p: 19 b: 480 s: 16384 (read: 4096)". It is possible to continue seeding 
simply by clicking the "Resume" button or by restarting the client with no need 
to rehash or readd the torrent.

It seems that this problem only occurs when libtorrent's cache settings 
(disk_io_write_mode & disk_io_read_mode) are set to "disable_os_cache". I tried 
to reset them to default values and Deluge has been seeding with no problems 
ever since. The tweaking was necessary for me because the system file cache 
eats up too much memory, pushing out all other files and programs, and my 
system becomes a little too unresponsive.

I have emailed you a few strace dumps. I tried to exit the application right 
after the error and there were no more active torrents, so I guess the probable 
cause lies somewhere in the last few lines.

Sadly, I do not currently have enough time to build the debug version of the 
library, but I will try to do it in a day or two.

Original comment by uka...@gmail.com on 17 Dec 2014 at 8:18

GoogleCodeExporter commented 8 years ago
Interesting. Historically the problem of the system disk cache bloating and 
intruding on overall system performance has been Windows-specific, tied to the 
FILE_FLAG_RANDOM_ACCESS flag ( 
http://blog.libtorrent.org/2012/05/windows-disk-cache/ ).

My experience is that linux has a reasonably strict priority of pages in the 
page cache.

Original comment by arvid.no...@gmail.com on 17 Dec 2014 at 8:36

GoogleCodeExporter commented 8 years ago
ukawtf: btw, the problem you describe doesn't sound like it matches the title of 
this bug report. However, it sounds like you are experiencing two separate bugs 
(both of which I would like to fix).

1. when disabling OS-cache, you get the "file too short: " error.
2. when leaving the OS-cache enabled and seeding a very large torrent, you end up 
with a sluggish system, which appears to be caused by the page cache from 
reading those files intruding on other more critical pages, like those of 
running processes.

I'm a little bit more interested in fixing (2), because I would like to 
deprecate and remove the option to disable OS-cache (i.e. the use of O_DIRECT).

But let's start with bug 1. (if my description of the symptoms you see is 
correct, we may want to split these into separate tickets btw).

Would you mind trying flipping the coalesce_reads setting and see if that makes 
a difference? (just in order to narrow down where this error happens). Another 
thing to test would be to disable libtorrent's read disk cache and see what 
happens. There are exactly 3 places in the code that can produce this error.  
Those two settings control which one of them can be triggered.

as for bug 2. What tools do you use and what output do you see that indicates 
the problem is caused by a bloated filesystem read cache?
I take it you're sure it's not libtorrent's own disk cache that's getting too 
large, right?

Original comment by arvid.no...@gmail.com on 18 Dec 2014 at 3:53

GoogleCodeExporter commented 8 years ago
btw. in none of the traces you sent me do I see any mention of O_DIRECT. I 
would have expected to if you actually ran with disable_os_cache, and if that 
was the cause of the problem.

Original comment by arvid.no...@gmail.com on 18 Dec 2014 at 4:53