alanmcgovern / monotorrent

The official repository for MonoTorrent, a bittorrent library for .NET
https://github.com/alanmcgovern/monotorrent
MIT License
1.15k stars 397 forks source link

[Bug] StreamProvider.CreateStreamAsync garbage output due to memory cache #359

Closed Deathspike closed 3 years ago

Deathspike commented 3 years ago

I'm using MonoTorrent 2.0-rev0017 with StreamProvider and CreateStreamAsync. The stream doesn't guarantee that bytes read from disk actually have been written.

https://github.com/alanmcgovern/monotorrent/blob/master/src/MonoTorrent/MonoTorrent.Streaming/LocalStream.cs#L113 This waits and switches on the availability bit (piece downloaded).

https://github.com/alanmcgovern/monotorrent/blob/master/src/MonoTorrent/MonoTorrent.Client.PieceWriters/DiskWriter.cs#L146 This buffers writes in-memory for performance. Therefore, no flush to disk until a threshold is passed.

This seems to result in the LocalStream reading while the DiskWriter hasn't flushed (all) data yet. Therefore the stream is unreliable and produces garbage data, at times.

I can see two solutions right now:

Disable the cache Even then, there's still an error window because read/write isn't synchronized. If a read request request occurs during the window in which the writer is still writing, the reader will produce garbage, too. This makes me conclude that the same issue exists in older versions of monotorrent, except it's less pronounced, because there's less time that data isn't available on-disk.

Synchronize reader to the writer This is the robust option. The local stream should be hooked into the writer, and only be able to read data that has been flushed to disk or is available in memory. This is also the most complex option.

alanmcgovern commented 3 years ago

Oh interesting.

Let me double check the impl here. This was on my checklist of things but I clearly forgot to verify that this went through the DiskManager when reading bytes from disk.

alanmcgovern commented 3 years ago

I'm just going to test this out with my favourite big buck bunny stream :)

alanmcgovern commented 3 years ago

If you are comfortable building from git, can you check out this branch? https://github.com/alanmcgovern/monotorrent/pull/361

You can generate a nuget locally by running 'pack.bat' inside a visual studio developer command prompt. Otherwise let me know and i can attach a nuget here!

Deathspike commented 3 years ago

Yeah, I'm comfortable with git.

First observation: The flushes in LocalStream don't perform well. It can take a second or more for the call to finish, even if there is nothing to flush. The throughput is now so slow, even a local file won't play without stuttering.

Second observation: The flushes don't seem to have the desired effect. I can't tell for sure. A video stream gets a lot of artifacts now, possibly because it reads a block but doesn't get the next in time. I would have to write a unit test to compare data to be 100% sure. I'll do that tomorrow.

alanmcgovern commented 3 years ago

No worries! There are some easy perf improvements to be had (flush by piece index, not by file) before this would merge.

Let me run it through a profiler to see if there's an obvious place to tweak! I am somewhat surprised the video stream still has artifacts for you though. I was able to download a torrent using the Stream.CopyTo to copy the data straight from the local stream to disk, and the result was a bit-identical version of the file. I'll test that aspect some more too!

Deathspike commented 3 years ago

It's very possible the artifacts are on my end!

I'm piping the stream into LibVLC; it's possible that VLC shows artifacts when it's missing the next piece.

I would write a unit test to indeed to a bit-by-bit copy and see if that works. But if you already did that, there's no need. Or should I try anyway?

alanmcgovern commented 3 years ago

I made some further tweaks and I'm definitely pretty certain now that we're getting a bit-identical copy. The 'streamed' file passes the bittorrent hash check if i verify it after the fact.

There was an off-by-one issue which my previous test papered over, and which VLC also seemed to be papering over. In my tests with it, it seeked to the end of the file and then the start of the file, which meant that the stream was always in the correct position.

What i realised later was that the LocalStream object was being returned after 1 byte was read (Position = 1 instead of Position = 0) so it's a likely cause of the issue you describe.

Definitely do double check the data is bit-identical just in case there are other issues I've missed. I'll need to add a few more tests covering these cases before merging the PR.

The latest changes in https://github.com/alanmcgovern/monotorrent/pull/361 should be better!

Deathspike commented 3 years ago

Looks good! Had a moment during my break, so I tested e1bc5af6bd3bb0f65475e7688222fcdf3f15f7d3 and I see that performance is excellent. Both for from reading start-to-finish and when seeking around in a stream. Everything the stream passes back seems match what I expect, too. I verified this a bunch of times using a stream copy and a CRC check against the source file. Great work 👍

alanmcgovern commented 3 years ago

Brilliant! I'll finesse a few tests to cover the new behaviours and merge this to master.

alanmcgovern commented 3 years ago

I've added tests covering the new cases and merged it all to master.

I'll issue a new alpha release next week I'd say, so if you have time to test a master build in the meantime please do!

Deathspike commented 3 years ago

Seems to work just fine!