anacrolix / torrent

Full-featured BitTorrent client package and utilities
Mozilla Public License 2.0

panic upload rate limiter #759

Closed AskAlexSharov closed 2 years ago

AskAlexSharov commented 2 years ago

trace:

panic: upload rate limiter burst size < 2097152
goroutine 19871 [running]:
github.com/anacrolix/torrent.(*PeerConn).upload(0xc119a58900, 0xc090bd5b58)
    github.com/anacrolix/torrent@v1.44.0/peerconn.go:1582 +0x3b9
github.com/anacrolix/torrent.(*PeerConn).fillWriteBuffer(0xc119a58900)
    github.com/anacrolix/torrent@v1.44.0/peerconn.go:704 +0x8f
github.com/anacrolix/torrent.(*PeerConn).startWriter.func1()
    github.com/anacrolix/torrent@v1.44.0/peer-conn-msg-writer.go:24 +0xa6
github.com/anacrolix/torrent.(*peerConnMsgWriter).run(0xc119a58d20, 0xdf8475800)
    github.com/anacrolix/torrent@v1.44.0/peer-conn-msg-writer.go:69 +0xde
github.com/anacrolix/torrent.(*PeerConn).startWriter.func3()
    github.com/anacrolix/torrent@v1.44.0/peer-conn-msg-writer.go:40 +0xdd
created by github.com/anacrolix/torrent.(*PeerConn).startWriter
    github.com/anacrolix/torrent@v1.44.0/peer-conn-msg-writer.go:36 +0x1ec

my upload burst: 4 * 1024 * 1024

another similar panic:

panic: upload rate limiter burst size < 16384
goroutine 59848 [running]:
github.com/anacrolix/torrent.(*PeerConn).upload(0xc0a4075200, 0xc03c945b58)
    github.com/anacrolix/torrent@v1.44.0/peerconn.go:1582 +0x3b9
github.com/anacrolix/torrent.(*PeerConn).fillWriteBuffer(0xc0a4075200)
    github.com/anacrolix/torrent@v1.44.0/peerconn.go:704 +0x8f
github.com/anacrolix/torrent.(*PeerConn).startWriter.func1()
    github.com/anacrolix/torrent@v1.44.0/peer-conn-msg-writer.go:24 +0xa6
github.com/anacrolix/torrent.(*peerConnMsgWriter).run(0xc0a4075620, 0xdf8475800)
    github.com/anacrolix/torrent@v1.44.0/peer-conn-msg-writer.go:69 +0xde
github.com/anacrolix/torrent.(*PeerConn).startWriter.func3()
    github.com/anacrolix/torrent@v1.44.0/peer-conn-msg-writer.go:40 +0xdd
created by github.com/anacrolix/torrent.(*PeerConn).startWriter
    github.com/anacrolix/torrent@v1.44.0/peer-conn-msg-writer.go:36 +0x1ec

AskAlexSharov commented 2 years ago

It may happen when there is not enough RAM.

anacrolix commented 2 years ago

Thank you, I believe I know the cause and fix for this. It's triggered by performance increases in uploads in v1.45.0. You can use the previous version temporarily until I add the fix.

anacrolix commented 2 years ago

I didn't realise you were using v1.44.0 and seeing this. Is that correct?

anacrolix commented 2 years ago

Could you check https://github.com/anacrolix/torrent/blob/fd8995dcfd192a5678e329cb5688b5978382c422/global.go#L51? It's most likely that peers are requesting unexpectedly large chunks. There should be a potential memory issue but you're not triggering that here. It should also be possible to put a check here: https://github.com/anacrolix/torrent/blob/32cdaf4adad1102c951e222ad1137d3e2c02eb89/peerconn.go#L1007 that request lengths do not exceed ~128KiB, or your write burst size.

anacrolix commented 2 years ago

I'm working on some tests and fixes for this.

AskAlexSharov commented 2 years ago

After the fix, I see many messages like

*torrent.PeerConn 0xc0605c4d00 [id="-GT0003- @\xc2>դr\x9f\xe1\x99m\xc8", exts=0000000000100005, v="github.com/ledgerwatch/erigon (devel) (anacrolix/torrent unknown)"]: peer requested chunk too long (2097152)

and

*torrent.PeerConn 0xc009ac4d00 [id="-GT0003-tv\xea\xe4x\xd8\xe1,\x94\xaf)\xbb", exts=0000000000100005, v="github.com/ledgerwatch/erigon (devel) (anacrolix/torrent unknown)"]: peer requested chunk too long (16384)

My settings (and settings of previous versions):

const DefaultPieceSize = 2 * 1024 * 1024 // file piece
const DefaultNetworkChunkSize = DefaultPieceSize
torrentConfig.UploadRateLimiter = rate.NewLimiter(rate.Limit(uploadRate.Bytes()), 2*DefaultNetworkChunkSize)
torrentConfig.DownloadRateLimiter = rate.NewLimiter(rate.Limit(downloadRate.Bytes()), 2*DefaultNetworkChunkSize) 

This means the requested chunk (16384) is smaller than the burst size (2*DefaultNetworkChunkSize).

anacrolix commented 2 years ago

Are you setting ChunkSize anywhere, to anything other than the default? Are both uploader and downloader running erigon? It seems like you have the chunk size set to the piece length, which probably isn't what you want to do.

anacrolix commented 2 years ago

Yeah I can see that is what's happening: https://github.com/ledgerwatch/erigon/search?q=DefaultNetworkChunkSize. I recommend leaving the ChunkSize alone unless you know that it improves performance to go higher (and even then you will likely go up doubling each time and only sparingly).
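To make the "doubling each time" advice concrete, here is a rough sketch that lists candidate chunk sizes and how many requests each needs per piece. It assumes a 16 KiB starting chunk size (the conventional BitTorrent block size, matching the 16384 seen in the logs above) and the 2 MiB piece length from this thread; `chunkSizeCandidates` and `requestsPerPiece` are hypothetical helpers, not library functions:

```go
package main

import "fmt"

// defaultChunkSize assumes the conventional 16 KiB block size.
const defaultChunkSize = 16 * 1024

// chunkSizeCandidates returns chunk sizes starting at start and doubling
// until limit would be exceeded.
func chunkSizeCandidates(start, limit int) []int {
	var out []int
	for s := start; s > 0 && s <= limit; s *= 2 {
		out = append(out, s)
	}
	return out
}

// requestsPerPiece returns how many chunk requests are needed to fetch one
// piece, rounding up when the chunk size does not divide the piece length.
func requestsPerPiece(pieceLength, chunkSize int) int {
	return (pieceLength + chunkSize - 1) / chunkSize
}

func main() {
	const pieceLength = 2 * 1024 * 1024 // piece size used in this thread
	for _, cs := range chunkSizeCandidates(defaultChunkSize, pieceLength) {
		fmt.Printf("chunk %7d bytes: %3d requests per piece\n",
			cs, requestsPerPiece(pieceLength, cs))
	}
}
```

Each doubling halves the request count per piece, so measuring at each step shows where the gains flatten out before the chunk size reaches the piece length.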

AskAlexSharov commented 2 years ago

Yes, I set chunkSize to DefaultNetworkChunkSize for all torrents.

Actually, I want to increase it even further: chunkSize to 2*pieceSize.

My goal: reduce the number of random reads on the seeder and the number of network requests.

Soon we will serve 300 files, ~1 TB total.

anacrolix commented 2 years ago

Reducing random reads seems like a reasonable goal. In that case, maybe try setting your chunk size to something that minimizes overlap in files and is reasonably large. Note that large chunks reduce responsiveness in the protocol, because control and data messages share the same stream.

AskAlexSharov commented 2 years ago

Do I need to do something about https://github.com/anacrolix/torrent/issues/759#issuecomment-1166741622?

anacrolix commented 2 years ago

I don't think there's anything actionable there: I suspect you have some clients with burst sizes that are smaller than your other clients' chunk sizes (maybe old ones?). You could change the log level to something lower, but I'm not sure how you can get that message if all your clients are set up correctly. Let me know if there's some way I can reproduce it more accurately.