nictuku / dht

Kademlia/Mainline DHT node in Go.

replace jackpal/bencode-go with IncSW/go-bencode #58

Closed btkador closed 6 years ago

btkador commented 6 years ago

jackpal/bencode-go currently uses a lot of resources.

Would it be easy to use IncSW/go-bencode instead?

I tried for a while but couldn't get the format/syntax right to make it work.
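For anyone comparing the two libraries, it may help to see what the bencode wire format looks like independent of either API. The sketch below is a toy encoder using only the Go standard library; it is not the API of jackpal/bencode-go or IncSW/go-bencode, and the "aa" transaction id is a made-up example value.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// encodeDict bencodes a flat string->string map. Bencode requires dict
// keys in sorted order, and every string is length-prefixed ("4:ping").
func encodeDict(m map[string]string) string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	b.WriteByte('d')
	for _, k := range keys {
		fmt.Fprintf(&b, "%d:%s%d:%s", len(k), k, len(m[k]), m[k])
	}
	b.WriteByte('e')
	return b.String()
}

func main() {
	// A minimal DHT ping query (hypothetical transaction id "aa").
	fmt.Println(encodeDict(map[string]string{"t": "aa", "y": "q", "q": "ping"}))
	// → d1:q4:ping1:t2:aa1:y1:qe
}
```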

Performance

Marshal

| Library | Time | Bytes Allocated | Objects Allocated |
| --- | --- | --- | --- |
| IncSW/go-bencode | 1493 ns/op | 554 B/op | 15 allocs/op |
| jackpal/bencode-go | 8497 ns/op | 2289 B/op | 66 allocs/op |

Unmarshal

| Library | Time | Bytes Allocated | Objects Allocated |
| --- | --- | --- | --- |
| IncSW/go-bencode | 3151 ns/op | 1360 B/op | 46 allocs/op |
| jackpal/bencode-go | 6850 ns/op | 3073 B/op | 102 allocs/op |
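Numbers in this shape (ns/op, B/op, allocs/op) come from Go's standard benchmark machinery. A minimal sketch of how such a table row is produced, with a stub standing in for a real Marshal call (the stub and its input are assumptions, not either library's code):

```go
package main

import (
	"fmt"
	"testing"
)

// marshalStub stands in for a real bencode Marshal function; in practice
// these figures are produced by "go test -bench . -benchmem".
func marshalStub(s string) []byte { return []byte(s) }

func main() {
	res := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			_ = marshalStub("d1:q4:ping1:t2:aa1:y1:qe")
		}
	})
	// NsPerOp, AllocedBytesPerOp, and AllocsPerOp map to the Time,
	// Bytes Allocated, and Objects Allocated columns above.
	fmt.Println(res.NsPerOp() >= 0, res.AllocedBytesPerOp() >= 0, res.AllocsPerOp() >= 0)
}
```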
nictuku commented 6 years ago

I'd like to use whatever Taipei Torrent is using so we should change both, maybe. @jackpal

jackpal commented 6 years ago

What is the use case where bencode performance matters enough to make this change worthwhile?

nictuku commented 6 years ago

There are a number of people (myself included) who run large scale DHT nodes that process thousands of packets per second. @btkador got some profiles showing bencode decoding is a top consumer of CPU cycles, unsurprisingly.

I'm in favor of improving performance, but I don't know if switching to the proposed repo is a good choice. Their tests don't seem as thorough as @jackpal's and jackpal has a long history of maintaining his projects. Perhaps it makes sense to improve jackpal's to reduce allocations?
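One common way to reduce allocations in a hot decode path, offered here only as an illustrative sketch (it is not what jackpal/bencode-go actually does), is to recycle scratch buffers with `sync.Pool`:

```go
package main

import (
	"fmt"
	"sync"
)

// bufPool recycles scratch buffers so a node handling thousands of
// packets per second doesn't allocate a fresh buffer per packet.
var bufPool = sync.Pool{
	New: func() interface{} { return make([]byte, 0, 4096) },
}

// handlePacket copies the packet into a pooled buffer; the copy is a
// stand-in for real decode work.
func handlePacket(pkt []byte) int {
	buf := bufPool.Get().([]byte)[:0]
	buf = append(buf, pkt...)
	n := len(buf)
	// Returning a plain slice boxes it into an interface; a *[]byte
	// wrapper avoids that in production code, but this keeps the sketch short.
	bufPool.Put(buf)
	return n
}

func main() {
	fmt.Println(handlePacket([]byte("d1:q4:pinge")))
	// → 11
}
```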

jackpal commented 6 years ago

I copied IncSW's benchmark test into go-bencode.

Looking through the code, I made an improvement to Unmarshalling:

Old: BenchmarkJackpalBencodeUnmarshal-8 300000 4136 ns/op 3056 B/op 99 allocs/op

New:

BenchmarkJackpalBencodeUnmarshal-8 500000 3417 ns/op 2593 B/op 72 allocs/op

I'll look at making Decode() run faster.

jackpal commented 6 years ago

I copied IncSW's implementation of decoding into go-bencode and then adapted it to go-bencode's API. The differences are:

1. go-bencode takes an io.Reader as a source of encoded data, rather than a []byte
2. go-bencode decodes strings as Go string objects, rather than []byte

The benchmark after these changes were incorporated is:

Old (before any optimization today): BenchmarkJackpalBencodeUnmarshal-8 300000 4136 ns/op 3056 B/op 99 allocs/op

New:

BenchmarkBencodeUnmarshal-8 50000 2879 ns/op 1728 B/op 64 allocs/op

I think that's as far as I can optimize while keeping the public API the same.

nictuku commented 6 years ago

That's awesome! @btkador, in that case I think we can close this issue. Further gains could perhaps come from adding new methods with a different API to jackpal/go-bencode, but I think that's overkill given our simple use cases.