ssbc / ssb-ebt

secure scuttlebutt replication with epidemic-broadcast-trees
MIT License
18 stars 10 forks source link

Db2 formats split #68

Closed arj03 closed 2 years ago

arj03 commented 2 years ago

This is https://github.com/ssbc/ssb-ebt/pull/67 converted to work with latest db2 https://github.com/ssbc/ssb-db2/pull/347 branch

staltz commented 2 years ago

@arj03 I just pushed a commit to formats-split to use WeakMaps in buttwoo feed format. Worth trying, might be important for performance.

arj03 commented 2 years ago

Without the previousKeyBFE check this was around 6800. This branch with db2 3947093 is 6300. This branch with db2 4e68371 (weakmap) is 6300-6500 so almost no change.

arj03 commented 2 years ago

I can do a deep dive on the performance a bit later. Yesterday I was mostly focused on getting to know the new code and getting the performance tests working again.

arj03 commented 2 years ago

It appears the performance difference is mainly focused on validate. ssb-buttwoo runs on data (already extracted data) and is passed in the prev msg key.

staltz commented 2 years ago

Ok, good feedback. I'll think about that. Buttwoo cut so many shortcuts, that it's hard to make a proper plugin architecture. But there may still be ways to optimize. I'm thinking that WeakMap should help with the case of previous msg key, because getFeedId has a cache. But if I remember correctly, validate isn't using the WeakMap to get the previous msg key, so that's one opportunity.

staltz commented 2 years ago

@arj03 Try out the latest formats-split branch. I used a WeakMap to cache the [encodedValue, signature, contentBuf] array. See commit https://github.com/ssbc/ssb-db2/pull/347/commits/ce718a57e4ebbff31ad0345a7a38cd47b7a7c1d3. Here's a before and after of the unit tests, showing how many times bipf.decode() is used to get that 3-item array.

Before

# add buttwoo-v1
-> extracting a989a86c504fe619
-> extracting a989a86c504fe619
-> extracting a989a86c504fe619
-> extracting a989a86c504fe619
ok 59 no err
ok 60 should be strictly equal
-> extracting a739e27a014e06e3
-> extracting a739e27a014e06e3
-> extracting a989a86c504fe619
-> extracting a739e27a014e06e3
-> extracting a739e27a014e06e3
ok 61 no err
ok 62 should be strictly equal

After

# add buttwoo-v1
-> extracting a989a86c504fe619
ok 59 no err
ok 60 should be strictly equal
-> extracting a739e27a014e06e3
ok 61 no err
ok 62 should be strictly equal
arj03 commented 2 years ago

@staltz I'm writing benchmarks in db2 now, so we can easier test the different parts.

staltz commented 2 years ago

Wow, quite a subtle optimization that was, appendOpts. Good find. This can only mean that GC was previously being triggered, causing other CPU work to pause. Do we have more of these GC pauses caused by frequent allocations?

arj03 commented 2 years ago

@staltz I noticed that while the hacky version of buttwoo was quite stable, this new formats-split can change +/- 10% between runs. Might be related to Weakmap?

staltz commented 2 years ago

@arj03 I have no idea, would have to take a close look. Sometimes GC can bring in some unpredictable behavior, because we don't control when does GC kick in. Might be that in one run there was no GC, while in another run there were a few GC.

arj03 commented 2 years ago

No worries, we can always circle back to that one, it's not super important right now. This repo is now up2date with the formats-split pr. Index tests are failing because meta-feeds doesn't seem to work well with the formats-split db2 branch.

arj03 commented 2 years ago

An update on this. Since last time there has been some optimizations to classic as well. So might be good to give some numbers from the previous branch (for classic this is how db2 was before the formats-split branch.

hacky butt2 branch

buttwoo: 7700
classic large: 5600
classic small: 8600

formats-split (this branch with ssb-buttwoo@0.3.0)

buttwoo: 6900
classic large: 6700
classic small: 9600

So classic got faster which is really nice. I remember seeing buttwoo with the formats-split branch at around 7200 before, still I consider the numbers good enough. We can always go back and see if we can optimize some more.

staltz commented 2 years ago

Hmm, some good news and some bad news. I think one thing I did in ssb-buttwoo recently was extractVal() thinking that it would be good for perf but it's also possible it made perf slower.

arj03 commented 2 years ago

An update on the performance. ssb-buttwoo@0.3.2 with latest bfe and uri tweaks is now at 7600. So somewhere between small and large classic just like the hacky branch was :sunglasses:

arj03 commented 2 years ago

There was one thing nagging my mind from the previous PR: Sending is a bit slower, so if you would only measure the receiving side, then the numbers would be better for butt2.

I modified the adding of messages a bit because this is basically what get if you just benchmark the receiving end, at least the muxrpc / secret handshake overhead should be the same. With that I measured how long it takes to add 15k messages and to make things fair I threw in a JSON.parse in the classic case.

buttwoo large: 2.3s
buttwoo small: 2.15s
classic large: 3.5s
classic small: 2.1s

The good news here is that if you onboard someone then buttwoo should be faster because you'll have a mix of smaller and larger messages in there.

arj03 commented 2 years ago

There seems to be something with debouncer here and then meta-feeds needs to be updated to work with the new db2 branch. This is why the index tests are failing.

staltz commented 2 years ago

Can you explain what's up with debouncing?

arj03 commented 2 years ago

I have not had time to debug it, you can try disabling it in add, then the tests run until the index test in formats

arj03 commented 2 years ago

Tests are passing now. I tested the latest version and performance is the same as the numbers above (just to be sure the latest changes didn't change anything). I would like to keep the benchmarking code in, I can move it to another file if that is better. Otherwise this should be ready for review.

staltz commented 2 years ago

Yeah, you can keep the benchmark, but preferably as a file, because commented pieces of code tend to stay commented forever.

Good job with all these commits.