devsisters / shardcake

Sharding and location transparency for Scala
https://devsisters.github.io/shardcake/
Apache License 2.0

Question about performance tuning #72

Closed BaekGeunYoung closed 1 year ago

BaekGeunYoung commented 1 year ago

Hello, I'm trying to migrate my project from akka-cluster to shardcake.

I finished migrating my code, and I'm comparing the overall performance of akka & shardcake, using K6.

I expected no big difference in performance between these two libraries, but the performance with akka seems to be about 4~5x better than with shardcake.

Is there any configuration that I should adjust further, or is this a structural limitation of shardcake?

K6 test result

shardcake version:

image

akka version:

image

Infrastructure & Configuration about shardcake

ghostdogpr commented 1 year ago

In our case, both in load tests and production, performance was pretty much the same between Akka and Shardcake. So it may come from other factors/changes but it's hard to tell without knowing the code. Out of curiosity, were you using zio with akka? I would try to use profiling or tracing to find where the bottleneck is.

BaekGeunYoung commented 1 year ago

Didn't you see any bottleneck in pod-to-pod communication over gRPC? I think remote actor communication would perform much better in akka, because it uses plain TCP.

ghostdogpr commented 1 year ago

gRPC is pretty fast, though not as fast as direct TCP. If you only measure the transport and your actors do nothing else, you might see a difference, but in our case of a real-world actor that does stuff, the difference was not significant. We had the exact same latency and throughput before and after.

Note that if transport-layer speed is critical for you, you can plug in your own transport by writing your own implementation of the Pods interface instead of using GrpcPods.

BaekGeunYoung commented 1 year ago

@ghostdogpr I've traced my RPC, and below is the result:

image

The span shardcake-actor-service-execute is created when calling Messenger.send, and finished after receiving the response from the entity. Its child spans are created after the entity receives the command from outside. As you can see, the durations of the children are trivial, and "something" is occupying the dominant portion of the shardcake-actor-service-execute span. I think this is time spent waiting in a queue or something like that. Do you have any idea what causes this latency, or how to reduce it?
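
To put a number on that "something", one option is to subtract the time covered by the child spans from the parent span's duration. The snippet below is only a sketch in plain Scala: the `Span` case class and `SpanGap.unaccountedMs` are hypothetical helpers written for illustration, not part of shardcake or of any tracing library, and the timestamps would have to come from your own trace export.

```scala
// Hypothetical span model for illustration; not a shardcake or tracing-library type.
final case class Span(name: String, startMs: Long, endMs: Long) {
  def durationMs: Long = endMs - startMs
}

object SpanGap {
  // Milliseconds of the parent span that are not covered by any child span.
  // Children are clipped to the parent's bounds and overlapping children are
  // counted only once.
  def unaccountedMs(parent: Span, children: Seq[Span]): Long = {
    val clipped = children
      .map(c => (math.max(c.startMs, parent.startMs), math.min(c.endMs, parent.endMs)))
      .filter { case (s, e) => e > s }
      .sortBy(_._1)

    // Sweep over the sorted intervals, tracking how far coverage already extends.
    val covered = clipped.foldLeft((0L, parent.startMs)) {
      case ((acc, cursor), (s, e)) =>
        val from = math.max(s, cursor)
        if (e > from) (acc + (e - from), e) else (acc, cursor)
    }._1

    parent.durationMs - covered
  }
}
```

With made-up numbers, a 100 ms parent whose children cover 60-70 ms, 65-80 ms and 90-95 ms has 25 ms covered and 75 ms unaccounted for. A large unaccounted share points at time spent outside the entity's own handling, such as transport or queueing.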

ghostdogpr commented 1 year ago

Hmm, that definitely looks wrong. Is this real production code? Is there any chance you can share it? I would look at a CPU profile (if CPU usage is high) and at the threads (see if threads are blocked, or if something blocking runs in the ZIO async threadpool).
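
For the thread check, a quick in-process snapshot can complement a full thread dump. The sketch below uses only the JDK's `Thread.getAllStackTraces`. `ThreadCheck.snapshot` is a hypothetical helper, and the thread-name prefix is an assumption: ZIO 2 worker threads are usually named with a `ZScheduler` prefix, but verify the actual names in your own dump before relying on it.

```scala
import scala.jdk.CollectionConverters._

object ThreadCheck {
  // For every live thread whose name starts with `prefix`, return
  // (thread name, current state, top stack frame).
  // The prefix is supplied by the caller; no particular pool name is assumed here.
  def snapshot(prefix: String): List[(String, Thread.State, String)] =
    Thread.getAllStackTraces.asScala.toList.collect {
      case (thread, stack) if thread.getName.startsWith(prefix) =>
        val topFrame = stack.headOption.map(_.toString).getOrElse("<no frames>")
        (thread.getName, thread.getState, topFrame)
    }.sortBy(_._1)
}
```

Calling `ThreadCheck.snapshot("ZScheduler")` a few times under load and printing the result shows what each worker is doing. Idle workers normally sit in a park call, so the interesting cases are workers whose top frame is in application code, a lock, `Thread.sleep`, or blocking I/O. Note that blocking socket reads often show up as `RUNNABLE`, so the top frame matters more than the state.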

BaekGeunYoung commented 1 year ago

I think https://github.com/devsisters/shardcake/pull/73 might have solved my issue. After applying 2.0.6+11-e9d97295-SNAPSHOT, the performance got much better! I'll close this issue. thanks!

ghostdogpr commented 1 year ago

Ohh that was it? I discovered recently with our own load test with zio 2 that zio-grpc had a severe performance issue, and that it was fixed in the latest snapshot. We're using zio 1 in prod so I wasn't aware of this at the time you sent the first message. Glad that it's resolved!