googleapis / sdk-platform-java

Tooling and shared libraries for Cloud SDK for Java
https://cloud.google.com/java/docs/bom
Apache License 2.0
64 stars 51 forks source link

GAX-JAVA: Allow users to customize ChannelPoolSettings RESIZE_INTERVAL and MAX_RESIZE_DELTA #2746

Open lkleeven opened 4 months ago

lkleeven commented 4 months ago

We are creating our own in company Google Pub/Sub client. To improve performance we've recently switched to using the InstantiatingGrpcChannelProvider. This already gave a performance improvement, however to be able to deal with big spikes of load, we'd like to be able to also set the resize interval and max resize delta, currently present as package private static fields in ChannelPoolSettings. This would allow us to set clients up to respond faster to big spikes in load.

We've considered keeping a higher minimum channel count open, but that feels rather wasteful, especially if the spikes are few and far between.

We are willing to contribute if that helps to speed things up. And are also open to suggestions on how to otherwise deal with these load spikes from a client perspective.

blakeli0 commented 3 months ago

Hi @lkleeven, thanks for reporting this! I think your request is reasonable, but we may not expose these two settings at this moment. Because

  1. Channel resizing is expensive and could be error prone if misconfigured. The settings you mentioned were not exposed on purpose because we want the channel resizing process to have minimum impact on the normal operations, as vast of majority of clients are good with just 1 channel. If RESIZE_INTERVAL is exposed and it is set to something like 5 seconds, it may actually hurt the performance.
  2. Channel pool was designed for moderate spikes(~2x), not big spikes(10x). If there is a big spike, I would first consider optimizing in other ways like adding more machines. If you are stuck with limited machines and the big spike is predictable, one possible way you can do in code is to create another client with a different number of channels.

That being said, if you still think this feature is of great value, please create an issue in the support console instead of filing on GitHub. This will ensure a timely response and proper prioritization.

lkleeven commented 3 months ago

Hi @blakeli0, thanks for the reply. I understand that these are settings that can have a big impact and users should be careful using it. But I'd think that could be addressed by adding the appropriate warnings in the documentation/Javadoc.

Now you mention that it's not meant to be used for to catch bigger spikes and those should be handled with more machines or adding another client. But if we add another client, won't we be also creating those channels which are expensive? I ask because scaling up with more machines isn't always as easy for our users and we've heard that they usually still have enough power on their machines to add more clients/channels.

blakeli0 commented 3 months ago

if we add another client, won't we be also creating those channels which are expensive?

Yes creating channels are expensive, and it is unavoidable either we resize channels in the same client or create a new client. But the process that checks for resizing is not a cheap operation either, which may hurt the performance if configured too frequent. Also the additional client can be closed after the spike is done, which should help free up the resources.