citusdata / citus

Distributed PostgreSQL as an extension
https://www.citusdata.com
GNU Affero General Public License v3.0
10.35k stars, 656 forks

Azure Cosmos DB for Postgres #7357

Open double-dubs opened 9 months ago

double-dubs commented 9 months ago

Is there a technical reason you can only create worker nodes with the same specs, or is that just how it was decided to sell it on Azure?

I can think of use cases where you would want worker nodes of different sizes. For example, you have a few large clients that need more processing, but you don't need all your nodes sized that way. You could create a larger node, isolate that client to that node, and leave your other nodes alone.

Another use case is a seasonal client that only has really heavy usage once a year. Again, a large node could be created and that client moved over to it. When they are done, they get moved back to the shared node pool and the dedicated node is removed from the cluster. Not being able to easily remove worker nodes also seems to take away from the flexibility.
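For readers who want to picture the workflow described above, here is a rough sketch of how it might look on a self-managed Citus cluster, using Citus's node and shard management functions (`citus_add_node`, `isolate_tenant_to_new_shard`, `citus_move_shard_placement`, `citus_drain_node`, `citus_remove_node`). It only builds the SQL text and does not connect to anything. The table name, tenant id, host names, ports, and shard id are hypothetical placeholders, and whether the managed service exposes these functions to users is not something this thread confirms.

```python
# Sketch only: builds the SQL a self-managed Citus coordinator might run to give
# one tenant a dedicated worker and later return it to the shared pool.
# All table names, tenant ids, hosts, ports and shard ids are hypothetical.

def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def isolate_tenant_statements(table: str, tenant_id: int,
                              new_host: str, new_port: int) -> list[str]:
    """Statements to register a new worker and split one tenant into its own shard."""
    return [
        # Register the new (possibly larger) worker with the coordinator.
        f"SELECT citus_add_node({sql_literal(new_host)}, {new_port});",
        # Split the tenant's rows into a dedicated shard; the call returns the
        # new shard id, which is needed for the move step below.
        f"SELECT isolate_tenant_to_new_shard({sql_literal(table)}, {tenant_id}, 'CASCADE');",
    ]


def move_shard_statement(shard_id: int, src_host: str, src_port: int,
                         dst_host: str, dst_port: int) -> str:
    """Statement to move one shard placement from a source worker to a target worker."""
    return (
        f"SELECT citus_move_shard_placement({shard_id}, "
        f"{sql_literal(src_host)}, {src_port}, "
        f"{sql_literal(dst_host)}, {dst_port});"
    )


def retire_node_statements(host: str, port: int) -> list[str]:
    """Statements to move all shards off a worker and then remove it from the cluster."""
    return [
        f"SELECT citus_drain_node({sql_literal(host)}, {port});",
        f"SELECT citus_remove_node({sql_literal(host)}, {port});",
    ]
```

In the seasonal case, the statements from `isolate_tenant_statements` and `move_shard_statement` would run before the busy period, and `retire_node_statements` afterwards, once the tenant's shard has been moved back to a shared worker.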

Overall, the way Citus works in Azure seems a bit rigid given that the goal of the cloud is to be flexible and dynamically scalable. It appears built for workloads that are very predictable and static. Just wondering if there are any plans to make this more flexible in the future.

thanodnl commented 9 months ago

These issues are not monitored by the product team responsible for our managed services. It might be easier to open a ticket in the portal for better visibility to the Product team.

For now I have asked product to have a look at this issue so you might get an answer here.

niklarin commented 9 months ago

(thank you @thanodnl!)

@double-dubs thank you for the feedback! There are a few things I see here:

Worker node removal as a self-service option is something we would like to add to the product relatively soon. The priority for this capability is driven by how often our current customers need it in real-life scenarios. So far, removal needs are quite infrequent.

Enabling full lifecycle management of worker nodes of different sizes in clusters of different sizes is one of the advanced scenarios. It falls more in the 'cost saving' area. To better understand the priority for this one, we're monitoring patterns on the existing clusters as well as engaging with potential customers to discuss where their priorities are. At this point this capability looks like a rather long-term addition, but sometimes things change fast.

We've added some cost-saving capabilities to Azure Cosmos DB for PostgreSQL in the last couple of years, starting with the most demanded ones. The ability to stop and start cluster compute (you don't pay for compute while it is stopped) and the introduction of burstable compute for single nodes are examples of such capabilities. All workloads have their own requirements, but there are some patterns, such as a cyclic need for increased cluster compute. So far such repeated cycles are quite long: daily or even weekly. We've seen some customers downscale (or stop) compute at the end of the day and upscale it at the beginning of the day, all fully automated and routinely done on a daily basis.
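As an illustration of the daily stop/start pattern mentioned above, the decision logic behind such automation could be sketched as follows. This is a minimal sketch under stated assumptions: the business-hours window, the state names, and both function names are hypothetical, and the code deliberately does not call any Azure API. The actual stop or start would be issued through whatever tooling the cluster owner already uses.

```python
from datetime import datetime, time

# Hypothetical business hours; each workload would pick its own window.
WORK_START = time(8, 0)
WORK_END = time(18, 0)


def desired_compute_state(now: datetime) -> str:
    """Return 'running' during weekday business hours, otherwise 'stopped'."""
    is_weekday = now.weekday() < 5  # Monday=0 .. Friday=4
    in_hours = WORK_START <= now.time() < WORK_END
    return "running" if is_weekday and in_hours else "stopped"


def plan_action(current_state: str, now: datetime) -> str:
    """Compare current and desired state and return 'start', 'stop', or 'none'."""
    desired = desired_compute_state(now)
    if desired == current_state:
        return "none"
    return "start" if desired == "running" else "stop"
```

A scheduled job could call `plan_action` every few minutes with the cluster's current state and act only when the result is not `"none"`, which keeps the job idempotent.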

We're certainly interested in hearing opinions, so thank you very much once again for your feedback and for sharing your thinking with us!

As for asking these questions about Citus on Azure (Azure Cosmos DB for PostgreSQL), Nils is right: our team may miss feedback on our product here. The best way to communicate with us about future product development, or to ask questions about current capabilities, is to reach out to the Ask Azure Cosmos DB for PostgreSQL e-mail alias, AskCosmosDB4Postgres@microsoft.com.