Azure / azure-sdk-for-go

This repository is for active development of the Azure SDK for Go. For consumers of the SDK we recommend visiting our public developer docs at:
https://docs.microsoft.com/azure/developer/go/
MIT License
1.64k stars 842 forks source link

Azure Comsos DB. Stored Procedures support #22409

Open vaskravchuk opened 8 months ago

vaskravchuk commented 8 months ago

Feature Request

I see that Stored Procedures are not supported, could someone elaborate, are this going to be implemented? There is already same request but still not sure if it's going to be implemented maybe SDK has some other example/approach(except batches) to solve this?

ealsur commented 8 months ago

Hi. Could you describe what's the use case for Stored procedures that cannot be fulfilled with TransactionalBatch?

vaskravchuk commented 8 months ago

Hi. Could you describe what's the use case for Stored procedures that cannot be fulfilled with TransactionalBatch?

Delete documents based on sql condition. Or at least delete all in partition. Also there is possible stored procedures which just replace in-code logic, like documents tree building , to avoid several calls to cosmos

ealsur commented 8 months ago

Delete documents based on sql condition

Would this simply be a query and a TransactionalBatch with delete operations?

delete all in partition.

This is something that even a Stored Procedure might not be the right fit. Stored procedures have a limited run time, if the volume of documents you are deleting is too large, the execution will be aborted. Why is a Stored Procedure required for this?

Also there is possible stored procedures which just replace in-code logic, like documents tree building , to avoid several calls to cosmos

Stored procedures have several drawbacks, including versioning and update flows. The idea is to move the logic outside of Sprocs and into the application layer, which can be versioned, rolled-back or patched, it a way easier way. TransactionalBatch supports Patch and Replace operations.

vaskravchuk commented 8 months ago

Delete documents based on sql condition

Would this simply be a query and a TransactionalBatch with delete operations?

Yes, it's true, can be moved to app level, but it leads to increasing amount of calls to cosmos. It's maybe not so big deal of course... specially when app in the same datacenter. But if it's supported by API why not to use it?

Stored procedures have several drawbacks, including versioning and update flows. The idea is to move the logic outside of Sprocs and into the application layer, which can be versioned, rolled-back or patched, it a way easier way. TransactionalBatch supports Patch and Replace operations.

Does it means that stored procedures is deprecated in cosmosDB? We already have some logic there which works good in .net app, and it's easier just to reuse it than migrate

ealsur commented 8 months ago

Yes, it's true, can be moved to app level, but it leads to increasing amount of calls to cosmos.

The complexity to handle continuations from a Stored procedure (you need to execute the query, then consume the pages, then pass the query continuation to the caller because the total execution time might hit the Sproc limit) is arguably more complex than implementing this as a query + TransactionalBatch. TransactionalBatch is a single request. I am not sure there is a great difference in volume of requests.

But if it's supported by API why not to use it?

The benefits of TransactionalBatch vs Sprocs are several and the guidance for new applications would be to use TransactionalBatch: https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/transactional-batch?tabs=dotnet#transactional-batch-operations-and-stored-procedures, unless there is a specific unsupported case.

Does it means that stored procedures is deprecated in cosmosDB?

@jcocchi are you aware of any deprecation related to Sprocs?

vaskravchuk commented 8 months ago

Yes, it's true, can be moved to app level, but it leads to increasing amount of calls to cosmos.

The complexity to handle continuations from a Stored procedure (you need to execute the query, then consume the pages, then pass the query continuation to the caller because the total execution time might hit the Sproc limit) is arguably more complex than implementing this as a query + TransactionalBatch. TransactionalBatch is a single request. I am not sure there is a great difference in volume of requests.

Firs of all, thanks a lot for your answers. We can replace simple logic with TransactionalBatch, it can be easy.

But still, question is open... Is it in roadmap to add support of Stored Procedures to golang SDK in near future?

ealsur commented 8 months ago

Not in the short term but we can use this issue as a feature request and the team can evaluate it.

The main question here is what is the gap in terms of the scenario that cannot be done with TransactionalBatch. According to the documentation, TB is even more performant. If you can help shed some light on the gap, I guess that would help the team prioritize.

vaskravchuk commented 8 months ago

Not in the short term but we can use this issue as a feature request and the team can evaluate it.

The main question here is what is the gap in terms of the scenario that cannot be done with TransactionalBatch. According to the documentation, TB is even more performant. If you can help shed some light on the gap, I guess that would help the team prioritize.

I see that it can be implement partially without sproc, but it's more like a question of migration now. We already have some code based on sproc , and if we choose current sdk to use and sproc based logic will be needed to migrate to golang and can bring some additional work need to take this to account

ealsur commented 8 months ago

When you mention migration, you mean migrate to use azcosmos? Where are you migrating from? Is this another golang Cosmos DB library that does have Stored Procedures support?

jcocchi commented 8 months ago

Or at least delete all in partition.

This is supported with the delete items by partition key Preview feature and doesn't need to be done via stored procedure. The Go SDK doesn't support this API yet, but you can add your feedback to support it on the existing issue here.

As for the stored procedures roadmap, we don't have plans to deprecate stored procedures in Cosmos DB. However, adding stored procedure APIs to Go isn't currently a high priority. Please consider @ealsur's tips for using transactional batch instead, and we will consider adding support for stored procedures directly in the future. Thanks!

vaskravchuk commented 8 months ago

When you mention migration, you mean migrate to use azcosmos? Where are you migrating from? Is this another golang Cosmos DB library that does have Stored Procedures support?

there is one which supports go-cosmosdb

vaskravchuk commented 8 months ago

Or at least delete all in partition.

This is supported with the delete items by partition key Preview feature and doesn't need to be done via stored procedure. The Go SDK doesn't support this API yet, but you can add your feedback to support it on the existing issue here.

As for the stored procedures roadmap, we don't have plans to deprecate stored procedures in Cosmos DB. However, adding stored procedure APIs to Go isn't currently a high priority. Please consider @ealsur's tips for using transactional batch instead, and we will consider adding support for stored procedures directly in the future. Thanks!

ok, thanks a lot for the information

ealsur commented 8 months ago

Added this to the backlog as a feature request

zarenner commented 8 months ago

Another potential use case for stored procedures - NoSQL API still has limitations on GROUP BY functionality (e.g. lack of ORDER BY support, limited aggregation functions, etc). While sprocs obviously have their own limitations, AFAIK they're the only workaround to perform these type of queries.

ealsur commented 8 months ago

Stored Procedures only execute within a Partition Key value. Which would be the same as executing the Group By + Order By query within the same Partition Key. The query engine used in both cases is the same, because its a single Partition query, there is no client-side aggregation involved, so both have the same limitations. How are they different in terms of support? Or do you mean that in the Stored procedure you can apply ordering of the results? But in that case, so can you in the application code after executing the query?

zarenner commented 8 months ago

Or do you mean that in the Stored procedure you can apply ordering of the results? But in that case, so can you in the application code after executing the query?

Yeah, I think the difference in that case is that you can't then apply post-ordering paging (OFFSET LIMIT or equivalent) without egressing the entire result set to the application. So if your goal is effectively GROUP BY ... ORDER BY ... LIMIT 25 a sproc lets you at least keep everything in the Cosmos engine before returning 25 results to the application, instead of returning e.g. 10000 results and then sorting and paging.

Another case would be where Cosmos doesn't support a desired aggregate function (e.g. STRING_AGG, distinct count, etc) and you want to "manually" perform the equivalent of a GROUP BY (in js) again without a large original result set leaving the engine. Basically what https://github.com/lmaccherone/documentdb-lumenize did.

lthela2 commented 7 months ago

Could you please provide an update on the timeline for the feature? I'm currently in the process of migrating to azcosmos, but the lack of support for stored procedures is preventing me from fully migrating to azcosmos.

We are currently using ExecuteStoredProcedure to delete resources by partition Key. Although Transactional Batch delete is an alternative, it's limited to 100 operations per batch, potentially causing partial deletes on failure. This could impact the behavior of our RP.

Quoting the limitations from Transaction Batch delete documentation:

The Azure Cosmos DB request size limit constrains the size of the TransactionalBatch payload to not exceed 2 MB, and the maximum execution time is 5 seconds.

There's a current limit of 100 operations per TransactionalBatch to ensure the performance is as expected and within SLAs.

Previously, the Transactional Batch delete documentation mentioned a limit of 100 operations per batch. However, recent updates to the documentation don't include this limit. It's unclear whether the limit still applies.

Prioritizing this feature will greatly help us with our azcosmos migration.

jcocchi commented 7 months ago

@lthela2 - the limit of 100 items per transactional batch still applies and is documented in the per request limits section of the service quotas page for consistency with the rest of the service limits. Sorry for the confusion!

Unfortunately support for stored procedures isn't on our near-term roadmap for the Go SDK. We may re-evaluate priority for this feature in the next several months.