Azure / azure-sdk-for-java

This repository is for active development of the Azure SDK for Java. For consumers of the SDK we recommend visiting our public developer docs at https://docs.microsoft.com/java/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-java.

Increased execution time to upsert records using the spring-data-cosmos saveAll function after moving to version 5.8.0 of spring-data-cosmos. #39100

Open vasantteja opened 7 months ago

vasantteja commented 7 months ago

Query/Question
I updated my spring-data-cosmos dependency from version 5.3.0 to 5.8.0. I was excited about this because saveAll now uses the bulk API under the hood when writing more than one record, so I assumed saveAll would take less time than before. But the opposite happened: the bulk API took more time to update the records in Cosmos than the non-bulk path. I have two questions:

  1. Will I continue to see higher execution times if I use the bulk API going forward?
  2. If yes, is there a flag or setting I can use to disable the bulk API so that saveAll proceeds without it?

We are running our apps on VMs hosted on Azure.

I am attaching the runtimes below:

| Version | Bulk API | Records | First run | Second run |
|---------|----------|---------|-----------|------------|
| 5.3.0   | No       | 9       | 1251 ms   | 992 ms     |
| 5.8.0   | Yes      | 9       | 1473 ms   | 989 ms     |

Why is this not a Bug or a feature Request? I am trying to understand the behavior of the new version of the API. Right now it is neither impacting us nor does it require adding any new feature.

Setup (please complete the following information if applicable):

Netyyyy commented 7 months ago

@kushagraThapar please help take a look

kushagraThapar commented 7 months ago

@vasantteja thanks for raising this issue. Usually, bulk APIs should take less time and achieve higher throughput when writing to Cosmos DB; however, this also depends on the distribution of the data being inserted, the number of cores on the processor (on the Azure VM in your case), and other factors. You can read more about it here - https://learn.microsoft.com/en-us/azure/cosmos-db/bulk-executor-overview

That being said, in general it should be more efficient. @trande4884 - can you please take a look at this and see if there is any perf issue in our azure-spring-data-cosmos SDK?
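
For context, here is a minimal sketch of calling the azure-cosmos bulk API directly, which is roughly the path saveAll delegates to when writing multiple records. The endpoint/key environment variables, the database and container names, and the MyDocument type are illustrative placeholders, not taken from this issue:

```java
// Illustrative sketch only: upserting a batch of documents with the azure-cosmos bulk API.
// Endpoint/key env vars, "mydb"/"mycontainer", and MyDocument are placeholders.
import com.azure.cosmos.CosmosAsyncClient;
import com.azure.cosmos.CosmosAsyncContainer;
import com.azure.cosmos.CosmosClientBuilder;
import com.azure.cosmos.models.CosmosBulkOperations;
import com.azure.cosmos.models.CosmosItemOperation;
import com.azure.cosmos.models.PartitionKey;
import reactor.core.publisher.Flux;

import java.util.List;

public class BulkUpsertSketch {

    public static void main(String[] args) {
        CosmosAsyncClient client = new CosmosClientBuilder()
            .endpoint(System.getenv("COSMOS_ENDPOINT"))
            .key(System.getenv("COSMOS_KEY"))
            .buildAsyncClient();

        CosmosAsyncContainer container = client.getDatabase("mydb").getContainer("mycontainer");

        List<MyDocument> documents = List.of(); // e.g. the 9 records from the issue

        // One bulk upsert operation per record, keyed by its partition key ("id" here).
        Flux<CosmosItemOperation> operations = Flux.fromIterable(documents)
            .map(doc -> CosmosBulkOperations.getUpsertItemOperation(doc, new PartitionKey(doc.getId())));

        // The SDK groups the operations into batches per partition and executes them;
        // blockLast() is only used to keep this sketch synchronous.
        container.executeBulkOperations(operations).blockLast();

        client.close();
    }

    // Minimal placeholder entity; "id" doubles as the partition key for illustration.
    public static class MyDocument {
        private String id;
        public String getId() { return id; }
        public void setId(String id) { this.id = id; }
    }
}
```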

trande4884 commented 7 months ago

@vasantteja what tool are you using to pull those execution times?

vasantteja commented 7 months ago

Thanks @kushagraThapar for the inputs. @trande4884 I was using StopWatch from org.springframework.util.StopWatch to time this method. The pseudocode is as follows:

    StopWatch timed = new StopWatch();
    timed.start();
    repository.saveAll(object);
    timed.stop();

vasantteja commented 6 months ago

@trande4884 Hi Trevor! Is there anything I can do to improve the performance, or is my approach to measuring the time taken by this method wrong?

trande4884 commented 6 months ago

@vasantteja that framework only measures the runtime of the function, not the execution time of the saveAll() query itself. Some of the additional time is likely the setup of the bulk operation, but I have not had time to investigate yet; I'm hoping to get to this next week and compare actual execution times.

Our README has information on setting up query metrics, which would better track actual execution times and RUs. Here is more documentation: https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/query-metrics
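
For reference, a minimal sketch of the configuration pattern described in the azure-spring-data-cosmos README for enabling query metrics and plugging in a response diagnostics processor; the property names and the println logging are placeholder choices:

```java
// Illustrative sketch of the azure-spring-data-cosmos configuration pattern from the README:
// enable query metrics and use a ResponseDiagnosticsProcessor to see execution times and RUs.
// Property names ("azure.cosmos.*") and the println logging are placeholder choices.
import com.azure.cosmos.CosmosClientBuilder;
import com.azure.spring.data.cosmos.config.AbstractCosmosConfiguration;
import com.azure.spring.data.cosmos.config.CosmosConfig;
import com.azure.spring.data.cosmos.core.ResponseDiagnostics;
import com.azure.spring.data.cosmos.core.ResponseDiagnosticsProcessor;
import com.azure.spring.data.cosmos.repository.config.EnableCosmosRepositories;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableCosmosRepositories
public class CosmosDiagnosticsConfiguration extends AbstractCosmosConfiguration {

    @Value("${azure.cosmos.uri}")
    private String uri;

    @Value("${azure.cosmos.key}")
    private String key;

    @Value("${azure.cosmos.database}")
    private String databaseName;

    @Bean
    public CosmosClientBuilder getCosmosClientBuilder() {
        return new CosmosClientBuilder()
            .endpoint(uri)
            .key(key);
    }

    @Override
    public CosmosConfig cosmosConfig() {
        return CosmosConfig.builder()
            // Surfaces query metrics in the response diagnostics.
            .enableQueryMetrics(true)
            // Called for each operation; a real app would use a logger instead of println.
            .responseDiagnosticsProcessor(new ResponseDiagnosticsProcessor() {
                @Override
                public void processResponseDiagnostics(ResponseDiagnostics responseDiagnostics) {
                    System.out.println("Cosmos diagnostics: " + responseDiagnostics);
                }
            })
            .build();
    }

    @Override
    protected String getDatabaseName() {
        return databaseName;
    }
}
```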

vasantteja commented 6 months ago

@trande4884 Thanks for the update. I will take a look at the documentation.

vasantteja commented 6 months ago

@trande4884 @kushagraThapar I have a small question. Our saveAll query runs only once a week. In this case, can we persist the connection using a parameter? If the connection is persisted, I am assuming we will get faster insertion times, since we would do away with the setup of the bulk operation.

kushagraThapar commented 6 months ago

@vasantteja - you cannot persist the connection. Connections created by the Cosmos DB SDK are short-lived. Even if you extend the connection timeout to a higher value like a week, there is a high chance of that connection getting dropped because of machine movement on the backend service or network blips. Machines restart all the time because of system and security updates, so it won't work.

However, you can use a feature called proactive connection management. This allows you to create connections upfront to all your partitions and maintain healthy, active connections throughout the application lifecycle. If a connection gets dropped for some reason, the SDK will re-create it instantly.

You can leverage the CosmosContainerProactiveInitConfig class while creating the CosmosClient through Spring. This is the API in CosmosClientBuilder:

    /**
     * Sets the {@link CosmosContainerProactiveInitConfig} which enable warming up of caches and connections
     * associated with containers obtained from {@link CosmosContainerProactiveInitConfig#getCosmosContainerIdentities()} to replicas
     * obtained from the first <em>k</em> preferred regions where <em>k</em> evaluates to {@link CosmosContainerProactiveInitConfig#getProactiveConnectionRegionsCount()}.
     *
     * <p>
     *     Use the {@link CosmosContainerProactiveInitConfigBuilder} class to instantiate {@link CosmosContainerProactiveInitConfig} class
     * </p>
     * @param proactiveContainerInitConfig which encapsulates a list of container identities and no of
     *                                     proactive connection regions
     * @return current CosmosClientBuilder
     * */
    public CosmosClientBuilder openConnectionsAndInitCaches(CosmosContainerProactiveInitConfig proactiveContainerInitConfig) {
        this.proactiveContainerInitConfig = proactiveContainerInitConfig;
        return this;
    }
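
A minimal, hypothetical sketch of wiring this up when building the CosmosClientBuilder used by Spring; the database name, container name, and the region count of 1 are assumptions for illustration:

```java
// Illustrative sketch: warm up caches and connections for a container when building the
// CosmosClientBuilder used by Spring. "mydb", "mycontainer", and the region count of 1
// are placeholders for this example.
import com.azure.cosmos.CosmosClientBuilder;
import com.azure.cosmos.models.CosmosContainerIdentity;
import com.azure.cosmos.models.CosmosContainerProactiveInitConfig;
import com.azure.cosmos.models.CosmosContainerProactiveInitConfigBuilder;

import java.util.Collections;

public class ProactiveInitSketch {

    public CosmosClientBuilder cosmosClientBuilder(String endpoint, String key) {
        // The container(s) whose caches and connections should be opened up front.
        CosmosContainerIdentity containerIdentity =
            new CosmosContainerIdentity("mydb", "mycontainer");

        // Open connections to replicas in the first preferred region (k = 1).
        CosmosContainerProactiveInitConfig proactiveInitConfig =
            new CosmosContainerProactiveInitConfigBuilder(Collections.singletonList(containerIdentity))
                .setProactiveConnectionRegionsCount(1)
                .build();

        return new CosmosClientBuilder()
            .endpoint(endpoint)
            .key(key)
            .openConnectionsAndInitCaches(proactiveInitConfig);
    }
}
```

The intent is that connections to the listed container's partitions are established at client startup and kept healthy by the SDK, so the weekly saveAll does not pay the connection setup cost on its first call.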