milvus-io / milvus-sdk-java

Java SDK for Milvus.
https://milvus.io
Apache License 2.0
383 stars 154 forks source link

High CPU Usage During Insertions in milvus-sdk-java #1061

Open LeePui opened 1 week ago

LeePui commented 1 week ago

Hi, everyone, thank Hi everyone,

Thank you for providing such a convenient Java SDK; it has been very useful.

While using version 2.4.3 of the milvus-sdk-java, I have encountered some performance issues. Here are some metrics and analysis that I have gathered.

When performing insertions in a single thread, I noticed unusually high CPU usage. After profiling with async-profiler, I pinpointed the most time-consuming operation at this line: AbstractMilvusGrpcClient.java#L1569.

public R<MutationResult> insert(@NonNull InsertParam requestParam) {
        ......
        logDebug(requestParam.toString());
        ......
}

protected void logDebug(String msg, Object... params) {
    if (logLevel.ordinal() <= LogLevel.Debug.ordinal()) {
        logger.debug(msg, params);
    }
}

The attached flame graph can attest to this issue. image

The high CPU usage seems to be caused by premature calls to toString. In practice, when I set the log level to INFO, there is no need for the toString method to be called. I suggest checking the log level before calling toString.

Thank you for considering this improvement.

yhmo commented 1 week ago

Thanks for pointing out this problem. I didn't realize it was a problem before. The InsertParam.toString() is implemented by lombok annotation @ToString, which parses all the vectors to a long text like "[1.1234, 2.2234, ....]". It becomes a bottleneck when the inserted batch is large.

For the requests that could pass large/complicated data, we should manually customize the toString() method. For insertParam, we just want to print out the target collection name, the number of vectors, no need to print out all the vectors. I will make a change for this, it will take effect in the next minor version.

xiaofan-luan commented 1 week ago

good catch!

yhmo commented 5 days ago

Fixed by this pr: https://github.com/milvus-io/milvus-sdk-java/pull/1064