Closed madogar closed 3 weeks ago
Welcome @madogar! It looks like this is your first PR to milvus-io/milvus-sdk-java 🎉
@madogar We don't intend to expose direct interceptor input to users.
Java SDK follows the python sdk logic. In python sdk, the interceptor is used to pass db_name and authorization. https://github.com/milvus-io/pymilvus/blob/3198bd2614446cd3b4266751394d6d16dcb6a8ef/pymilvus/client/grpc_handler.py#L236 https://github.com/milvus-io/pymilvus/blob/3198bd2614446cd3b4266751394d6d16dcb6a8ef/pymilvus/client/grpc_handler.py#L239
The python sdk doesn't provide direct interceptor for users, either. We only expose parameters that we can support.
@yhmo My bad, forgot to set the context. Here is the server side change to set client_request_id in logs: https://github.com/milvus-io/milvus/pull/32264
As of today there is no way to set the param from java SDK, hence this change where would implement an interceptor like below and set it at client level: `public class TraceIdClientInterceptor implements ClientInterceptor { @Override public <ReqT, RespT> ClientCall<ReqT, RespT> interceptCall(MethodDescriptor<ReqT, RespT> method, CallOptions callOptions, Channel next) {
return new ForwardingClientCall
.SimpleForwardingClientCall<ReqT, RespT>(next.newCall(method, callOptions)) {
@Override
public void start(ClientCall.Listener<RespT> responseListener, Metadata headers) {
headers.put(Metadata.Key.of("client_request_id", Metadata.ASCII_STRING_MARSHALLER), "999-" + UUID.randomUUID().toString());
super.start(responseListener, headers);
}
};
}
}`
The biggest challenge is setting this (unique value)param for every gRPC call.
@madogar This pr is to set a client ID for the connection, and pass the client ID by ClientInterceptor. In my opinion, I mean no need to expose ClientInterceptor to users. Just like the database name:
if (StringUtils.isNotEmpty(connectParam.getDatabaseName())) {
metadata.put(Metadata.Key.of("dbname", Metadata.ASCII_STRING_MARSHALLER), connectParam.getDatabaseName());
}
List<ClientInterceptor> clientInterceptors = new ArrayList<>();
clientInterceptors.add(MetadataUtils.newAttachHeadersInterceptor(metadata));
We can add a method to the ConnectParam like this:
if (StringUtils.isNotEmpty(connectParam.getClientID())) {
metadata.put(Metadata.Key.of("client_request_id", Metadata.ASCII_STRING_MARSHALLER), connectParam.getClientID());
}
List<ClientInterceptor> clientInterceptors = new ArrayList<>();
clientInterceptors.add(MetadataUtils.newAttachHeadersInterceptor(metadata));
The ConnectParam.withClientInterceptors() directly exposes the grpc class ClientInterceptor to users, which is not a good design. It might be out of control if users pass ClientInterceptor freely to the server.
Just let the user input an ID by ConnectParam.withClientID(String cliecntID)
, and internally encapsulate the ID with a ClientInterceptor.
@yhmo My thought, consider this scenario:
If we don't do this and follow the approach to explicitly pass it, multiple services has to intercept it and pass it explicitly. I think there could be scenarios where client wants to pass it explicitly but in most of enterprise scenarios there is common layer which takes care of intercepting transactionId to keep it implicit.
cc: @xiaofan-luan
@prrs I just discussed with @xiaofan-luan about this pr. He agrees with me that it is not a good design to expose the grpc.ClientInterceptor to sdk users. The input of the grpc interceptor should be controlled.
@yhmo @xiaofan-luan We need unique transaction ID per API call, so it can't be at connection level. I am dropping a suggestion. Please help to conclude this or suggest a solution which can solve the problem in hand. Thanks.
My thought:
Allowing the Client SDK to Intercept and Use Transaction ID Pros: Ease of Use: Clients do not have to manage transaction IDs, making the SDK simpler and more user-friendly. Consistency: The SDK can ensure that transaction IDs are generated and used consistently across all requests. Reduced Errors: There is less chance of errors related to transaction ID management by the client. Cons: Lack of Control: Clients have less control over transaction IDs, which might be necessary for advanced use cases. Transparency: It might be less transparent to clients how transaction IDs are being managed.
Making it Explicit for Client to Pass Transaction ID Pros: Control: Clients have full control over the transaction IDs, which can be beneficial for custom transaction management and tracking. Transparency: Clients are fully aware of how transaction IDs are being used and managed. Flexibility: Clients can implement their own logic for generating and managing transaction IDs, which might be necessary for complex applications. Cons: Complexity: This approach adds complexity for clients, who must manage transaction IDs themselves. Potential for Errors: Clients might make mistakes in generating or using transaction IDs, leading to potential issues. Recommendations Hybrid Approach: One way to balance these pros and cons is to implement a hybrid approach:
Default Behavior: By default, the SDK can generate and manage transaction IDs automatically. This makes it easy for most users who do not need to manage transaction IDs themselves. Advanced Option: Provide an option for clients to explicitly pass their own transaction IDs if they need to. This can be done through additional parameters or configuration options.
Does the transaction ID generator rely on the interceptCall() of ClientInterceptor? I mean: is the transaction ID generated inside the interceptCall()?
Does the transaction ID generator rely on the interceptCall() of ClientInterceptor? I mean: is the transaction ID generated inside the interceptCall()?
@yhmo no the transaction id(trace id) wouldn't be generated inside the interceptCall(). Eventually in our JAVA app we would create a thread local variable whose value would be used in the interceptCall() for every grpc call. With this logic we would make sure every thread making a grpc call would set its own transaction id.
PS: In the above example in the thread I might have hardcoded the value in the interceptCall() for testing reason.
Could you show me an example of how your app achieves "every thread making a grpc call would set its own transaction id" by the interceptCall()?
Could you show me an example of how your app achieves "every thread making a grpc call would set its own transaction id" by the interceptCall()?
Here is the code where we set the traceId/transactionId in thread local in the Application class:
public static final ThreadLocal<String> traceIdContext = new ThreadLocal<>();
public void setTraceIdContext(String traceId) {
traceIdContext.set(traceId);
}
Here is how we would use it in intercept call:
public <ReqT, RespT> ClientCall<ReqT, RespT> interceptCall(MethodDescriptor<ReqT, RespT> method, CallOptions callOptions, Channel next) {
return new ForwardingClientCall
.SimpleForwardingClientCall<ReqT, RespT>(next.newCall(method, callOptions)) {
@Override
public void start(ClientCall.Listener<RespT> responseListener, Metadata headers) {
headers.put(Metadata.Key.of("client_request_id", Metadata.ASCII_STRING_MARSHALLER), Application.traceIdContext.get());
super.start(responseListener, headers);
}
};
}
Ok, I see. traceid is not a const value, it is unique for each thread. Concurrent threads use the same MilvusClient to call rpc interfaces, different threads have different traceid, right?
Ok, I see. traceid is not a const value, it is unique for each thread. Concurrent threads use the same MilvusClient to call rpc interfaces, different threads have different traceid, right?
yes! so, are you now aligned with client interceptor logic?
I didn't find a workaround to avoid exposing grpc interceptor, it seems your pr is the only solution to handle the use case.
@xiaofan-luan I think I can accept this pr. What do you think?
@madogar We still insist that the ClientInteceptor should be hidden inside the ConnectParam. This is our proposal: Hide the constructor of ClientInterceptor inside the ConnectParam, users provide a ThreadLocal object to pass the trace ID. The meta key "client_request_id" should be consistent with the server side, users no need to know this meta key.
public class ConnectParam {
......
private final List<ClientInterceptor> clientInterceptors;
protected ConnectParam(@NonNull Builder builder) {
......
this.clientInterceptors = builder.clientInterceptors;
}
public static Builder newBuilder() {
private List<ClientInterceptor> clientInterceptors = new ArrayList<>();
public Builder withTraceID(@NonNull ThreadLocal<String> traceIdContext) {
clientInterceptors.add(new ClientInterceptor() {
@Override
public <ReqT, RespT> ClientCall<ReqT, RespT> interceptCall(MethodDescriptor<ReqT, RespT> method, CallOptions callOptions, Channel next) {
return new ForwardingClientCall.SimpleForwardingClientCall<ReqT, RespT>(next.newCall(method, callOptions)) {
@Override
public void start(ClientCall.Listener<RespT> responseListener, Metadata headers) {
headers.put(Metadata.Key.of("client_request_id", Metadata.ASCII_STRING_MARSHALLER), traceIdContext.get());
super.start(responseListener, headers);
}
};
}
});
return this;
}
}
}
@yhmo @xiaofan-luan I have raised a new PR with the changes we aligned, please review: https://github.com/milvus-io/milvus-sdk-java/pull/955
close this thread and switch to #955
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: madogar To complete the pull request process, please assign yhmo after the PR has been reviewed. You can assign the PR to them by writing
/assign @yhmo
in a comment when ready.The full list of commands accepted by this bot can be found here.