tensorflow / serving

A flexible, high-performance serving system for machine learning models
https://www.tensorflow.org/serving
Apache License 2.0

Display full message in GRPC exception log #2164

Open Prabha-Veerubhotla opened 1 year ago

Prabha-Veerubhotla commented 1 year ago

Feature Request

If this is a feature request, please fill out the following form in full:

Describe the problem the feature is intended to solve

While using TensorFlow Serving, the exception message in the logs is truncated. Example:

Caused by: io.grpc.StatusRuntimeException: INVALID_ARGUMENT: xxxx...TRUNCATED

Describe the solution

Error messages should be logged in full, without truncation.

Additional context

The TensorFlow Serving client-side setting for max inbound message size, .maxInboundMessageSize(), is set to int32 max to match the server-side configuration: https://github.com/tensorflow/serving/blob/6a9d0fdafe326605cad1cae60dea0dd165bd2bb4/tensorflow_serving/model_servers/server.cc#L385
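
For reference, a minimal C++ analogue of that client-side setting; the call above is grpc-java's .maxInboundMessageSize(), and this sketch uses gRPC C++'s equivalent channel argument, with the endpoint address assumed:

```cpp
#include <cstdint>
#include <limits>

#include <grpcpp/grpcpp.h>

int main() {
  // Mirror the Java client's .maxInboundMessageSize(Integer.MAX_VALUE) by
  // raising the C++ client's receive limit to int32 max.
  grpc::ChannelArguments args;
  args.SetMaxReceiveMessageSize(std::numeric_limits<int32_t>::max());

  // "localhost:8500" is an assumed TensorFlow Serving gRPC endpoint.
  auto channel = grpc::CreateCustomChannel(
      "localhost:8500", grpc::InsecureChannelCredentials(), args);
  return 0;
}
```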

Per the stack trace, the exception appears to originate here: https://github.com/grpc/grpc-java/blob/master/stub/src/main/java/io/grpc/stub/ClientCalls.java#L275

System information

Source code / logs

Prediction in Tensorflow serving: https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/prediction_service.proto#L23
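
For context, a sketch of how a C++ client would invoke that Predict RPC through the generated stub; the model name is an assumption, and status.error_message() is where the truncated INVALID_ARGUMENT text from the report above surfaces:

```cpp
#include <iostream>
#include <memory>

#include <grpcpp/grpcpp.h>
#include "tensorflow_serving/apis/prediction_service.grpc.pb.h"

// Issue a Predict call and print any resulting gRPC error. The channel
// would come from grpc::CreateCustomChannel as sketched earlier.
void CallPredict(std::shared_ptr<grpc::Channel> channel) {
  auto stub = tensorflow::serving::PredictionService::NewStub(channel);

  tensorflow::serving::PredictRequest request;
  request.mutable_model_spec()->set_name("my_model");  // assumed model name
  tensorflow::serving::PredictResponse response;

  grpc::ClientContext context;
  grpc::Status status = stub->Predict(&context, request, &response);
  if (!status.ok()) {
    // This string is what arrives truncated (e.g. "xxxx...TRUNCATED").
    std::cerr << status.error_code() << ": " << status.error_message() << "\n";
  }
}
```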

singhniraj08 commented 1 year ago

@Prabha-Veerubhotla,

The TF Serving binary has C++ dependencies, and C++ logging has a hard-coded 15K limit on log record size, which causes this truncation of log messages. A typical workaround is to output every line as a separate record, but that approach can rapidly increase memory usage and crash the server with a memory-exceeded error. Please let us know if this is completely blocking you; we can try looking for alternatives. Thank you!
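
A minimal sketch of that line-per-record workaround, assuming TensorFlow's standard LOG macros; the function name is hypothetical:

```cpp
#include <sstream>
#include <string>

#include "tensorflow/core/platform/logging.h"

// Hypothetical workaround: emit each line of a long message as its own log
// record so no single record hits the hard-coded size limit. Note the
// caveat above: splitting and buffering very large messages costs memory.
void LogLongMessage(const std::string& message) {
  std::istringstream stream(message);
  std::string line;
  while (std::getline(stream, line)) {
    LOG(ERROR) << line;
  }
}
```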

Prabha-Veerubhotla commented 1 year ago

Thank you for looking into this, @singhniraj08. I found that this is the actual cause of the log truncation: https://github.com/tensorflow/serving/blob/701e5c049f71aadb48463908b86be51f0171c0dd/tensorflow_serving/model_servers/grpc_status_util.cc#L29

Is there a command-line parameter to set a custom error message limit instead of the default 1024?
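
For illustration, a sketch of the kind of truncation the question above refers to, assuming a fixed 1024-byte default; the identifier names are hypothetical, not the actual ones in grpc_status_util.cc:

```cpp
#include <cstddef>
#include <string>

// Assumed default limit, per the 1024 mentioned above; these names are
// illustrative, not the real identifiers in grpc_status_util.cc.
constexpr std::size_t kMaxErrorMessageSize = 1024;

// Cut over-long messages down to the limit and mark them, matching the
// "...TRUNCATED" suffix seen in the client-side log above.
std::string MaybeTruncateErrorMessage(const std::string& message) {
  if (message.size() <= kMaxErrorMessageSize) return message;
  return message.substr(0, kMaxErrorMessageSize) + "...TRUNCATED";
}
```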

singhniraj08 commented 1 year ago

@Prabha-Veerubhotla,

Going through the complete list of available command-line flags, I couldn't find any flag or parameter to set a custom error message limit. Thanks.

Prabha-Veerubhotla commented 1 year ago

@singhniraj08, how about adding a new command-line argument to set a custom error message limit here (see the sketch below)?

I am currently blocked by this: some features are failing in TF Serving, and I cannot see the complete error message.
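
A minimal sketch of what such a flag could look like, following the tensorflow::Flag pattern used by model_servers/main.cc; the flag name and its plumbing into grpc_status_util.cc are assumptions, not an existing option:

```cpp
#include <cstdint>
#include <vector>

#include "tensorflow/core/util/command_line_flags.h"

// Hypothetical flag wiring, in the style of model_servers/main.cc.
// "grpc_max_error_message_size" is an assumed name, not a real flag.
int32_t max_error_message_size = 1024;  // current hard-coded default
std::vector<tensorflow::Flag> flag_list = {
    tensorflow::Flag("grpc_max_error_message_size", &max_error_message_size,
                     "Maximum error message length attached to a gRPC "
                     "status before truncation."),
};
```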

singhniraj08 commented 1 year ago

@Prabha-Veerubhotla,

Let us discuss this feature implementation internally, and we will update this thread. Thanks.

Prabha-Veerubhotla commented 1 year ago

Thank you, @singhniraj08.

Prabha-Veerubhotla commented 1 year ago

Hi @bmzhao, is there an update on this?

Prabha-Veerubhotla commented 1 year ago

@singhniraj08, were you able to follow up on this?

Prabha-Veerubhotla commented 1 year ago

Happy to contribute a PR if there is agreement. We want this change to be part of the 2.11 release.

Prabha-Veerubhotla commented 1 year ago

@singhniraj08, @bmzhao, any update on this issue?

ndeepesh commented 12 months ago

@singhniraj08 @bmzhao Is there an update to this issue?

asamadiya commented 11 months ago

@ndeepesh @Prabha-Veerubhotla @singhniraj08 @bmzhao Can't we just do this? https://github.com/tensorflow/serving/pull/2185

Prabha-Veerubhotla commented 11 months ago

@asamadiya, this should work. I am not sure whether there is any concern about printing a large message. The original code was part of an internal change from the TensorFlow team: https://github.com/tensorflow/serving/commit/c49fd96a8797baaee78c1771a7f48e8267c85ede