Open hariom-qure opened 9 months ago
There is a parameter max_response_size. Could you please increase the value to see if it works for you?
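For reference, the default max_response_size is 6553500 bytes (~6.5 MB) per the TorchServe configuration docs, and a single [920, 920] fp32 tensor is already 920 × 920 × 4 ≈ 3.4 MB, so a batch of 2 (~6.8 MB) would exceed the default. That would explain why only batch size 1 goes through. A minimal sketch of raising both payload limits in config.properties (the 100 MB value is an arbitrary illustration, not a recommendation):

# config.properties -- values are in bytes
max_request_size=104857600
max_response_size=104857600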
@hariom-qure As @lxning suggested, increasing max_response_size can help in many cases. Sometimes, when I have had huge data to send back, in addition to increasing max_response_size I have used something like this in postprocess:
import base64
import json
import pickle
import zlib

import numpy as np

def postprocess(self, data):
    # Convert any numpy arrays in the response to tagged JSON strings
    class NumpyArrayEncoder(json.JSONEncoder):
        def default(self, obj):
            if isinstance(obj, np.ndarray):
                return "__nparray__" + json.dumps(obj.tolist())
            return json.JSONEncoder.default(self, obj)

    json_data = json.dumps({"data": data}, cls=NumpyArrayEncoder)
    # Compress the JSON string
    compressed_data = zlib.compress(json_data.encode("utf-8"))
    # Serialize the compressed bytes with pickle
    serialized_data = pickle.dumps(compressed_data)
    # Encode the serialized data as Base64 so it is safe to return as text
    base64_encoded_data = base64.b64encode(serialized_data).decode("utf-8")
    # TorchServe expects postprocess to return a list, one entry per request
    return [base64_encoded_data]
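For completeness, a minimal sketch of undoing this on the client side, assuming the Base64 string above is returned as the response body (the function name and the "__nparray__" tag follow the snippet above; nothing here is TorchServe-specific):

import base64
import json
import pickle
import zlib

import numpy as np

def decode_response(body: str):
    # Reverse the handler's pipeline: Base64 -> pickle -> zlib -> JSON.
    # pickle.loads should only be used on responses from a trusted server.
    serialized = base64.b64decode(body)
    compressed = pickle.loads(serialized)
    json_text = zlib.decompress(compressed).decode("utf-8")

    def restore(obj):
        # Rebuild arrays from strings tagged with the "__nparray__" prefix
        if isinstance(obj, str) and obj.startswith("__nparray__"):
            return np.array(json.loads(obj[len("__nparray__"):]))
        if isinstance(obj, dict):
            return {k: restore(v) for k, v in obj.items()}
        if isinstance(obj, list):
            return [restore(v) for v in obj]
        return obj

    return restore(json.loads(json_text))["data"]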
@hariom-qure I suggest trying this code if you want.
def postprocess(self, data):
    # Number of items in the batch, taken from the first output dict
    length = self._get_len_batch(data[0])
    print("size of fracture tensor", data[1]["other_fracture"].size())
    # Split the batched outputs into one result per request
    result = [self._single_tuple(data, i) for i in range(length)]
    return result
I had a similar problem and I resolved it with this.
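(The helpers _get_len_batch and _single_tuple are not shown in the comment above; the following is a hypothetical sketch of what they might look like, assuming the model output is a tuple of dicts of batched tensors:)

def _get_len_batch(self, tensors):
    # Hypothetical: the batch size is the first dimension of any output tensor
    return next(iter(tensors.values())).size(0)

def _single_tuple(self, data, i):
    # Hypothetical: slice the i-th batch element out of each dict in the
    # tuple, converting tensors to lists so the result is JSON-serializable
    return [
        {name: tensor[i].tolist() for name, tensor in part.items()}
        for part in data
    ]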
🐛 Describe the bug
We have a model which essentially does image segmentation of sorts. The output tensor has shape [batch, 920, 920], fp32. I keep getting broken pipe errors in this:

From my debugging, it essentially fails right after I return this tensor from my postprocess method in the base handler. Is there a limit to the response size for TorchServe?
Thanks for the help!
Error logs
The main container logs:
Model logs
Installation instructions
Using Docker; I simply ran the stock image from Docker Hub.
compose file:
Model Packaging
I simply take a tensor as input and return the raw tensor generated by the model as output. Essentially I get a tuple[dict[str, Tensor], dict[str, Tensor]] from the model; all tensor values have the same size, with the batch size as the first dimension.
handler:
config.properties
Default of the Docker image pytorch/torchserve:latest-gpu, pulled approximately a week ago.
Versions
pip list output (can't find the script in the Docker container):
summary:
full output:
Repro instructions
Possible Solution
I feel it's the response size, but I might be wrong. I tried serializing it to bytes in postprocess, but it can't finish anything more than a batch of size 1. I used this for serializing (no specific reason for choosing this; I found it on Google randomly):
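(The snippet itself is not shown above.) As a general note, serializing to bytes does not reduce the payload on its own; a hypothetical sketch of a bytes-based postprocess, assuming data is the batched output tensor, which is not the author's actual code:

import numpy as np

def postprocess(self, data):
    # Hypothetical sketch: return one raw-bytes payload per batch element.
    # This shrinks nothing by itself; a [920, 920] fp32 array is still
    # ~3.4 MB as raw bytes, so max_response_size has to be raised anyway.
    return [t.cpu().numpy().astype(np.float32).tobytes() for t in data]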