kaiogu opened this issue 1 year ago
Hi @kaiogu,
that's actually a bit strange. Can you post your full handler code here? Your first example should actually result in a "number of batch response mismatched" error when triggering [this guard](https://github.com/pytorch/serve/blob/86d440041b663961c71a6262fe648111d85b27d8/ts/service.py#L141), as `len(ret) == 5` (assuming you only send one request). Your last example with `json.dumps` should trigger it because the return value is not a list. Having the full handler code would help with debugging this.
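Roughly, that guard enforces a one-response-per-request contract. A paraphrased sketch (not the exact `ts/service.py` code; the helper name is made up):

```python
def check_batch_response(ret, num_requests):
    # The handler must return a list whose length equals the number of
    # requests in the batch; anything else is rejected by the frontend.
    if not isinstance(ret, list):
        raise ValueError("Invalid model predict output: expected a list")
    if len(ret) != num_requests:
        raise ValueError(
            f"number of batch response mismatched: {len(ret)} vs {num_requests}"
        )

# With a single request, a flat top-5 list trips the length check:
check_batch_response([{"label": "A", "probability": 0.2}] * 5, num_requests=1)
```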
Hi @mreso,
thanks for the quick reply. Sure thing:
```python
import json
import logging
import os

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

from ts.torch_handler.base_handler import BaseHandler

logger = logging.getLogger(__name__)


class TransformersClassifierHandler(BaseHandler):
    """
    The handler takes an input string and returns the classification text
    based on the serialized transformers checkpoint.
    """

    def __init__(self):
        super(TransformersClassifierHandler, self).__init__()
        self.initialized = False
        self.model = None
        self.mapping = None
        self.device = None
        self.manifest = None
        self.tokenizer = None

    def initialize(self, ctx):
        """Loads the model.pt file and initializes the model object.
        Instantiates the tokenizer for the preprocessor to use.
        Loads the label-to-name mapping file for post-processing the inference response.
        """
        self.manifest = ctx.manifest
        properties = ctx.system_properties
        model_dir = properties.get("model_dir")
        self.device = torch.device(
            "cuda:" + str(properties.get("gpu_id")) if torch.cuda.is_available() else "cpu"
        )

        # Read the serialized model (.pt) file
        serialized_file = self.manifest["model"]["serializedFile"]
        model_pt_path = os.path.join(model_dir, serialized_file)
        if not os.path.isfile(model_pt_path):
            raise RuntimeError("Missing the model.pt or pytorch_model.bin file")

        # Load the model
        self.model = AutoModelForSequenceClassification.from_pretrained(model_dir)
        self.model.to(self.device)
        self.model.eval()
        logger.debug("Transformer model from path %s loaded successfully", model_dir)

        # Ensure the same tokenizer used during training is used here
        self.tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

        # Read the mapping file, index to object name
        mapping_file_path = os.path.join(model_dir, "index_to_name.json")
        if os.path.isfile(mapping_file_path):
            with open(mapping_file_path, mode="rt", encoding="utf8") as f:
                self.mapping = json.load(f)
        else:
            logger.warning(
                "Missing the index_to_name.json file. Inference output will not include class name."
            )

        self.initialized = True

    def preprocess(self, data):
        """Preprocesses the input request by tokenizing.
        Extend with your own preprocessing steps as needed.
        """
        sentences = data[0].get("data")
        logger.info("Received text: '%s'", sentences)

        # Tokenize the texts
        tokenizer_args = (sentences,)
        inputs = self.tokenizer(
            *tokenizer_args,
            padding="max_length",
            max_length=128,
            truncation=True,
            return_tensors="pt",
        )
        return inputs

    def inference(self, inputs):
        """Predict the class of a text using a trained transformer model."""
        inference_output = self.model(inputs["input_ids"].to(self.device))
        logger.info(f"TYPE OF PREDICTIONS: {inference_output}")
        return inference_output.logits

    def postprocess(self, logits):
        """Placeholder for post-processing the inference output."""
        probabilities = torch.softmax(logits, dim=1)
        top_k = torch.topk(probabilities, k=5)
        top_k_labels = [self.mapping[str(i)] for i in top_k.indices[0].tolist()]
        top_k_probabilities = top_k.values[0].tolist()
        top_k_predictions = [
            {"label": label, "probability": probability}
            for label, probability in zip(top_k_labels, top_k_probabilities)
        ]
        print(f"{top_k_predictions=}")
        return top_k_predictions
```
PS: Solved it by wrapping the returned list in another list:

```python
return [top_k_predictions]
```
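For reference, a batch-aware sketch of `postprocess` (assuming `logits` carries one row per batched request) keeps the outer list's length equal to the batch size, which is what the frontend checks:

```python
def postprocess(self, logits):
    # One inner list of top-5 predictions per request in the batch, so
    # len(return value) == batch size, as TorchServe expects.
    probabilities = torch.softmax(logits, dim=1)
    top_k = torch.topk(probabilities, k=5)
    responses = []
    for indices, values in zip(top_k.indices.tolist(), top_k.values.tolist()):
        responses.append(
            [
                {"label": self.mapping[str(i)], "probability": p}
                for i, p in zip(indices, values)
            ]
        )
    return responses
```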
The error logs you mentioned would have helped though :)
🐛 Describe the bug
I am trying to return a Python list of dicts from TorchServe; it is a list containing the model's top 5 predictions. The returned values get distorted, see below.
Error logs
No error logs, but unexpected behaviour.
Installation instructions
I am running the containers locally and followed these instructions: https://cloud.google.com/blog/topics/developers-practitioners/pytorch-google-cloud-how-deploy-pytorch-models-vertex-ai
Model Packaging
https://cloud.google.com/blog/topics/developers-practitioners/pytorch-google-cloud-how-deploy-pytorch-models-vertex-ai
config.properties
No response
Versions
In the running container:
Repro instructions
When I query the model, only the first element of the list is returned.
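For reference, this is consistent with the per-request indexing discussed in the comments above: the frontend hands request i element i of whatever `postprocess` returns, so with a single request a flat top-5 list yields only its first dict. A simplified sketch (not the actual TorchServe source):

```python
# Simplified view of how the frontend slices handler output per request.
# With batch size 1, only ret[0] is sent back to the client.
ret = [{"label": f"LABEL_{i}", "probability": 0.2} for i in range(5)]
response_for_request_0 = ret[0]  # client sees just the first prediction
```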
If I try to convert the list of dicts to a string before returning (`return json.dumps(top_k_predictions)`), the result gets distorted even worse.
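If manual serialization is wanted, the JSON string likely needs the same one-element wrapper, since the per-request indexing would otherwise slice into the string itself. A sketch (the `build_top_k` helper is hypothetical, standing in for the construction shown in `postprocess` above):

```python
import json

def postprocess(self, logits):
    top_k_predictions = self.build_top_k(logits)  # hypothetical helper
    # Wrap the JSON string in a one-element list: one response per request.
    return [json.dumps(top_k_predictions)]
```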
Possible Solution
No response