After looking at what it is, it seems there is nothing detectron2 needs to do to support it. Anyone who can run a detectron2 model in pytorch should already be able to also use torchserve to run it.
Therefore closing. Please update if you have a concrete feature request.
Yes, I have done it, thanks!
@NotFound403 Do you have any examples on the detectron2 hosting? Are there any differences when creating the .mar file?
@NotFound403 @ppwwyyxx From my understanding, to create the .mar file using torch-model-archiver we need to have a model.py file. Would we need to write that from scratch, or can it be done using the .yaml config file? Thanks!
@bigswede74 were you able to convert the detectron2 model to correct .mar file? Thanks!
@r5sb Yes I was able to get my .mar file created with a custom service.
model archiver commands
# build the .mar file
torch-model-archiver --model-name my-model --version 1.0 --serialized-file model/my-model.pth --extra-files model/config.yaml --handler model/detectron2-handler.py -f
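A follow-up sketch of serving the resulting archive with torchserve; this assumes the .mar file is placed in a model_store directory (names are illustrative):
# register the archive and start the server
mkdir -p model_store && mv my-model.mar model_store/
torchserve --start --ncs --model-store model_store --models my-model=my-model.mar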
@bigswede74 Thanks for your reply! Just for clarity, --extra-files model/config.yaml is the detectron2 config file saved with your custom configurations, and model/detectron2-handler.py is the custom handler?
Also, did you need to convert the my-model.pth file to torchscript first? My understanding was that if a .pth file is provided, a model.py architecture file is needed.
@r5sb You are correct, the model.py is a custom service python script to run the inference. Detectron2 is not supported out of the box.
> @NotFound403 Do you have any examples on the detectron2 hosting? Are there any differences when creating the .mar file?

Yeah, I have done it! You only need a JSON with the labels, the net backbone, and a detector handler; you can see the examples in torchserve, it's easy.
> @bigswede74 were you able to convert the detectron2 model to correct .mar file? Thanks!

You can write the args in the backbone script.
> @r5sb You are correct, the model.py is a custom service python script to run the inference. Detectron2 is not supported out of the box.
@bigswede74 care to give a few hints on how the --handler model/detectron2-handler.py and --extra-files model/config.yaml are created?
@bigswede74 if you can share both files as @ohjho says it would be great!
I would also be very interested to see code examples of this - thank you in advance!
EDIT:
Actually I just realised model.py is not needed for detectron2 since the models are saved as checkpoints, not in eager mode. An example of handler.py for detectron would come in handy though - anyone willing to share? @NotFound403, @bigswede74 ?
@tkaleczyc-ats You may find the handler code in the torchserve repo: https://github.com/pytorch/serve/tree/master/ts/torch_handler
Hello @bigswede74, @ohjho, @NotFound403 and @tkaleczyc-ats, I have some questions about deploying a Detectron2 model with TorchServe.
torch-model-archiver --model-name my-model --version 1.0 --serialized-file model/my-model.pth --extra-files model/config.yaml --handler model/detectron2-handler.py -f
In the command line above, shared previously by @bigswede74:
1) Does my-model.pth correspond to model_final.pth, located in the output directory, which is obtained after a Detectron2 training session?
2) Does config.yaml correspond to the configuration file that can be obtained after a training session with a cfg.dump() instruction?
Finally, as mentioned above by @tkaleczyc-ats, have any custom handler.py examples for Detectron2 been shared previously? I have searched, but I haven't been able to find anything.
@MrKsinant to answer your questions above yes and yes.
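For context, a minimal sketch of where those two files typically come from in a Detectron2 training run, assuming a standard DefaultTrainer setup (the config name and paths below are illustrative, not from this thread):
# after training, the trainer writes model_final.pth to cfg.OUTPUT_DIR and cfg.dump() serializes the config
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
# ... register datasets and set solver options here ...
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()  # produces model_final.pth in cfg.OUTPUT_DIR

with open("config.yaml", "w") as f:
    f.write(cfg.dump())  # the file to pass via --extra-files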
@ohjho, @NotFound403, @tkaleczyc-ats As for a sample handler I apologize for the delay in posting a detectron2 torchserve handler, see below.
# custom service file
"""
ModelHandler defines a base model handler.
"""
# Some basic setup:
import detectron2
import os.path
import sys, io, json, time, random
import numpy as np
import cv2
import base64

# Setup detectron2 logger
from detectron2.utils.logger import setup_logger
setup_logger()

# import some common detectron2 utilities
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from os import path
from json import JSONEncoder


class ModelHandler(object):
    """
    A base Model handler implementation.
    """

    def __init__(self):
        self.error = None
        self._context = None
        self._batch_size = 0
        self.initialized = False
        self.predictor = None
        self.model_file = "rn50_segmentation_model_final.pth"
        self.config_file = "mask_rcnn_R_50_FPN_3x.yaml"

    def initialize(self, context):
        """
        Initialize model. This will be called during model loading time.
        :param context: Initial context contains model server system properties.
        :return:
        """
        print("initializing starting")
        print("File {} exists {}".format(self.model_file, str(path.exists(self.model_file))))
        print("File {} exists {}".format(self.config_file, str(path.exists(self.config_file))))
        try:
            cfg = get_cfg()
            cfg.merge_from_file(self.config_file)
            cfg.MODEL.WEIGHTS = self.model_file
            # set the testing threshold for this model
            cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
            self.predictor = DefaultPredictor(cfg)
            print("predictor built on initialize")
        except AssertionError as error:
            # Output expected AssertionErrors.
            print(error)
        except:  # catch *all* exceptions
            e = sys.exc_info()[0]
            print("Error: {}".format(e))
        self._context = context
        self._batch_size = context.system_properties["batch_size"]
        self.initialized = True
        print("initialized")

    def preprocess(self, batch):
        """
        Transform raw input into model input data.
        :param batch: list of raw requests, should match batch size
        :return: list of preprocessed model input data
        """
        assert self._batch_size == len(batch), "Invalid input batch size: {}".format(len(batch))
        # Take the input data and pre-process it to make it inference ready
        print("pre-processing started for a batch of {}".format(len(batch)))
        images = []
        # batch is a list of requests
        for request in batch:
            for request_item in request:
                print(request_item)  # debug: print the request keys
            # each request is a dictionary with a single "body" key; get the body of the request
            request_body = request.get("body")
            # read the bytes of the image
            request_bytes = io.BytesIO(request_body)
            # decode the image (np.frombuffer replaces the deprecated np.fromstring)
            img = cv2.imdecode(np.frombuffer(request_bytes.read(), np.uint8), 1)
            # add the image to our list
            images.append(img)
        print("pre-processing finished for a batch of {}".format(len(batch)))
        return images

    def inference(self, model_input):
        """
        Internal inference methods.
        :param model_input: transformed model input data
        :return: list of inference output in NDArray
        """
        # Do some inference call to engine here and return output
        print("inference started for a batch of {}".format(len(model_input)))
        outputs = []
        for image in model_input:
            # run our predictions
            output = self.predictor(image)
            outputs.append(output)
        print("inference finished for a batch of {}".format(len(model_input)))
        return outputs

    def postprocess(self, inference_output):
        """
        Return predict result in batch.
        :param inference_output: list of inference output
        :return: list of predict results
        """
        start_time = time.time()
        print("post-processing started at {} for a batch of {}".format(start_time, len(inference_output)))
        responses = []
        for output in inference_output:
            # process predictions
            predictions = output["instances"].to("cpu")
            boxes = predictions.pred_boxes if predictions.has("pred_boxes") else None
            scores = predictions.scores if predictions.has("scores") else None
            classes = predictions.pred_classes.numpy() if predictions.has("pred_classes") else None
            masks = (predictions.pred_masks > 0.5).squeeze().numpy() if predictions.has("pred_masks") else None
            # convert tensors/arrays to plain lists so the response is JSON-serializable
            responses_json = {
                "classes": classes.tolist() if classes is not None else None,
                "scores": scores.tolist() if scores is not None else None,
                "boxes": boxes.tensor.tolist() if boxes is not None else None,
                "masks": masks.tolist() if masks is not None else None,
            }
            responses.append(json.dumps(responses_json))
        elapsed_time = time.time() - start_time
        print("post-processing finished for a batch of {} in {}".format(len(inference_output), elapsed_time))
        return responses

    def handle(self, data, context):
        """
        Call preprocess, inference and post-process functions.
        :param data: input data
        :param context: mms context
        """
        print("handling started")
        # process the data through our inference pipeline
        model_input = self.preprocess(data)
        model_out = self.inference(model_input)
        output = self.postprocess(model_out)
        print("handling finished")
        return output


_service = ModelHandler()


def handle(data, context):
    if not _service.initialized:
        _service.initialize(context)
    if data is None:
        return None
    return _service.handle(data, context)
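A quick sketch of exercising a model served with a handler like the one above, assuming it was registered under the name my-model on the default inference port:
# send an image to the inference endpoint and get back the JSON response
curl -X POST http://localhost:8080/predictions/my-model -T input.jpg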
Hello @bigswede74, thank you for your answer!
Hi @bigswede74 and everyone, thanks for the handler. I got this to work. The only issue is that with segmentation masks, json.dumps takes a really long time. For example, with a 1280x720 (HD) image and eight detected objects, handler time is 110 ms if I don't return the masks and 640 ms if I dump and return the masks as well. Any thoughts on how to get around this? I wish TorchServe could be like a remote function call instead of just an HTTP API.
@cbasavaraj see run length encoding.
Thanks. I did use RLE, posted a code snippet on pytorch/serve. Anyone interested, please see the link above which refers to this issue.
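For others hitting the same slowdown, a minimal sketch of run-length-encoding a binary mask before JSON serialization, assuming pycocotools is available (illustrative, not the exact snippet posted on pytorch/serve):
import numpy as np
from pycocotools import mask as mask_util

def rle_encode(binary_mask):
    # pycocotools expects a Fortran-ordered uint8 array
    rle = mask_util.encode(np.asfortranarray(binary_mask.astype(np.uint8)))
    rle["counts"] = rle["counts"].decode("utf-8")  # bytes -> str so json.dumps accepts it
    return rle  # compact {"size": [...], "counts": "..."} instead of a full pixel grid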
❓ Will detectron2 have a feature for torchserve?
torchserve may be a good deployment choice for detectron2, isn't it?