facebookresearch / detectron2

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
https://detectron2.readthedocs.io/en/latest/
Apache License 2.0

torchserve is released now #1289

Closed NotFound403 closed 4 years ago

NotFound403 commented 4 years ago

❓ Will detectron2 support torchserve as a deployment option?

torchserve may be a good deployment choice for detectron2, isn't it?

ppwwyyxx commented 4 years ago

After looking at what it is, it seems there is nothing detectron2 needs to do to support it. Anyone who can run a detectron2 model in pytorch should already be able to also use torchserve to run it.

Therefore closing. Please update if you have a concrete feature request.

NotFound403 commented 4 years ago

yes, I have done it, thx m8

bigswede74 commented 4 years ago

@NotFound403 Do you have any examples on the detectron2 hosting? Are there any differences when creating the .mar file?

r5sb commented 4 years ago

@NotFound403 @ppwwyyxx From my understanding, to create the .mar file using torch-model-archiver we need to have a model.py file. Would we need to write that from scratch, or can it be done using the .yaml config file? Thanks!

r5sb commented 4 years ago

@bigswede74 were you able to convert the detectron2 model to correct .mar file? Thanks!

bigswede74 commented 4 years ago

@r5sb Yes I was able to get my .mar file created with a custom service.

Model archiver command:

# build the .mar file
torch-model-archiver --model-name my-model --version 1.0 --serialized-file model/my-model.pth --extra-files model/config.yaml --handler model/detectron2-handler.py -f
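
Once the .mar file is in your model store and torchserve has been started (e.g. torchserve --start --model-store model_store --models my-model.mar), the model can be queried through the default inference API. A minimal client sketch, assuming the model was registered as my-model and torchserve is listening locally on the default port 8080 (test_image.jpg is just a placeholder):

# query the served model (assumes torchserve is running locally with "my-model" registered)
import json
import requests

with open("test_image.jpg", "rb") as f:
    image_bytes = f.read()

# TorchServe's default inference API listens on port 8080
response = requests.post("http://localhost:8080/predictions/my-model", data=image_bytes)

# the handler returns one JSON string per image in the batch
print(json.loads(response.text))
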
r5sb commented 4 years ago

@bigswede74 Thanks for your reply! Just for clarity: --extra-files model/config.yaml is the detectron2 config file saved with your custom configurations, and model/detectron2-handler.py is the custom handler, correct?

Also, did you need to convert the my-model.pth file to torchscript first? My understanding was that if a .pth file is provided, a model.py architecture file is needed.

bigswede74 commented 4 years ago

@r5sb You are correct, the model.py is a custom service python script to run the inference. Detectron2 is not supported out of the box.

NotFound403 commented 4 years ago

> @NotFound403 Do you have any examples on the detectron2 hosting? Are there any differences when creating the .mar file?

yeah, I have done it! Only a JSON with the labels, the net backbone, and a detector handler are needed; you can see the examples in torchserve, it's easy
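
For reference, the label file mentioned above is typically an index_to_name.json mapping class indices to readable names, as in the torchserve example handlers; a minimal sketch (the class names here are only placeholders):

# write a minimal index_to_name.json label mapping for --extra-files
import json

index_to_name = {"0": "my_class_a", "1": "my_class_b"}

with open("index_to_name.json", "w") as f:
    json.dump(index_to_name, f)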

NotFound403 commented 4 years ago

> @bigswede74 were you able to convert the detectron2 model to correct .mar file? Thanks!

you can put the args in the backbone script

ohjho commented 4 years ago

> @r5sb You are correct, the model.py is a custom service python script to run the inference. Detectron2 is not supported out of the box.

@bigswede74 care to give a few hints on how the --handler model/detectron2-handler.py and --extra-files model/config.yaml are created?

vgr-gatv commented 4 years ago

@bigswede74 if you could share both files, as @ohjho says, that would be great!

tkaleczyc-ats commented 4 years ago

I would also be very interested to see code examples of this - thank you in advance!

EDIT:

Actually I just realised model.py is not needed for detectron2 since the models are saved as checkpoints, not in eager mode. An example of handler.py for detectron would come in handy though - anyone willing to share? @NotFound403, @bigswede74 ?

ruodingt commented 4 years ago

@tkaleczyc-ats You may find the handler code in the torchserve repo: https://github.com/pytorch/serve/tree/master/ts/torch_handler

MrKsinant commented 4 years ago

Hello @bigswede74, @ohjho, @NotFound403 and @tkaleczyc-ats, I have a few questions about deploying a Detectron2 model with TorchServe.

torch-model-archiver --model-name my-model --version 1.0 --serialized-file model/my-model.pth --extra-files model/config.yaml --handler model/detectron2-handler.py -f

In the command line above, shared previously by @bigswede74:

  1. Does my-model.pth correspond to model_final.pth, located in the output directory after a Detectron2 training session?
  2. Does config.yaml correspond to the configuration file that can be obtained after training with a cfg.dump() instruction?

Finally, as mentioned above by @tkaleczyc-ats, have any custom handler.py examples for Detectron2 been shared previously? I have searched, but I haven't been able to find anything.

bigswede74 commented 4 years ago

@MrKsinant to answer your questions above: yes and yes.

  1. Yes, --serialized-file is your final model file and can be named anything you choose
  2. Yes, --extra-files is your model config yaml file that is generated by using cfg.dump() (see the sketch below)
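
For anyone recreating step 2, a minimal sketch of dumping the training config to a YAML file for --extra-files (the model zoo config below is only an example; rebuild the cfg exactly as it was built for training):

# dump the detectron2 training config to a YAML file for --extra-files
from detectron2 import model_zoo
from detectron2.config import get_cfg

cfg = get_cfg()
# example base config; apply the same merges/overrides used during training
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))

# cfg.dump() returns the full config as a YAML string
with open("config.yaml", "w") as f:
    f.write(cfg.dump())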

@ohjho, @NotFound403, @tkaleczyc-ats As for a sample handler I apologize for the delay in posting a detectron2 torchserve handler, see below.

# custom service file

"""
ModelHandler defines a base model handler.
"""

# Some basic setup:
import detectron2
import os.path
import sys, io, json, time, random
import numpy as np
import cv2
import base64

# Setup detectron2 logger
from detectron2.utils.logger import setup_logger
setup_logger()

# import some common detectron2 utilities
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from os import path
from json import JSONEncoder

class ModelHandler(object):
    """
    A base Model handler implementation.
    """

    def __init__(self):
        self.error = None
        self._context = None
        self._batch_size = 0
        self.initialized = False
        self.predictor = None
        self.model_file = "rn50_segmentation_model_final.pth"
        self.config_file = "mask_rcnn_R_50_FPN_3x.yaml"  

    def initialize(self, context):
        """
        Initialize model. This will be called during model loading time
        :param context: Initial context contains model server system properties.
        :return:
        """
        print("initializing starting")

        print("File {} exists {}".format(self.model_file, str(path.exists(self.model_file))))
        print("File {} exists {}".format(self.config_file, str(path.exists(self.config_file))))

        try:
            cfg = get_cfg()
            cfg.merge_from_file(self.config_file)
            cfg.MODEL.WEIGHTS = self.model_file

            # set the testing threshold for this model
            cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5

            self.predictor = DefaultPredictor(cfg)

            print("predictor built on initialize")
        except AssertionError as error:
            # Output expected AssertionErrors.
            print(error)
        except:  # catch *all* exceptions
            e = sys.exc_info()[0]
            print("Error: {}".format(e))

        self._context = context
        self._batch_size = context.system_properties["batch_size"]
        self.initialized = True
        print("initialized")

    def preprocess(self, batch):
        """
        Transform raw input into model input data.
        :param batch: list of raw requests, should match batch size
        :return: list of preprocessed model input data
        """
        assert self._batch_size == len(batch), "Invalid input batch size: {}".format(len(batch))

        # Take the input data and pre-process it make it inference ready
        print("pre-processing started for a batch of {}".format(len(batch)))

        images = []

        # batch is a list of requests
        for request in batch:
            for request_item in request:
                print(request_item)

            # each item in the batch is a dict; the payload may arrive under the "data" or "body" key
            request_body = request.get("data") or request.get("body")

            # read the bytes of the image
            input = io.BytesIO(request_body)

            # get our image
            img = cv2.imdecode(np.frombuffer(input.read(), np.uint8), cv2.IMREAD_COLOR)

            # add the image to our list
            images.append(img)

        print("pre-processing finished for a batch of {}".format(len(batch)))

        return images

    def inference(self, model_input):
        """
        Internal inference methods
        :param model_input: transformed model input data
        :return: list of inference output in NDArray
        """

        # Do some inference call to engine here and return output
        print("inference started for a batch of {}".format(len(model_input)))

        outputs = []

        for image in model_input:
            # run our predictions
            output = self.predictor(image)

            outputs.append(output)

        print("inference finished for a batch of {}".format(len(model_input)))

        return outputs

    def postprocess(self, inference_output):

        """
        Return predict result in batch.
        :param inference_output: list of inference output
        :return: list of predict results
        """
        start_time = time.time()

        print("post-processing started at {} for a batch of {}".format(start_time, len(inference_output)))

        responses = []

        for output in inference_output:

            # process predictions and convert fields to plain Python lists so json.dumps can serialize them
            predictions = output["instances"].to("cpu")
            boxes = predictions.pred_boxes.tensor.numpy().tolist() if predictions.has("pred_boxes") else None
            scores = predictions.scores.numpy().tolist() if predictions.has("scores") else None
            classes = predictions.pred_classes.numpy().tolist() if predictions.has("pred_classes") else None
            masks = (predictions.pred_masks > 0.5).squeeze().numpy().tolist() if predictions.has("pred_masks") else None

            responses_json = {"classes": classes, "scores": scores, "boxes": boxes, "masks": masks}

            responses.append(json.dumps(responses_json))

        elapsed_time = time.time() - start_time

        print("post-processing finished for a batch of {} in {}".format(len(inference_output), elapsed_time))

        return responses

    def handle(self, data, context):
        """
        Call preprocess, inference and post-process functions
        :param data: input data
        :param context: mms context
        """
        print("handling started")

        # process the data through our inference pipeline
        model_input = self.preprocess(data)
        model_out = self.inference(model_input)
        output = self.postprocess(model_out)

        print("handling finished")

        return output  

_service = ModelHandler()

def handle(data, context):
    if not _service.initialized:
        _service.initialize(context)

    if data is None:
        return None

    return _service.handle(data, context)
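
A rough way to sanity-check this handler outside torchserve is to append a small stub-context test at the bottom of the file and run it directly; test_image.jpg and the batch size of 1 below are only examples:

# rough local smoke test: run in the directory containing the weights/config files the handler expects
class StubContext:
    """Mimics the torchserve context just enough for ModelHandler.initialize()."""
    system_properties = {"batch_size": 1}

if __name__ == "__main__":
    with open("test_image.jpg", "rb") as f:
        fake_batch = [{"body": f.read()}]

    # handle() initializes the model on first call, then runs preprocess -> inference -> postprocess
    print(handle(fake_batch, StubContext()))
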
MrKsinant commented 4 years ago

Hello @bigswede74, thank you for your answer!

cbasavaraj commented 3 years ago

Hi @bigswede74 and everyone, thanks for the handler. I got this to work. The only issue is that with segmentation masks, json.dumps takes a really long time. For example, with a 1280x720 (HD) image and eight detected objects, handler time is 110 ms if I don't return the masks and 640 ms if I dump and return the masks as well. Any thoughts on how to get around this? I wish TorchServe could be like a remote function call instead of just an HTTP API.

bigswede74 commented 3 years ago

@cbasavaraj see run length encoding.
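
For anyone hitting the same bottleneck, a minimal sketch of COCO-style run-length encoding for the masks (assumes pycocotools is installed; the encoded masks replace the raw pixel arrays in responses_json and are far smaller to serialize):

# encode binary masks as compressed COCO RLE instead of dumping raw pixel arrays
import numpy as np
from pycocotools import mask as mask_util

def masks_to_rle(masks):
    """Convert an (N, H, W) boolean mask array into JSON-friendly COCO RLE dicts."""
    rles = []
    for m in masks:
        rle = mask_util.encode(np.asfortranarray(m.astype(np.uint8)))
        rle["counts"] = rle["counts"].decode("utf-8")  # bytes -> str so json.dumps works
        rles.append(rle)
    return rles

# e.g. in postprocess(): responses_json["masks"] = masks_to_rle((predictions.pred_masks > 0.5).numpy())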

cbasavaraj commented 3 years ago

Thanks. I did use RLE and posted a code snippet on pytorch/serve. Anyone interested, please see the link above, which refers to this issue.