pytorch / serve

Serve, optimize and scale PyTorch models in production
https://pytorch.org/serve/
Apache License 2.0

Not able to get the data for inference when using custom handler #2841

Closed: yogendra-yatnalkar closed this issue 9 months ago

yogendra-yatnalkar commented 10 months ago

Hi team, I have created my own custom handler by referencing the base handler and the vision handler. What I am observing is that when I pass data to the model for inference, the data is not reaching the hosted model endpoint.

The exact error I am getting is:

2023-12-09T20:08:03,580 [INFO ] W-9000-vit_l_16_1.0-stdout MODEL_LOG - Invoking custom service failed.
2023-12-09T20:08:03,580 [INFO ] W-9000-vit_l_16_1.0-stdout MODEL_LOG - Traceback (most recent call last):
2023-12-09T20:08:03,580 [INFO ] W-9000-vit_l_16_1.0-stdout MODEL_LOG -   File "/opt/conda/envs/pytorch/lib/python3.10/site-packages/ts/service.py", line 120, in predict
2023-12-09T20:08:03,581 [INFO ] W-9000-vit_l_16_1.0-stdout MODEL_LOG -     ret = self._entry_point(input_batch, self.context)
2023-12-09T20:08:03,581 [INFO ] W-9000-vit_l_16_1.0-stdout MODEL_LOG -   File "/tmp/models/6ffe80d83e5341da81fe21bda0d735e0/custom_handler.py", line 139, in handle
2023-12-09T20:08:03,581 [INFO ] W-9000-vit_l_16_1.0-stdout MODEL_LOG -     model_input = self.data_preprocess(data)
2023-12-09T20:08:03,582 [INFO ] W-9000-vit_l_16_1.0-stdout MODEL_LOG -   File "/tmp/models/6ffe80d83e5341da81fe21bda0d735e0/custom_handler.py", line 91, in data_preprocess
2023-12-09T20:08:03,583 [INFO ] W-9000-vit_l_16_1.0-stdout MODEL_LOG -     image = Image.open(io.BytesIO(image))
2023-12-09T20:08:03,585 [INFO ] W-9000-vit_l_16_1.0-stdout MODEL_LOG -   File "/opt/conda/envs/pytorch/lib/python3.10/site-packages/PIL/Image.py", line 3280, in open
2023-12-09T20:08:03,586 [INFO ] W-9000-vit_l_16_1.0-stdout MODEL_LOG -     raise UnidentifiedImageError(msg)
2023-12-09T20:08:03,586 [INFO ] W-9000-vit_l_16_1.0-stdout MODEL_LOG - PIL.UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7f7677de3ce0>
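
(For context, PIL raises UnidentifiedImageError whenever the bytes it receives do not match any image format it recognizes; a minimal reproduction, handing it the start of a JSON document instead of an image:)

import io
from PIL import Image

# Non-image bytes (here, the start of a JSON document) cannot be identified
# as any known image format, so this raises PIL.UnidentifiedImageError
Image.open(io.BytesIO(b'{"payload": {}}'))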

When I printed my "data" before passing it for preprocessing, this is what I got:

2023-12-09T19:43:42,421 [INFO ] W-9000-vit_l__1.0-stdout MODEL_LOG - data:  [{'data': bytearray(b'{"payload":{"allShortcutsEnabled":false,"fileTree":{"examples/image_classifier/mnist/test_data":{"items":[{"name":"0.png","path":"examples/image_classifier/mnist/test_data/0.png","contentType":"file"},{"name":"1.png","path":"examples/image_classifier/mnist/test_data/1.png","contentType":"file"},{"name":"2.png","path":"examples/image_classifier/mnist/test_data/2.png","contentType":"file"},{"name":"3.png","path":"examples/image_classifier/mnist/test_data/3.png","contentType":"file"},{"name":"4.png","path":"examples/image_classifier/mnist/test_data/4.png","contentType":"file"},{"name":"5.png","path":"examples/image_classifier/mnist/test_data/5.png","contentType":"file"},{"name":"6.png","path":"examples/image_classifier/mnist/test_data/6.png","contentType":"file"},{"name":"7.png","path":"examples/image_classifier/mnist/test_data/7.png","contentType":"file"},{"name":"8.png","path":"examples/image_classifier/mnist/test_data/8.png","contentType":"file"},{"name":"9.png","path":"examples/image_classifier/mnist/test_data/9.png","contentType":"file"}],"totalCount":10},"examples/image_classifier/mnist":{"items":[{"name":"screenshots","path":"examples/image_classifier/mnist/screenshots","contentType":"directory"},{"name":"test_data","path":"examples/image_classifier/mnist/test_data","contentType":"directory"},{"name":"torchdata","path":"examples/image_classifier/mnist/torchdata","contentType":"directory"},{"name":"Docker.md","path":"examples/image_classifier/mnist/Docker.md","contentType":"file"},{"name":"README.md","path":"examples/image_classifier/mnist/README.md","contentType":"file"},{"name":"config.properties","path":"examples/image_classifier/mnist/config.properties","contentType":"file"},{"name":"mnist.py","path":"examples/image_classifier/mnist/mnist.py","contentType":"file"},{"name":"mnist_cnn.pt","path":"examples/image_classifier/mnist/mnist_cnn.pt","contentType":"file"},{"name":"mnist_handler.py","path":"examples/image_classifier/mnist/mnist_handler.py","contentType":"file"},{"name":"mnist_ts.json","path":"examples/image_classifier/mnist/mnist_ts.json","contentType":"file"}],"totalCount":10},"examples/image_classifier":{"items":[{"name":"alexnet","path":"examples/image_classifier/alexnet","contentType":"directory"},{"name":"densenet_161","path":"examples/image_classifier/densenet_161","contentType":"directory"},{"name":"mnist","path":"examples/image_classifier/mnist","contentType":"directory"},{"name":"near_real_time_video","path":"examples/image_classifier/near_real_time_video","contentType":"directory"},{"name":"resnet_152_batch","path":"examples/image_classifier/resnet_152_batch","contentType":"directory"},{"name":"resnet_18","path":"examples/image_classifier/resnet_18","contentType":"directory"},{"name":"squeezenet","path":"examples/image_classifier/squeezenet","contentType":"directory"},{"name":"vgg_16","path":"examples/image_classifier/vgg_16","contentType":"directory"},{"name":"README.md","path":"examples/image_classifier/README.md","contentType":"file"},{"name":"compile.json","path":"examples/image_classifier/compile.json","contentType":"file"},{"name":"index_to_name.json","path":"examples/image_classifier/index_to_name.json","contentType":"file"},{"name":"kitten.jpg","path":"examples/image_classifier/kitten.jpg","contentType":"file"}],"totalCount":12},"examples":{"items":[{"name":"FasterTransformer_HuggingFace_Bert","path":"examples/FasterTransformer_HuggingFace_Bert","contentType":"dire
ctory"},{"name":"Huggingface_Transformers","path":"examples/Huggingface_Transformers","contentType":"directory"},{"name":"LLM","path":"examples/LLM","contentType":"directory"},{"name":"MMF-activity-recognition","path":"examples/MMF-activity-recognition","contentType":"directory"},{"name":"Workflows","path":"examples/Workflows","contentType":"directory"},{"name":"asr_rnnt_emformer","path":"examples/asr_rnnt_emformer","contentType":"directory"},{"name":"benchmarking","path":"examples/benchmarking","contentType":"directory"},{"name":"captum","path":"examples/captum","contentType":"directory"},{"name":"cloud_storage_stream_inference","path":"examples/cloud_storage_stream_inference","contentType":"directory"},{"name":"cloudformation","path":"examples/cloudformation","contentType":"directory"},{"name":"custom_metrics","path":"examples/custom_metrics","contentType":"directory"},{"name":"dcgan_fashiongen","path":"examples/dcgan_fashiongen","contentType":"directory"},{"name":"diffusers","path":"examples/diffusers","contentType":"directory"},{"name":"image_classifier","path":"examples/image_classifier","contentType":"directory"},{"name":"image_segmenter","path":"examples/image_segmenter","contentType":"directory"},{"name":"images","path":"examples/images","contentType":"directory"},{"name":"instruction_embedding","path":"examples/instruction_embedding","contentType":"directory"},{"name":"intel_extension_for_pytorch","path":"examples/intel_extension_for_pytorch","contentType":"directory"},{"name":"large_models","path":"examples/large_models","contentType":"directory"},{"name":"micro_batching","path":"examples/micro_batching","contentType":"directory"},{"name":"nmt_transformer","path":"examples/nmt_transformer","contentType":"directory"},{"name":"nvidia_dali","path":"examples/nvidia_dali","contentType":"directory"},{"name":"object_detector","path":"examples/object_detector","contentType":"directory"},{"name":"pt2","path":"examples/pt2","contentType":"directory"},{"name":"speech2text_wav2vec2","path":"examples/speech2text_wav2vec2","contentType":"directory"},{"name":"text_classification","path":"examples/text_classification","contentType":"directory"},{"name":"text_classification_with_scriptable_tokenizer","path":"examples/text_classification_with_scriptable_tokenizer","contentType":"directory"},{"name":"text_to_speech_synthesizer","path":"examples/text_to_speech_synthesizer","contentType":"directory"},{"name":"torch_tensorrt","path":"examples/torch_tensorrt","contentType":"directory"},{"name":"torchrec_dlrm","path":"examples/torchrec_dlrm","contentType":"directory"},{"name":"README.md","path":"examples/README.md","contentType":"file"}],"totalCount":31},"":{"items":[{"name":".github","path":".github","contentType":"directory"},{"name":"benchmarks","path":"benchmarks","contentType":"directory"},{"name":"binaries","path":"binaries","contentType":"directory"},{"name":"ci","path":"ci","contentType":"directory"},{"name":"docker","path":"docker","contentType":"directory"},{"name":"docs","path":"docs","contentType":"directory"},{"name":"examples","path":"examples","contentType":"directory"},{"name":"frontend","path":"frontend","contentType":"directory"},{"name":"kubernetes","path":"kubernetes","contentType":"directory"},{"name":"model-archiver","path":"model-archiver","contentType":"directory"},{"name":"plugins","path":"plugins","contentType":"directory"},{"name":"requirements","path":"requirements","contentType":"directory"},{"name":"serving-sdk","path":"serving-sdk","contentType":"directory"},{"name":"test",
"path":"test","contentType":"directory"},{"name":"ts","path":"ts","contentType":"directory"},{"name":"ts_scripts","path":"ts_scripts","contentType":"directory"},{"name":"workflow-archiver","path":"workflow-archiver","contentType":"directory"},{"name":".gitignore","path":".gitignore","contentType":"file"},{"name":".pre-commit-config.yaml","path":".pre-commit-config.yaml","contentType":"file"},{"name":"CODE_OF_CONDUCT.md","path":"CODE_OF_CONDUCT.md","contentType":"file"},{"name":"CONTRIBUTING.md","path":"CONTRIBUTING.md","contentType":"file"},{"name":"LICENSE","path":"LICENSE","contentType":"file"},{"name":"MANIFEST.in","path":"MANIFEST.in","contentType":"file"},{"name":"PyPiDescription.rst","path":"PyPiDescription.rst","contentType":"file"},{"name":"README.md","path":"README.md","contentType":"file"},{"name":"SECURITY.md","path":"SECURITY.md","contentType":"file"},{"name":"_config.yml","path":"_config.yml","contentType":"file"},{"name":"codecov.yml","path":"codecov.yml","contentType":"file"},{"name":"link_check_config.json","path":"link_check_config.json","contentType":"file"},{"name":"mypy.ini","path":"mypy.ini","contentType":"file"},{"name":"setup.py","path":"setup.py","contentType":"file"},{"name":"torchserve_sanity.py","path":"torchserve_sanity.py","contentType":"file"}],"totalCount":32}},"fileTreeProcessingTime":16.963544,"foldersToFetch":[],"reducedMotionEnabled":null,"repo":{"id":212488700,"defaultBranch":"master","name":"serve","ownerLogin":"pytorch","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2019-10-03T03:17:43.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/21003710?v=4","public":true,"private":false,"isOrgOwned":true},"symbolsExpanded":false,"treeExpanded":true,"refInfo":{"name":"master","listCacheKey":"v0:1696206430.0","canEdit":false,"refType":"branch","currentOid":"253c2057d50fa904317c823c1575d847f40c4df0"},"path":"examples/image_classifier/mnist/test_data/0.png","currentUser":null,"blob":{"rawLines":null,"stylingDirectives":null,"csv":null,"csvError":null,"dependabotInfo":{"showConfigurationBanner":false,"configFilePath":null,"networkDependabotPath":"/pytorch/serve/network/updates","dismissConfigurationNoticePath":"/settings/dismiss-notice/dependabot_configuration_notice","configurationNoticeDismissed":null,"repoAlertsPath":"/pytorch/serve/security/dependabot","repoSecurityAndAnalysisPath":"/pytorch/serve/settings/security_analysis","repoOwnerIsOrg":true,"currentUserCanAdminRepo":false},"displayName":"0.png","displayUrl":"https://github.com/pytorch/serve/blob/master/examples/image_classifier/mnist/test_data/0.png?raw=true","headerInfo":{"blobSize":"272 Bytes","deleteInfo":{"deleteTooltip":"You must be signed in to make or propose changes"},"editInfo":{"editTooltip":"You must be signed in to make or propose 
changes"},"ghDesktopPath":"https://desktop.github.com","gitLfsPath":null,"onBranch":true,"shortPath":"a193c47","siteNavLoginPath":"/login?return_to=https%3A%2F%2Fgithub.com%2Fpytorch%2Fserve%2Fblob%2Fmaster%2Fexamples%2Fimage_classifier%2Fmnist%2Ftest_data%2F0.png","isCSV":false,"isRichtext":false,"toc":null,"lineInfo":{"truncatedLoc":null,"truncatedSloc":null},"mode":"file"},"image":true,"isCodeownersFile":null,"isPlain":false,"isValidLegacyIssueTemplate":false,"issueTemplateHelpUrl":"https://docs.github.com/articles/about-issue-and-pull-request-templates","issueTemplate":null,"discussionTemplate":null,"language":null,"languageID":null,"large":false,"loggedIn":false,"newDiscussionPath":"/pytorch/serve/discussions/new","newIssuePath":"/pytorch/serve/issues/new","planSupportInfo":{"repoIsFork":null,"repoOwnedByCurrentUser":null,"requestFullPath":"/pytorch/serve/blob/master/examples/image_classifier/mnist/test_data/0.png","showFreeOrgGatedFeatureMessage":null,"showPlanSupportBanner":null,"upgradeDataAttributes":null,"upgradePath":null},"publishBannersInfo":{"dismissActionNoticePath":"/settings/dismiss-notice/publish_action_from_dockerfile","dismissStackNoticePath":"/settings/dismiss-notice/publish_stack_from_file","releasePath":"/pytorch/serve/releases/new?marketplace=true","showPublishActionBanner":false,"showPublishStackBanner":false},"renderImageOrRaw":true,"richText":null,"renderedFileInfo":null,"shortPath":null,"tabSize":8,"topBannersInfo":{"overridingGlobalFundingFile":false,"globalPreferredFundingPath":null,"repoOwner":"pytorch","repoName":"serve","showInvalidCitationWarning":false,"citationHelpUrl":"https://docs.github.com/en/github/creating-cloning-and-archiving-repositories/creating-a-repository-on-github/about-citation-files","showDependabotConfigurationBanner":false,"actionsOnboardingTip":null},"truncated":false,"viewable":false,"workflowRedirectUrl":null,"symbols":null},"copilotInfo":null,"csrf_tokens":{"/pytorch/serve/branches":{"post":"ECZfmObnHcYLOJ2uPtqVaZXc9eEIPa6maIerS4XkOkTSFKty1smFOkDFMjgQMrw-uRcHss-rMyp95gd_WPEaWA"},"/repos/preferences":{"post":"by9ud_BrO1Pdnb-gJ4FJ5rYfsCzSZ9_Em-Xln1JIHszB5oyavOKGfUgdA_ODbpMqecp0qza-bGWcte23S6fkHA"}}},"title":"serve/examples/image_classifier/mnist/test_data/0.png at master \xc2\xb7 pytorch/serve"}')}]

As you can see, the JSON object above has no "data" key that actually contains my base64-encoded image or any byte array.

I tried passing my data in two different ways, but neither worked.

I initially played with the default MNIST example, and it worked without any hiccups. My custom handler code is as follows:

import base64
import io
import os

import torch
import torch.nn.functional as F
import torchvision
from PIL import Image
from ts.torch_handler.base_handler import BaseHandler

# Guarded XLA import (mirrors the one in TorchServe's base handler) so that
# the XLA_AVAILABLE check in initialize() below is always defined
try:
    import torch_xla.core.xla_model as xm

    XLA_AVAILABLE = True
except ImportError:
    XLA_AVAILABLE = False

class ModelHandler(BaseHandler):
    """
    A custom model handler implementation.
    """

    def __init__(self):
        self.model = None
        self.mapping = None
        self.device = None
        self.initialized = False
        self.context = None
        self.model_pt_path = None
        self.manifest = None
        self.map_location = None
        self.explain = False
        self.target = 0
        self.preprocess = None
        self.profiler_args = {}

    def initialize(self, context):
        """
        Initialize model. This will be called during model loading time
        :param context: Initial context contains model server system properties.
        :return:
        """
        if context is not None and hasattr(context, "model_yaml_config"):
            self.model_yaml_config = context.model_yaml_config

        properties = context.system_properties
        print(properties)
        if torch.cuda.is_available() and properties.get("gpu_id") is not None:
            self.map_location = "cuda"
            self.device = torch.device(
                self.map_location + ":" + str(properties.get("gpu_id"))
            )
        elif XLA_AVAILABLE:
            self.device = xm.xla_device()
        else:
            self.map_location = "cpu"
            self.device = torch.device(self.map_location)

        self.manifest = context.manifest
        model_dir = properties.get("model_dir")
        print("Manifest: ", self.manifest)
        print("Model Directory: ", model_dir)

        # loading the model
        model_wts_name = "vit_l_16.pt"
        model_weights_path = os.path.join(model_dir, model_wts_name)

        self.model = torchvision.models.vit_l_16()
        self.model.load_state_dict(
            torch.load(model_weights_path, map_location=self.map_location)
        )
        self.model.to(self.device)
        self.model.eval()
        print("Model Loaded Successfully....")

        self.preprocess = torchvision.models.ViT_L_16_Weights.IMAGENET1K_V1.transforms()
        self.preprocess.antialias = True

    def data_preprocess(self, data):
        """
        Transform raw input into model input data.
        :param data: list of raw requests, should match the batch size
        :return: batched tensor of preprocessed model inputs
        """

        images = []
        for row in data:
            # Compat layer: normally the envelope should just return the data
            # directly, but older versions of Torchserve didn't have envelope.
            image = row.get("data") or row.get("body")
            print("Input type: ", type(image))
            if isinstance(image, str):
                # the image arrived as a base64-encoded string; decode it to bytes
                image = base64.b64decode(image)

            # if the image was sent as raw bytes / a bytearray, parse it with PIL
            if isinstance(image, (bytearray, bytes)):
                image = Image.open(io.BytesIO(image))
            else:
                # otherwise assume a (nested) list of floats and build a tensor
                image = torch.FloatTensor(image)

            # PIL images expose .size, tensors expose .shape
            print(image.size if isinstance(image, Image.Image) else image.shape)
            # model-specific pre-processing
            image = self.preprocess(image)

            images.append(image)

        return torch.stack(images).to(self.device)

    def inference(self, model_input):
        """
        Internal inference methods
        :param model_input: transformed model input data
        :return: list of inference output in NDArray
        """
        # Run the forward pass on the preprocessed batch
        with torch.no_grad():
            model_output = self.model(model_input)
        return model_output

    def postprocess(self, inference_output):
        """
        Return inference result.
        :param inference_output: list of inference output
        :return: list of predict results
        """
        # Convert logits to probabilities and return one predicted class index
        # per request (TorchServe expects a list matching the input batch size)
        ps = F.softmax(inference_output, dim=1)
        return torch.argmax(ps, dim=1).tolist()

    def handle(self, data, context):
        """
        Invoked by TorchServe for a prediction request.
        Does pre-processing of the data, prediction using the model, and post-processing of the prediction output.
        :param data: Input data for prediction
        :param context: Initial context contains model server system properties.
        :return: prediction output
        """
        print(type(data))
        print("data: ", data)
        print("Context: ", context)
        model_input = self.data_preprocess(data)
        model_output = self.inference(model_input)
        return self.postprocess(model_output)

    def get_insights(self, tensor_data, _, target=0):
        # Note: this assumes self.ig (e.g. captum's IntegratedGradients) has been
        # set up in initialize(); it is not created anywhere in this handler as written
        print("input shape", tensor_data.shape)
        return self.ig.attribute(tensor_data, target=target, n_steps=15).tolist()
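
For reference, a minimal client sketch for exercising this handler (assumptions: TorchServe's default inference port 8080, the model name vit_l_16 seen in the worker logs above, and a valid 0.png on disk; the raw bytes sent in the request body arrive in the handler as row.get("data") or row.get("body")):

import requests

# Send the raw image bytes in the POST body to the TorchServe inference API
with open("0.png", "rb") as f:
    resp = requests.post("http://localhost:8080/predictions/vit_l_16", data=f.read())
print(resp.status_code, resp.text)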

Can you please help me here? Thanks in advance.

mreso commented 9 months ago

Hi, can you please check your 0.png file? Did you by any chance curl/wget it from GitHub? The JSON you see in your handler looks pretty much identical to what you get when you run curl -o 0.png https://github.com/pytorch/serve/blob/master/examples/image_classifier/mnist/test_data/0.png
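
A quick way to confirm this is to inspect the file's leading bytes; a real PNG always starts with the fixed 8-byte PNG signature, while a downloaded GitHub page starts with HTML/JSON text:

# Check the first bytes of the downloaded file: a real PNG starts with
# b"\x89PNG\r\n\x1a\n"; a GitHub page starts with text like b"<!DOCTYPE"
with open("0.png", "rb") as f:
    header = f.read(8)
print(header)
print("looks like a PNG:", header == b"\x89PNG\r\n\x1a\n")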


You need to use the raw file link (e.g. the ?raw=true link, or the corresponding raw.githubusercontent.com URL) when downloading files from GitHub through curl/wget:
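
A sketch of the difference using the two URLs for this file (the blob page URL from above versus the corresponding raw URL; only the latter returns actual PNG bytes):

import urllib.request

blob_url = "https://github.com/pytorch/serve/blob/master/examples/image_classifier/mnist/test_data/0.png"
raw_url = "https://raw.githubusercontent.com/pytorch/serve/master/examples/image_classifier/mnist/test_data/0.png"

# Compare the leading bytes: the blob URL returns GitHub's HTML page,
# the raw URL returns the PNG itself (signature b"\x89PNG\r\n\x1a\n")
for url in (blob_url, raw_url):
    head = urllib.request.urlopen(url).read(8)
    print(url.split("/")[2], head)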

yogendra-yatnalkar commented 9 months ago

Hi @mreso, you are right. Thanks for pointing out my mistake. I was load-testing TorchServe with an image I had just curled from GitHub. I have now created a blank image directly with NumPy, and it works. Thanks for the support.
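
A minimal sketch of that workaround (the 224x224 size is an arbitrary choice here; the handler's ViT transforms resize and normalize the input anyway):

import numpy as np
from PIL import Image

# Create a blank RGB image and save it as a valid PNG for load testing
blank = np.zeros((224, 224, 3), dtype=np.uint8)
Image.fromarray(blank).save("blank.png")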