Deci-AI / super-gradients

Easily train or fine-tune SOTA computer vision models with one open source training library. The home of Yolo-NAS.
https://www.supergradients.com
Apache License 2.0

Python TFLite or ONNX inference script #911

Closed ajithvcoder closed 1 year ago

ajithvcoder commented 1 year ago

Is your feature request related to a problem?

No. I am able to convert .pth models to TFLite; I need a starter script that can be used for TFLite inference.

Describe the solution you'd like

A standalone Python inference script.

Describe alternatives you've considered

An ONNX inference script would also be helpful.
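For reference, a minimal standalone sketch of what such starter scripts could look like. This is not an official super-gradients script: the file names `model.tflite` / `model.onnx` are placeholders for your own exported model, it assumes `tensorflow` / `onnxruntime` are installed, and the runtime calls are commented out because they need a real model file:

```python
import numpy as np

def to_input(image_hwc_uint8):
    """Normalize an HxWx3 uint8 image into a 1x3xHxW float32 batch (NCHW)."""
    x = image_hwc_uint8.astype(np.float32) / 255.0
    return np.transpose(x, (2, 0, 1))[None, ...]  # HWC -> CHW, add batch dim

# --- TFLite inference (requires `pip install tensorflow` and a converted model) ---
# import tensorflow as tf
# interpreter = tf.lite.Interpreter(model_path="model.tflite")  # placeholder path
# interpreter.allocate_tensors()
# inp = interpreter.get_input_details()[0]
# outs = interpreter.get_output_details()
# # NOTE: a TFLite export may expect NHWC rather than NCHW -- check inp["shape"]
# interpreter.set_tensor(inp["index"], to_input(img))  # img: 640x640x3 uint8
# interpreter.invoke()
# raw = [interpreter.get_tensor(o["index"]) for o in outs]

# --- ONNX Runtime inference (requires `pip install onnxruntime`) ---
# import onnxruntime as ort
# session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
# input_name = session.get_inputs()[0].name
# raw = session.run(None, {input_name: to_input(img)})
```

The raw outputs still need the model-specific postprocessing (confidence filtering and NMS) discussed later in this thread.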

avideci commented 1 year ago

Hi @ajithvallabai, Infery (Deci's runtime inference engine for Python) can perform inference in a few lines of code, in any framework.

The official examples are at https://github.com/Deci-AI/infery-examples.

You can find the TFLite example here:

- Basic Examples (Copy-Paste scripts)
- Advanced Examples
- Application Examples
- Custom Hardware Examples


ajithvcoder commented 1 year ago

Hi @avideci, thanks for the suggestion. I was able to get predictions using Infery. Could you give me a reference for the postprocessing step of the script below? I know I can use model.predict() and get the output bboxes etc., but I want to go this route because I need to write postprocessing for the TFLite/ONNX model. With the method below I get raw_predictions[0][0] and raw_predictions[0][1] of shapes (1, 8400, 4) and (1, 8400, 80). How do I postprocess these? Should I use YoloPostPredictionCallback(), and if so, how? For the case below I get an error.

```python
import torch
import torchvision.transforms as transforms
from PIL import Image

from super_gradients.training import models
from super_gradients.common.object_names import Models
from super_gradients.training.models.detection_models.yolo_base import YoloPostPredictionCallback

# Load the input image
image = Image.open("./img.jpg")

preprocess = transforms.Compose([
    transforms.Resize([640, 640]),
    transforms.PILToTensor()
])

# Run preprocessing on the image; unsqueeze for [Batch x Channels x Height x Width] format
transformed_image = preprocess(image).float().unsqueeze(0)

# Get the pretrained model from the super-gradients repository
model = models.get(Models.YOLO_NAS_S, pretrained_weights="coco", num_classes=80)
model.eval()

# Predict using the SG model
with torch.no_grad():
    raw_predictions = model(transformed_image)

print(raw_predictions[0][0].shape)  # (1, 8400, 4)
print(raw_predictions[0][1].shape)  # (1, 8400, 80)

predictions = YoloPostPredictionCallback(conf=0.1, iou=0.4)(raw_predictions)[0].numpy()
```

error:

```
File "~/yolo_pretrained_model_predict.py", line 40, in <module>
    predictions = YoloPostPredictionCallback(conf=0.1, iou=0.4)(raw_predictions)[0].numpy()
  File "/home/mcw/anaconda3/envs/yolo_nas/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "~/site-packages/super_gradients/training/models/detection_models/yolo_base.py", line 96, in forward
    nms_result = non_max_suppression(x[0], conf_thres=self.conf, iou_thres=self.iou, with_confidence=self.with_confidence)
  File "~/site-packages/super_gradients/training/utils/detection_utils.py", line 249, in non_max_suppression
    candidates_above_thres = prediction[..., 4] > conf_thres  # filter by confidence
TypeError: tuple indices must be integers or slices, not tuple
```
NatanBagrov commented 1 year ago

Hi, @ajithvallabai ,

I'll first elaborate on the shapes you get, then discuss how to run the end-to-end procedure. Given two tensors of shapes (1, 8400, 4) and (1, 8400, 80), these correspond to the box coordinates (4) and the class probabilities (80 classes). The number 8400 is the total number of cells across 3 grid resolutions (strides 8, 16, and 32 on a 640x640 input), where each cell gives one prediction: (640/8)^2 + (640/16)^2 + (640/32)^2 = 6400 + 1600 + 400 = 8400. Hope that makes sense.
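For anyone who needs to do this postprocessing by hand (e.g. against a TFLite/ONNX export, where model.predict() is unavailable), the usual recipe over these two tensors is a confidence filter followed by NMS. Below is a minimal, class-agnostic NumPy sketch; it assumes the (1, 8400, 4) boxes are already decoded to xyxy coordinates in input-image pixels, and it is an illustrative reimplementation, not the library's own code:

```python
import numpy as np

def postprocess(raw_boxes, raw_scores, conf_thres=0.25, iou_thres=0.45):
    """Turn (1, N, 4) xyxy boxes and (1, N, C) class scores into detections.

    Returns (boxes, confidences, class_ids) after confidence filtering and
    a simple class-agnostic greedy NMS.
    """
    boxes, scores = raw_boxes[0], raw_scores[0]   # drop the batch dimension
    class_ids = scores.argmax(axis=1)             # best class per anchor
    confidences = scores.max(axis=1)              # its score

    keep = confidences > conf_thres               # confidence filter
    boxes, confidences, class_ids = boxes[keep], confidences[keep], class_ids[keep]

    order = confidences.argsort()[::-1]           # highest confidence first
    selected = []
    while order.size:
        i = order[0]
        selected.append(i)
        if order.size == 1:
            break
        rest = order[1:]
        # IoU of the selected box against all remaining candidates
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_rest = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_rest - inter + 1e-9)
        order = rest[iou <= iou_thres]            # drop heavy overlaps
    return boxes[selected], confidences[selected], class_ids[selected]

# Sanity check on the anchor count: three grids at strides 8/16/32 on a 640 input
assert (640 // 8) ** 2 + (640 // 16) ** 2 + (640 // 32) ** 2 == 8400
```

A production version would also do per-class NMS and rescale boxes back to the original image size, but the structure is the same.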

The suggested way to do end-to-end predictions is by using: model.predict(...) method. Please read more in the quickstart.

Let me know if that helps!

ajithvcoder commented 1 year ago

Thanks for the explanation.

mazatov commented 1 year ago

@ajithvallabai , @NatanBagrov , I was following the colab notebook provided and ran into the same error in the last cell.

https://colab.research.google.com/drive/1y3WizmNlta6J6BWm46ZfFw1fqpFBSyYM?usp=sharing#scrollTo=Bgn2r7ycaIML

Is YoloPostPredictionCallback not to be used anymore?