Errors when using yolov5 translate.

CensorKo commented 3 years ago

Description

First, I export my model from yolov5.

export command: python export.py --weights ./runs/train/exp26/weights/best.pt --img 640 --batch 1 --include torchscript

And then my DJL code is:

        Translator<Image, DetectedObjects> translator = YoloV5Translator.builder()
                .optSynsetArtifactName("coco.names")
                .build();
        Criteria<Image, DetectedObjects> criteria = Criteria.builder()
                .setTypes(Image.class, DetectedObjects.class)
                .optDevice(Device.cpu())
                .optModelUrls(Main.class.getResource("/yolov5s").getPath())
                .optModelName("best.torchscript.pt")
                .optTranslator(translator)
                .optEngine("PyTorch")
                .build();

When I execute:

        ZooModel<Image, DetectedObjects> model = ModelZoo.loadModel(criteria);

I get errors.

Error Message

[W TensorImpl.h:1156] Warning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (function operator())
ai.djl.translate.TranslateException: ai.djl.engine.EngineException: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/models/yolo.py", line 46, in forward
    _35 = (_4).forward(_34, )
    _36 = (_2).forward((_3).forward(_35, ), _29, )
    _37 = (_0).forward(_33, _35, (_1).forward(_36, ), )
           ~~~~~~~~~~~ <--- HERE
    _38, _39, _40, _41, = _37
    return (_41, [_38, _39, _40])
  File "code/__torch__/models/yolo.py", line 75, in forward
    y = torch.sigmoid(_50)
    _51 = torch.mul(torch.slice(y, 4, 0, 2), CONSTANTS.c0)
    _52 = torch.add(torch.sub(_51, CONSTANTS.c1), CONSTANTS.c2)
          ~~~~~~~~~ <--- HERE
    xy = torch.mul(_52, torch.select(CONSTANTS.c3, 0, 0))
    _53 = torch.mul(torch.slice(y, 4, 2, 4), CONSTANTS.c4)

Traceback of TorchScript, original code (most recent call last):
/data1/yolov5_aws/yolov5/models/yolo.py(66): forward
/root/anaconda3/envs/yolov5/lib/python3.7/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/root/anaconda3/envs/yolov5/lib/python3.7/site-packages/torch/nn/modules/module.py(1051): _call_impl
/data1/yolov5_aws/yolov5/models/yolo.py(155): forward_once
/data1/yolov5_aws/yolov5/models/yolo.py(123): forward
/root/anaconda3/envs/yolov5/lib/python3.7/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/root/anaconda3/envs/yolov5/lib/python3.7/site-packages/torch/nn/modules/module.py(1051): _call_impl
/root/anaconda3/envs/yolov5/lib/python3.7/site-packages/torch/jit/_trace.py(959): trace_module
/root/anaconda3/envs/yolov5/lib/python3.7/site-packages/torch/jit/_trace.py(744): trace
export.py(35): export_torchscript
export.py(154): run
export.py(187): main
export.py(192): <module>
RuntimeError: The size of tensor a (180) must match the size of tensor b (80) at non-singleton dimension 3

 at ai.djl.inference.Predictor.batchPredict(Predictor.java:170)
 at ai.djl.inference.Predictor.predict(Predictor.java:118)
 at xyz.hyhy.Main.main(Main.java:46)
Caused by: ai.djl.engine.EngineException: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/models/yolo.py", line 46, in forward
    _35 = (_4).forward(_34, )
    _36 = (_2).forward((_3).forward(_35, ), _29, )
    _37 = (_0).forward(_33, _35, (_1).forward(_36, ), )
           ~~~~~~~~~~~ <--- HERE
    _38, _39, _40, _41, = _37
    return (_41, [_38, _39, _40])
  File "code/__torch__/models/yolo.py", line 75, in forward
    y = torch.sigmoid(_50)
    _51 = torch.mul(torch.slice(y, 4, 0, 2), CONSTANTS.c0)
    _52 = torch.add(torch.sub(_51, CONSTANTS.c1), CONSTANTS.c2)
          ~~~~~~~~~ <--- HERE
    xy = torch.mul(_52, torch.select(CONSTANTS.c3, 0, 0))
    _53 = torch.mul(torch.slice(y, 4, 2, 4), CONSTANTS.c4)

Traceback of TorchScript, original code (most recent call last):
/data1/yolov5_aws/yolov5/models/yolo.py(66): forward
/root/anaconda3/envs/yolov5/lib/python3.7/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/root/anaconda3/envs/yolov5/lib/python3.7/site-packages/torch/nn/modules/module.py(1051): _call_impl
/data1/yolov5_aws/yolov5/models/yolo.py(155): forward_once
/data1/yolov5_aws/yolov5/models/yolo.py(123): forward
/root/anaconda3/envs/yolov5/lib/python3.7/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/root/anaconda3/envs/yolov5/lib/python3.7/site-packages/torch/nn/modules/module.py(1051): _call_impl
/root/anaconda3/envs/yolov5/lib/python3.7/site-packages/torch/jit/_trace.py(959): trace_module
/root/anaconda3/envs/yolov5/lib/python3.7/site-packages/torch/jit/_trace.py(744): trace
export.py(35): export_torchscript
export.py(154): run
export.py(187): main
export.py(192): <module>
RuntimeError: The size of tensor a (180) must match the size of tensor b (80) at non-singleton dimension 3

 at ai.djl.pytorch.jni.PyTorchLibrary.moduleForward(Native Method)
 at ai.djl.pytorch.jni.IValueUtils.forward(IValueUtils.java:46)
 at ai.djl.pytorch.engine.PtSymbolBlock.forwardInternal(PtSymbolBlock.java:126)
 at ai.djl.nn.AbstractBlock.forward(AbstractBlock.java:126)
 at ai.djl.nn.Block.forward(Block.java:122)
 at ai.djl.inference.Predictor.predict(Predictor.java:123)
 at ai.djl.inference.Predictor.batchPredict(Predictor.java:163)
 ... 2 more
Disconnected from the target VM, address: '127.0.0.1:56512', transport: 'socket'

frankfliu commented 3 years ago

Can you try running your TorchScript model from Python? What are the expected input shapes?

chengpengvb commented 2 years ago

Build the translator like this:

        Pipeline pipeline = new Pipeline();
        pipeline.add(new Resize(640, 640));
        pipeline.add(new ToTensor());

        Translator<Image, DetectedObjects> translator = YoloV5Translator.builder().setPipeline(pipeline)
                .optSynsetArtifactName("coco.names").optThreshold(0.5f).build();

The size passed to new Resize(640, 640) must be the same as the size used at export (--img 640 in the export command above).
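
To see why the resize matters, it may help to look at the numbers in the error ("tensor a (180) must match the size of tensor b (80)"). The sketch below is plain Java arithmetic, not DJL or YOLOv5 code. It assumes the default YOLOv5 detection strides of 8, 16 and 32, and it uses 1440 as a hypothetical un-resized input side that happens to produce 180. The class name GridCheck and its methods are made up for illustration. A traced model exported with --img 640 has its 80x80 grid baked in as a constant, so a larger input produces a grid that no longer lines up with it.

```java
import java.util.Arrays;

/** Illustrative arithmetic only; not part of DJL or YOLOv5. */
public class GridCheck {

    // Strides of the three YOLOv5 detection heads (assumed default layout).
    static final int[] STRIDES = {8, 16, 32};

    /** Feature-map side length produced by a head with the given stride. */
    static int gridSize(int inputSide, int stride) {
        return inputSide / stride;
    }

    /** Grid sizes for all heads at the given input side length. */
    static int[] gridSizes(int inputSide) {
        return Arrays.stream(STRIDES).map(s -> gridSize(inputSide, s)).toArray();
    }

    /** True if the runtime input yields the same grids that were fixed at export time. */
    static boolean matchesExport(int exportSide, int runtimeSide) {
        return Arrays.equals(gridSizes(exportSide), gridSizes(runtimeSide));
    }

    public static void main(String[] args) {
        int exportSide = 640;    // from --img 640 in the export command
        int runtimeSide = 1440;  // hypothetical width of an image that was not resized
        System.out.println("export grids:  " + Arrays.toString(gridSizes(exportSide)));
        System.out.println("runtime grids: " + Arrays.toString(gridSizes(runtimeSide)));
        System.out.println("compatible:    " + matchesExport(exportSide, runtimeSide));
    }
}
```

With these assumed values the stride-8 head gives 80 at export time and 180 at run time, which matches the two sizes in the RuntimeError. Resizing every input to 640x640 before ToTensor keeps the runtime grids equal to the exported ones.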

deepjavalibrary / djl

Errors when using yolov5 translate. #1211
