deepjavalibrary / djl

An Engine-Agnostic Deep Learning Framework in Java
https://djl.ai
Apache License 2.0

If I modify the network structure of the target detection model, can I load my own model? #1677

Open Xhran opened 2 years ago

Xhran commented 2 years ago

I modified a target detection network and trained it to get PyTorch weights. Can I load this model for inference using DJL? How should I do it? Thank you!

zachgk commented 2 years ago

Yeah. You can follow the instructions to export your model as TorchScript and then load it following this example notebook. As part of this, you may need to implement a Translator class with the pre-processing and post-processing that your model uses, if they aren't exactly the same as in a Translator that we already have implemented.
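
For illustration, a custom Translator might look roughly like the sketch below. It is not taken from the DJL examples: the class name `RawDetectionTranslator`, the resize-only pre-processing, and the choice to return the first output tensor as a flat `float[]` are all assumptions. A real detection model would need its own box decoding and non-maximum suppression in `processOutput`.

```java
import ai.djl.modality.cv.Image;
import ai.djl.modality.cv.util.NDImageUtils;
import ai.djl.ndarray.NDArray;
import ai.djl.ndarray.NDList;
import ai.djl.translate.Batchifier;
import ai.djl.translate.Translator;
import ai.djl.translate.TranslatorContext;

/** Hypothetical translator: resizes the image and returns the raw first output tensor. */
public class RawDetectionTranslator implements Translator<Image, float[]> {

    private final int width;
    private final int height;

    public RawDetectionTranslator(int width, int height) {
        this.width = width;
        this.height = height;
    }

    @Override
    public NDList processInput(TranslatorContext ctx, Image input) {
        // Decode to an HWC uint8 array, resize to the size the model was exported with
        NDArray array = input.toNDArray(ctx.getNDManager(), Image.Flag.COLOR);
        array = NDImageUtils.resize(array, width, height);
        // Convert HWC uint8 to CHW float32 scaled to [0, 1]
        array = NDImageUtils.toTensor(array);
        return new NDList(array);
    }

    @Override
    public float[] processOutput(TranslatorContext ctx, NDList list) {
        // Assumption: the first output holds the raw predictions.
        // Decoding boxes, scores and classes is model-specific and omitted here.
        return list.get(0).toFloatArray();
    }

    @Override
    public Batchifier getBatchifier() {
        // Stack single inputs into a batch dimension before inference
        return Batchifier.STACK;
    }
}
```

Copying the output into a `float[]` keeps the result usable after the predictor releases its native memory.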

Xhran commented 2 years ago

When I load the exported TorchScript model file, I get the following error. What is the problem? Thank you!

ai.djl.translate.TranslateException: ai.djl.engine.EngineException: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/models/yolo.py", line 71, in forward
    _35 = (_20).forward(_34, )
    _36 = (_22).forward((_21).forward(_35, ), _29, )
    _37 = (_24).forward(_33, _35, (_23).forward(_36, ), )
    _38, _39, _40, _41, = _37
    return (_41, [_38, _39, _40])
  File "code/__torch__/models/yolo.py", line 102, in forward
    y = torch.sigmoid(_22)
    _23 = torch.mul(torch.slice(y, 4, 0, 2), CONSTANTS.c0)
    _24 = torch.add(torch.sub(_23, CONSTANTS.c1), CONSTANTS.c2)
          ~~~~~~~~~ <--- HERE
    xy = torch.mul(_24, torch.select(CONSTANTS.c3, 0, 0))
    _25 = torch.mul(torch.slice(y, 4, 2, 4), CONSTANTS.c4)

Traceback of TorchScript, original code (most recent call last):
/content/drive/My Drive/pytorch/yolov5-6.0/models/yolo.py(68): forward
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py(1090): _slow_forward
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py(1102): _call_impl
/content/drive/My Drive/pytorch/yolov5-6.0/models/yolo.py(149): _forward_once
/content/drive/My Drive/pytorch/yolov5-6.0/models/yolo.py(126): forward
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py(1090): _slow_forward
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py(1102): _call_impl
/usr/local/lib/python3.7/dist-packages/torch/jit/_trace.py(965): trace_module
/usr/local/lib/python3.7/dist-packages/torch/jit/_trace.py(750): trace
export.py(56): export_torchscript
export.py(304): run
/usr/local/lib/python3.7/dist-packages/torch/autograd/grad_mode.py(28): decorate_context
export.py(359): main
export.py(364): <module>
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

lanking520 commented 2 years ago

@Xhran your problem can be solved by adding this to the Criteria:

.optOption("mapLocation", "true")

Sample: https://docs.djl.ai/master/docs/demos/jupyter/load_pytorch_model.html#step-3-load-your-model
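
For context, the trace above shows the model was exported with `torch.jit.trace` on a machine where some tensors lived on `cuda:0`, so the serialized constants likely still point at the GPU. A minimal sketch of a Criteria with the suggested option is shown below. The class and method names (`MapLocationCriteria`, `build`) are hypothetical, and the Translator is assumed to be one you supply for your own model.

```java
import ai.djl.modality.cv.Image;
import ai.djl.modality.cv.output.DetectedObjects;
import ai.djl.repository.zoo.Criteria;
import ai.djl.translate.Translator;

import java.nio.file.Path;

/** Hypothetical helper that builds a Criteria for a locally exported TorchScript model. */
public final class MapLocationCriteria {

    private MapLocationCriteria() {}

    public static Criteria<Image, DetectedObjects> build(
            Path modelPath, Translator<Image, DetectedObjects> translator) {
        return Criteria.builder()
                .setTypes(Image.class, DetectedObjects.class)
                .optModelPath(modelPath) // path to the exported TorchScript file
                .optEngine("PyTorch")
                // Ask the PyTorch engine to remap the saved tensors onto the device
                // DJL loads the model on, instead of the device used at export time
                .optOption("mapLocation", "true")
                .optTranslator(translator)
                .build();
    }
}
```

Calling `loadModel()` on the returned Criteria and then `newPredictor()` on the model gives a predictor for inference. Building the Criteria itself does not touch the model file, so the device error would only surface (or disappear) once the model is loaded and run.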