How to use yolov7 of PyTorch framework in DJL

WwContinue commented 1 year ago

I saved the model as a pt file with torch.jit.trace and loaded it in DJL with optModelName. Calling DetectedObjects detection = predictor.predict(img) then fails with ai.djl.engine.EngineException: The following operation failed in the TorchScript interpreter.

frankfliu commented 1 year ago

@WwContinue Can you share the full stack trace?

WwContinue commented 1 year ago

Thanks. I'm not sure what the full stack trace is. Here is the code I used to save the pt file:

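        import torch

        Net.eval()  # switch to inference mode before tracing
        # Tracing records the operations run on this example input (images)
        traced_script_module = torch.jit.trace(Net, images, strict=False)
        traced_script_module.save('Test.pt')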

The saved pt file loads and runs normally in PyTorch, but the following error occurs in DJL. I'm not sure whether it is caused by the input or by the pt file itself:

Exception in thread "main" ai.djl.translate.TranslateException: ai.djl.engine.EngineException: The following operation failed in the TorchScript interpreter.
    Traceback of TorchScript, serialized code (most recent call last):
    File "code/__torch__/nets/yolo.py", line 61, in forward
    stem = backbone3.stem
    _0 = (dark2).forward(act0, (stem).forward(act0, x, ), )
    _1 = (dark3).forward(act0, _0, )
    ~~~~~~~~~~~~~~ <--- HERE
    _2 = (dark4).forward(act0, _1, )
    _3 = (sppcspc).forward(act0, (dark5).forward(act0, _2, ), )
    File "code/__torch__/torch/nn/modules/container/___torch_mangle_63.py", line 13, in forward
    _1 = getattr(self, "1")
    _0 = getattr(self, "0")
    _2 = (_0).forward(argument_1, argument_2, )
    ~~~~~~~~~~~ <--- HERE
    return (_1).forward(argument_1, _2, )
    File "code/__torch__/nets/backbone.py", line 77, in forward
    _6 = (cv2).forward(argument_1, argument_2, )
    _7 = [(cv3).forward(argument_1, _6, ), _5]
    return torch.cat(_7, 1)
    ~~~~~~~~~ <--- HERE
class MP(Module):
    __parameters__ = []

Traceback of TorchScript, original code (most recent call last):
        ......

        RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 38 but got size 37 for tensor number 1 in the list.

frankfliu commented 1 year ago

It looks like your input shape is different from what the model expects.
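
The RuntimeError, together with the torch.cat and class MP lines in the trace, is consistent with two parallel downsampling branches producing different spatial sizes. Here is a minimal sketch of that arithmetic. It assumes one branch is a 2x2 max pool with stride 2 and the other a 3x3 convolution with stride 2 and padding 1; those values are assumptions, not read from this model, and out_size and branch_sizes are hypothetical helper names.

```python
def out_size(n, kernel, stride, padding=0):
    """Standard output-size formula for a convolution or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1


def branch_sizes(n):
    """Sizes produced by the two assumed branches for an input of size n."""
    pooled = out_size(n, kernel=2, stride=2)            # assumed max-pool branch
    conv = out_size(n, kernel=3, stride=2, padding=1)   # assumed strided-conv branch
    return pooled, conv


for n in (75, 76):
    print(n, branch_sizes(n))
# prints:
# 75 (37, 38)
# 76 (38, 38)
```

Under these assumptions, an odd feature-map size gives 37 on one branch and 38 on the other, which cannot be concatenated along dimension 1, while an even size gives matching outputs.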

You need to make sure your image processing code generates a tensor whose shape matches what the model expects.
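
One way to check the preprocessing is to work out the target size before resizing. The sketch below scales an image to fit a square target and then pads each side up to a multiple of the network stride. The target of 640, the stride of 32, and the helper name letterbox_plan are assumptions for illustration. A traced model may only accept the exact shape of the images tensor used with torch.jit.trace, in which case that shape should be used instead.

```python
import math


def letterbox_plan(width, height, target=640, stride=32):
    """Scale to fit inside target x target, then pad up to a stride multiple.

    target and stride are assumed values, not taken from the model above.
    """
    scale = min(target / width, target / height)
    new_w, new_h = round(width * scale), round(height * scale)
    # Round each side up so every downsampling stage sees an even size
    padded_w = math.ceil(new_w / stride) * stride
    padded_h = math.ceil(new_h / stride) * stride
    return {"resized": (new_w, new_h), "padded": (padded_w, padded_h)}


plan = letterbox_plan(1280, 720)
print(plan)
# prints: {'resized': (640, 360), 'padded': (640, 384)}
```

The same arithmetic would apply to whatever resize step the DJL pipeline uses: the tensor passed to predictor.predict should end up with the height and width the model was traced with.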

WwContinue commented 1 year ago

Thank you for your answer! I will try to solve the problem along these lines.

deepjavalibrary / djl

How to use yolov7 of PyTorch framework in DJL #2270