NVIDIA-AI-IOT / trt_pose

Real-time pose estimation accelerated with NVIDIA TensorRT
MIT License

live_demo.ipynb - 12 hours (and counting) to optimise model on Jetson Nano - is this normal? #72

Open · jsynnott opened this issue 3 years ago

jsynnott commented 3 years ago

Hi everyone, I have recently set up trt_pose on a fresh Jetson Nano, with all requirements installed, including PyTorch v1.6 and torchvision v0.7.0 on JetPack 4.4.

I am trying to run the live demo in a Jupyter notebook. The following line has been running for 12 hours so far:

model_trt = torch2trt.torch2trt(model, [data], fp16_mode=True, max_workspace_size=1<<25)

Is this normal? Surely it can't be?
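
For reference, a quick environment check can rule out a CPU-only PyTorch build or a missing CUDA device, either of which could make the conversion appear to hang. A minimal sketch; the actual cause in this case is not confirmed:

import torch
import tensorrt as trt

# Sanity checks before converting: torch2trt needs a working CUDA device.
print(torch.__version__)              # expect 1.6.0 on JetPack 4.4
print(torch.cuda.is_available())      # must be True
print(torch.cuda.get_device_name(0))  # the Nano reports a Tegra X1 GPU
print(trt.__version__)                # JetPack 4.4 ships TensorRT 7.1.x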

jsynnott commented 3 years ago

Just an update. I killed the process as it didn't appear to be doing anything. I have managed to complete the model optimisation by running:

with torch.cuda.device(0):
    model_trt = torch2trt.torch2trt(model, [data], fp16_mode=True, max_workspace_size=1<<25)
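
To avoid repeating the slow conversion on every run, the converted model can be serialized and reloaded, as live_demo.ipynb does. A minimal sketch, assuming the checkpoint filename used in the notebook:

import torch
from torch2trt import TRTModule

# Save the converted engine once after conversion succeeds...
torch.save(model_trt.state_dict(), 'resnet18_baseline_att_224x224_A_epoch_249_trt.pth')

# ...and on later runs, load it directly instead of reconverting.
model_trt = TRTModule()
model_trt.load_state_dict(torch.load('resnet18_baseline_att_224x224_A_epoch_249_trt.pth'))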
SijinJohn commented 3 years ago

Hi, I have this problem too. It is taking too long and the system is stuck. I also got a low-memory warning on the Jetson Nano. What should I do?

tucachmo2202 commented 3 years ago

Hi, I think you should export the PyTorch model to ONNX first, then convert the ONNX model to a TensorRT engine.
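
A minimal sketch of that route, assuming the 224x224 trt_pose model and hypothetical file names (the output names cmap/paf mirror the model's two outputs, the confidence maps and part affinity fields):

import torch

# Export the PyTorch model to ONNX with a dummy input of the right shape.
data = torch.zeros((1, 3, 224, 224)).cuda()
torch.onnx.export(model, data, 'trt_pose.onnx',
                  input_names=['input'], output_names=['cmap', 'paf'],
                  opset_version=11)

# Then build an FP16 engine from the ONNX file with trtexec (shipped with JetPack):
#   /usr/src/tensorrt/bin/trtexec --onnx=trt_pose.onnx --saveEngine=trt_pose.engine --fp16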