Hello author! Error while training a yolact_edge+ model from scratch with config "yolact_edge_plus_config", which uses the backbone "resnet101_dcn_inter3_backbone".

WisconsinAIVision / yolact_edge

The first competitive instance segmentation approach that runs on small edge devices at real-time speeds.

MIT License


Issue #206, open. Opened by LamThanhNguyen 2 years ago.

LamThanhNguyen commented 2 years ago

Describe the bug

RuntimeError: The expanded size of tensor (57744) must match the existing size (19248) at non-singleton dimension 1. Target sizes: [8, 57744, 4]. Tensor sizes: [8, 19248, 1].
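To make the error message easier to read, here is a minimal pure-Python sketch of the shape rule behind it: when a tensor is expanded to a target shape, each existing dimension must either have size 1 or already equal the target size. The helper name `first_expand_mismatch` is hypothetical and is not part of PyTorch or yolact_edge; it only mimics the check on plain tuples.

```python
def first_expand_mismatch(existing, target):
    """Return (dim, target_size, existing_size) for a dimension that cannot
    be expanded, or None if the expansion is allowed.

    Hypothetical helper: shapes are aligned from the last dimension, and an
    existing dimension is accepted only if it is 1 or equals the target size
    (a target of -1 means "keep the existing size").
    """
    if len(target) < len(existing):
        raise ValueError("target must have at least as many dimensions")
    offset = len(target) - len(existing)
    # Walk from the last dimension towards the first.
    for i in reversed(range(len(existing))):
        dim = i + offset
        want, have = target[dim], existing[i]
        if want == -1 or have == want or have == 1:
            continue
        return dim, want, have
    return None


# Shapes taken from the error message in this issue.
existing = (8, 19248, 1)
target = (8, 57744, 4)

print(first_expand_mismatch(existing, target))  # (1, 57744, 19248)
print(target[1] // existing[1])                 # 3
```

The last dimension (1 to 4) is a legal expansion; only dimension 1 fails. The target is exactly 3 times the existing size, which would be consistent with two parts of the model disagreeing on the number of anchors per location, although the thread itself does not state the cause.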

To Reproduce

CUDA_VISIBLE_DEVICES=1,2 python train.py --batch_size=8 --validation_epoch=1 --num_gpus=2 --learning_rate=0.001 --momentum=0.9 --decay=0.0005 --log_folder="logs_model_tomo_0/logs/" --save_folder="logs_model_tomo_0/weights/"

Expected behavior

Could you share the pretrained weights file for the ResNet101_DCN_Interval3 backbone?

Full logs

Environment:

OS: [e.g. Ubuntu 18.04]
GPU: [e.g. RTX 2080 Ti *2]
CUDA Version [e.g. 10.2]

andreazuna89 commented 1 year ago

Hi, I got the same error while running training for yolact_edge_plus:

The expanded size of the tensor (57744) must match the existing size (19248) at non-singleton dimension 1. Target sizes: [8, 57744, 4]. Tensor sizes: [8, 19248, 1]

Did you solve the issue?

Thanks

LamThanhNguyen commented 1 year ago

Hi, I have solved this problem and opened a pull request with the fix. You can refer to this link: https://github.com/haotian-liu/yolact_edge/pull/216

andreazuna89 commented 1 year ago

Hi, thanks for the fix. Training for yolact_edge_plus now works. After training, I tried to evaluate the trained model with the eval script, but I got another error during TensorRT conversion:

ERROR: failed to load libamirstan_plugin.so Did you forget to do a "make" in the "amirstan_plugin/" subdirectory?

This seems strange, because I installed the library correctly and exported its path, and training also completed without problems. Can you help me here?

Thanks, Andrea

LamThanhNguyen commented 1 year ago

https://github.com/haotian-liu/yolact_edge/blob/master/INSTALL.md

Read item 5 of INSTALL.md carefully and make sure that amirstan_plugin was built and installed successfully. Alternatively, you can run inference with the yolact_edge_plus model with dynamic TensorRT enabled but deformable convolution disabled.
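As a quick way to check whether the plugin is actually visible after the build, the following sketch looks for `libamirstan_plugin.so` in the directories listed in `LD_LIBRARY_PATH` and then tries to load it with `ctypes`. The helper names `find_shared_library` and `try_load` are hypothetical, and this is an assumption about how one might diagnose the problem; yolact_edge itself may locate the library in a different way.

```python
import ctypes
import os


def find_shared_library(name, search_path):
    """Return every existing file called `name` in the directories of
    `search_path` (a string separated by os.pathsep, like LD_LIBRARY_PATH)."""
    hits = []
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing ':'
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            hits.append(candidate)
    return hits


def try_load(path_or_name):
    """Try to load a shared library; return (True, "") on success or
    (False, error message) if the loader rejects it."""
    try:
        ctypes.CDLL(path_or_name)
        return True, ""
    except OSError as exc:
        return False, str(exc)


if __name__ == "__main__":
    name = "libamirstan_plugin.so"
    found = find_shared_library(name, os.environ.get("LD_LIBRARY_PATH", ""))
    print("found on LD_LIBRARY_PATH:", found)
    ok, error = try_load(name)
    print("loadable:", ok, error)
```

If the file is found but still not loadable, the printed loader message usually names the missing dependency, since a shared library can exist on disk and still fail to load when one of the libraries it links against cannot be resolved.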
