WisconsinAIVision / yolact_edge

The first competitive instance segmentation approach that runs on small edge devices at real-time speeds.
MIT License

RuntimeError: t() expects a tensor with <= 2 dimensions, but self is 3D: getting this error for better mAP #52

Closed kashzade closed 3 years ago

kashzade commented 3 years ago

First of all thanks to all developers who build the best model.

I have trained YOLACT Edge on a single object class. When I run inference with a trained model of around 50 mAP, it produces predictions (very bad ones), but if the model's mAP is around 80 or higher, it throws the error below.

Traceback (most recent call last):
  File "eval.py", line 1246, in <module>
    evaluate(net, dataset)
  File "eval.py", line 894, in evaluate
    evalvideo(net, args.video)
  File "eval.py", line 777, in evalvideo
    frame_buffer.put(frame['value'].get())
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "eval.py", line 699, in prep_frame
    return prep_display(preds, frame, None, None, undo_transform=False, class_color=True)
  File "eval.py", line 167, in prep_display
    score_threshold = args.score_threshold)
  File "/home/ubuntu_pc//yolact_edge/layers/output_utils.py", line 103, in postprocess
    masks = proto_data @ masks.t()
RuntimeError: t() expects a tensor with <= 2 dimensions, but self is 3D

I tried reshaping the 3D tensor to 2D, but that caused multiple other errors.

If anyone has a solution then please share.
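For context, a minimal reproduction sketch (my own, not code from this thread): PyTorch's `Tensor.t()` is only defined for tensors with at most two dimensions, which is exactly what the traceback reports when the masks tensor comes out 3D. The shapes below are assumptions based on the shapes reported later in this thread.

```python
import torch

# Assumed shapes: proto_data is (H, W, 32); masks is (N, 32) in the 2-D case.
proto_data = torch.randn(138, 138, 32)
masks_2d = torch.randn(5, 32)
masks_3d = torch.randn(1, 1, 32)  # extra leading dim, as later seen with TensorRT

# 2-D masks: t() works, and the matmul broadcasts to (138, 138, 5).
print((proto_data @ masks_2d.t()).shape)  # torch.Size([138, 138, 5])

# 3-D masks: t() raises, matching the traceback in this issue.
try:
    masks_3d.t()
except RuntimeError as e:
    print(e)
```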

haotian-liu commented 3 years ago

Hi, can you share the command and the config you were using?

kashzade commented 3 years ago

Thanks for your support, @haotian-liu.

I made changes in the config.py file.

I added my_custom_dataset below the dataset_base = Config({...}) block:

my_custom_dataset = dataset_base.copy({
    'name': 'My Dataset',

     'train_images': '/home/ubuntu_pc/dataset/train',
     'train_info':   '/home/ubuntu_pc/dataset/train/laptop_coco.json',

     'valid_images': '/home/ubuntu_pc/dataset/val',
     'valid_info':   '/home/ubuntu_pc/dataset/val/laptop_coco.json',

    'has_gt': True,
    'class_names': ('laptop',)
})

I also changed the value of the 'dataset' variable, replacing coco2017_dataset with my_custom_dataset:

yolact_base_config = coco_base_config.copy({
    'name': 'yolact_base',

    # Dataset stuff
    #'dataset': coco2017_dataset,
    'dataset': my_custom_dataset,
    'num_classes': len(my_custom_dataset.class_names) + 1,  ## 1 + 1
    # ...
})
haotian-liu commented 3 years ago

The config seems fine. Were you using TensorRT, and are you using the latest code base?

kashzade commented 3 years ago

Yes, TensorRT (7.2.2.3) and the latest code.

haotian-liu commented 3 years ago

This is weird. Can you print out the shapes of the tensors in masks = proto_data @ masks.t(), with TensorRT both enabled and disabled?

kashzade commented 3 years ago

With TensorRT

Model mAP: 78. Image inference:

python3 eval.py --trained_model=weights/yolact_edge_7_544_interrupt.pth --score_threshold=0.3 --top_k=100 --image=/home/ubuntu_pc/dataset/val/frame_5.jpg

print(masks.shape)      # torch.Size([1, 1, 32])
print(masks.t().shape)  # RuntimeError: t() expects a tensor with <= 2 dimensions, but self is 3D

Model mAP: 45. Image inference:

python3 eval.py --trained_model=weights/yolact_edge_4_76_interrupt.pth --score_threshold=0.3 --top_k=100 --image=/home/ubuntu_pc/dataset/val/frame_5.jpg

print(masks.shape)      # torch.Size([5, 32])  (5 masks)
print(masks.t().shape)  # torch.Size([32, 5])

masks = proto_data @ masks.t()
print(masks.shape)      # torch.Size([138, 138, 5])  (5 masks)

Without TensorRT

Model mAP: 78. Image inference:

python3 eval.py --disable_tensorrt --trained_model=weights/yolact_edge_7_544_interrupt.pth --score_threshold=0.3 --top_k=100 --image=/home/ubuntu_pc/dataset/val/frame_5.jpg

print(masks.shape)      # torch.Size([1, 32])
print(masks.t().shape)  # torch.Size([32, 1])

masks = proto_data @ masks.t()
print(masks.shape)      # torch.Size([138, 138, 1])  (1 mask)

Model mAP: 45. Image inference:

python3 eval.py --disable_tensorrt --trained_model=weights/yolact_edge_4_76_interrupt.pth --score_threshold=0.3 --top_k=100 --image=/home/ubuntu_pc/dataset/val/frame_5.jpg

print(masks.shape)      # torch.Size([4, 32])  (4 masks)
print(masks.t().shape)  # torch.Size([32, 4])

masks = proto_data @ masks.t()
print(masks.shape)      # torch.Size([138, 138, 4])  (4 masks)

Note: Without TensorRT, the model (mAP 78) works well at 38 FPS on an RTX 2080.
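Given the shapes above, one shape-agnostic workaround sketch (my own, not an official fix from the maintainers) is to flatten any extra leading dimensions of masks down to 2-D before the matmul, so both the TensorRT and non-TensorRT shapes go through the same path. `combine_masks` is a hypothetical helper name for illustration.

```python
import torch

def combine_masks(proto_data, masks):
    # Collapse any leading dims so masks is always (N, 32), whether it
    # arrives as (N, 32) or (1, N, 32).
    masks = masks.reshape(-1, masks.shape[-1])
    return proto_data @ masks.t()  # result is (H, W, N)

proto_data = torch.randn(138, 138, 32)
print(combine_masks(proto_data, torch.randn(1, 1, 32)).shape)  # torch.Size([138, 138, 1])
print(combine_masks(proto_data, torch.randn(5, 32)).shape)     # torch.Size([138, 138, 5])
```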

ravijo commented 3 years ago

I faced the same error and managed to work around it. Please replace output_utils.py#L45 with the content below:

dets[k] = torch.index_select(dets[k], 0, keep)

It should work then.

kashzade commented 3 years ago

Thanks for your comment, @ravijo, but it gives me the error below:

File "/home/ubuntu_pc/yolact_edge/layers/output_utils.py", line 45, in postprocess
    dets[k] = torch.index_select(dets[k], 0, keep)
RuntimeError: Expected object of scalar type Long but got scalar type Bool for argument #3 'index' in call to _th_index_select
ravijo commented 3 years ago

Ohhh. I am sorry.

I forgot to tell you that you need to convert keep to long by replacing output_utils.py#L41 with the following content:

keep = dets['score'] > score_threshold
keep = keep.long()
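One caveat worth noting here (my observation, not something raised in the thread): calling .long() on a boolean mask yields 0/1 values, which index_select then treats as row positions 0 and 1 rather than as a mask; nonzero() produces the actual index positions instead.

```python
import torch

scores = torch.tensor([0.9, 0.1, 0.7])
keep = scores > 0.5                 # tensor([ True, False,  True])

# .long() gives 0/1 values, not positions: index_select would fetch rows 1, 0, 1.
print(keep.long())                  # tensor([1, 0, 1])

# nonzero() gives the positions of the True entries.
idx = keep.nonzero(as_tuple=False).squeeze(1)
print(idx)                          # tensor([0, 2])
print(torch.index_select(scores, 0, idx))  # tensor([0.9000, 0.7000])
```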
kashzade commented 3 years ago

> Ohhh. I am sorry.
>
> I forgot to tell you that you need to convert keep to long by replacing output_utils.py#L41 with the following content:
>
>     keep = dets['score'] > score_threshold
>     keep = keep.long()

Sorry to say, but I am getting the error below:

File "/home/ubuntu_pc/yolact_edge/layers/output_utils.py", line 110, in postprocess
    masks = proto_data @ masks.t()
RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle)

ravijo commented 3 years ago

This is strange. It worked on my side and I did not face this error.

kashzade commented 3 years ago

> This is strange. It worked on my side and I did not face this error.

May I know your TensorRT version?

ravijo commented 3 years ago

Sure. Please see here for my environment details.

By the way, I am also facing many issues when using TensorRT and torch2trt module. It seems that torch2trt module has issues (not sure though).

haotian-liu commented 3 years ago

I want to get these issues all fixed ASAP. If possible, can you share the trained model (the mAP=78 one you said has issues) and the sample image (I don't need the training set or annotations) to my email lhtliu at ucdavis.edu, as used in the following command (if it is correct)?

python3 eval.py --trained_model=weights/yolact_edge_7_544_interrupt.pth --score_threshold=0.3 --top_k=100 --image=/home/ubuntu_pc/dataset/val/frame_5.jpg
kashzade commented 3 years ago

Thanks, @haotian-liu, and @ravijo for helping me.

The above issue can be solved using the command python eval.py --trained_model=./weights/yolact_edge_road_7_544.pth --image=./data/image_files/frame_5.jpg

but when I pass the arguments --score_threshold=0.3 --top_k=100, it throws the same error as above.

The command python eval.py --trained_model=./weights/yolact_edge_road_7_544.pth --video=./data/video_files/test.mp4 gives 70 to 72 FPS on an RTX 2080 GPU (input video resolution 1280x720 px).

I am planning to deploy it on a Xavier AGX; I hope it will work and give around 30 FPS.

ravijo commented 3 years ago

@kashzade

Glad to hear from you.

The default value of the score_threshold parameter is 0, as can be seen here. Similarly, the default value of the top_k parameter can be seen here; it is 5. Probably top_k=100 is too high a value.

kashzade commented 3 years ago

@ravijo, as I tested, the issue is not with the value of the top_k parameter; I have tried random values between 1 and 100. The cause of the error is the score_threshold parameter: it works only with the value 0. I tested values from 0.01 to 1 and none of them work.

ravijo commented 3 years ago

@kashzade

You are right, I just realized it too. Right now I have also set score_threshold to 0, but that is not a good choice. I have shared my observations here.
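For what it's worth, here is a sketch (my assumption, not a confirmed diagnosis of the CUBLAS error) of why a non-zero score_threshold can be fragile: when no detection clears the threshold, the filtered masks tensor ends up with zero rows, and every downstream consumer must handle that empty case explicitly.

```python
import torch

scores = torch.tensor([0.05, 0.2])
masks = torch.randn(2, 32)
proto_data = torch.randn(138, 138, 32)

keep = scores > 0.3          # nothing survives the threshold
masks = masks[keep]
print(masks.shape)           # torch.Size([0, 32])

# On CPU the matmul with an empty operand just yields an empty result;
# whether downstream code copes with a 0-mask tensor is another matter.
out = proto_data @ masks.t()
print(out.shape)             # torch.Size([138, 138, 0])
```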

haotian-liu commented 3 years ago

@kashzade I set the score_threshold to 0.3 and it still works. Can you provide the command you used that throws the error (is that the same image you sent to me?) and maybe also the error message itself.

haotian-liu commented 3 years ago

@kashzade Also, I implemented a TensorRT safe mode (experimental) to securely handle those TensorRT related issues. Please try out haotian-dev branch, run the evaluation with --use_tensorrt_safe_mode, and see if it helps.

ravijo commented 3 years ago

@haotian-liu and @kashzade

Thank you so much. I have not tried use_tensorrt_safe_mode on the haotian-dev branch yet.

At present, in my case, when cc_fast_nms is used and score_threshold is non-zero, I get the same error, i.e., CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle). The complete description is provided here.

Please let me know if you find out any workaround.

Meanwhile, I am going to try use_tensorrt_safe_mode.

kashzade commented 3 years ago

> use_tensorrt_safe_mode

Using this argument, the model runs at 104 FPS on 1280x720 px input video.

haotian-liu commented 3 years ago

@kashzade Thank you for reporting back. By the way, what device are you working on? 104 FPS seems higher than what we could expect on Jetson devices.