Closed: kashzade closed this issue 3 years ago.
Hi, can you share the command and the config you were using?
Thanks for your support, @haotian-liu.
I made changes in the `config.py` file: I added `my_custom_dataset` below the `dataset_base = Config({...})` block:
```python
my_custom_dataset = dataset_base.copy({
    'name': 'My Dataset',
    'train_images': '/home/ubuntu_pc/dataset/train',
    'train_info': '/home/ubuntu_pc/dataset/train/laptop_coco.json',
    'valid_images': '/home/ubuntu_pc/dataset/val',
    'valid_info': '/home/ubuntu_pc/dataset/val/laptop_coco.json',
    'has_gt': True,
    'class_names': ('laptop',)
})
```
I also changed the value of the `dataset` key in `yolact_base_config`, replacing `coco2017_dataset` with `my_custom_dataset`:
```python
yolact_base_config = coco_base_config.copy({
    'name': 'yolact_base',

    # Dataset stuff
    # 'dataset': coco2017_dataset,
    'dataset': my_custom_dataset,
    'num_classes': len(my_custom_dataset.class_names) + 1,  # 1 + 1
    ...
})
```
The config seems fine. Were you using TensorRT, and are you using the latest code base?
Yes, TensorRT (7.2.2.3) and the latest code.
This is weird. Can you print out the shapes of the tensors in `masks = proto_data @ masks.t()` for both TensorRT enabled and disabled?
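For reference, a minimal standalone sketch of the shape check being requested, with stand-in tensors rather than the repo's actual `proto_data` and `masks` (shapes mirror the non-TensorRT case reported below):

```python
import torch

# Stand-in tensors; in yolact_edge these come from postprocess() in
# layers/output_utils.py.
proto_data = torch.randn(138, 138, 32)   # prototype masks: (H, W, k)
masks = torch.randn(5, 32)               # mask coefficients: (num_dets, k)

print(masks.shape)                       # torch.Size([5, 32])
print(masks.t().shape)                   # torch.Size([32, 5])
print((proto_data @ masks.t()).shape)    # torch.Size([138, 138, 5])
```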
**Model mAP: 78 (TensorRT enabled)**
Image inference: `python3 eval.py --trained_model=weights/yolact_edge_7_544_interrupt.pth --score_threshold=0.3 --top_k=100 --image=/home/ubuntu_pc/dataset/val/frame_5.jpg`

```
print(masks.shape)
Output >> torch.Size([1, 1, 32])
print(masks.t().shape)
Output >> RuntimeError: t() expects a tensor with <= 2 dimensions, but self is 3D
```

**Model mAP: 45 (TensorRT enabled)**
Image inference: `python3 eval.py --trained_model=weights/yolact_edge_4_76_interrupt.pth --score_threshold=0.3 --top_k=100 --image=/home/ubuntu_pc/dataset/val/frame_5.jpg`

```
print(masks.shape)
Output >> torch.Size([5, 32])   ### shows 5 masks
print(masks.t().shape)
Output >> torch.Size([32, 5])
masks = proto_data @ masks.t()
print(masks.shape)
Output >> torch.Size([138, 138, 5])   ### shows 5 masks
```

**Model mAP: 78 (TensorRT disabled)**
Image inference: `python3 eval.py --disable_tensorrt --trained_model=weights/yolact_edge_7_544_interrupt.pth --score_threshold=0.3 --top_k=100 --image=/home/ubuntu_pc/dataset/val/frame_5.jpg`

```
print(masks.shape)
Output >> torch.Size([1, 32])
print(masks.t().shape)
Output >> torch.Size([32, 1])
masks = proto_data @ masks.t()
print(masks.shape)
Output >> torch.Size([138, 138, 1])   ### shows 1 mask
```

**Model mAP: 45 (TensorRT disabled)**
Image inference: `python3 eval.py --disable_tensorrt --trained_model=weights/yolact_edge_4_76_interrupt.pth --score_threshold=0.3 --top_k=100 --image=/home/ubuntu_pc/dataset/val/frame_5.jpg`

```
print(masks.shape)
Output >> torch.Size([4, 32])   ### shows 4 masks
print(masks.t().shape)
Output >> torch.Size([32, 4])
masks = proto_data @ masks.t()
print(masks.shape)
Output >> torch.Size([138, 138, 4])   ### shows 4 masks
```
Note: Without TensorRT, the model (mAP 78) works well at 38 FPS on an RTX 2080.
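A standalone sketch of what these shapes suggest, using stand-in tensors rather than yolact_edge's actual code: the TensorRT output appears to carry one extra leading dimension, which `squeeze(0)` would drop before the transpose (whether that is the right fix inside the repo is an assumption):

```python
import torch

proto_data = torch.randn(138, 138, 32)    # stand-in prototype tensor: (H, W, k)
masks_trt = torch.randn(1, 1, 32)         # TensorRT path shape reported above

# masks_trt.t() raises:
#   RuntimeError: t() expects a tensor with <= 2 dimensions, but self is 3D
masks_2d = masks_trt.squeeze(0)           # -> torch.Size([1, 32]), the non-TensorRT shape
print((proto_data @ masks_2d.t()).shape)  # torch.Size([138, 138, 1])
```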
I faced the same error too and managed to work around it. Please replace output_utils.py#L45 with the content below:

```python
dets[k] = torch.index_select(dets[k], 0, keep)
```

It should work then.
Thanks for your comment @ravijo, but it gives me the error below:

```
File "/home/ubuntu_pc/yolact_edge/layers/output_utils.py", line 45, in postprocess
    dets[k] = torch.index_select(dets[k], 0, keep)
RuntimeError: Expected object of scalar type Long but got scalar type Bool for argument #3 'index' in call to _th_index_select
```
Ohhh, I am sorry. I forgot to tell you that you need to convert `keep` to `long` by replacing output_utils.py#L41 with the following content:

```python
keep = dets['score'] > score_threshold
keep = keep.long()
```
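One caveat worth flagging, as a standalone sketch rather than yolact_edge code: casting a boolean mask to `long` produces 0/1 flags, not row indices, so `index_select` would gather rows 0 and 1 instead of the rows that pass the threshold. `nonzero` yields actual indices:

```python
import torch

scores = torch.tensor([0.9, 0.1, 0.5])
mask = scores > 0.3                             # tensor([True, False, True])

print(mask.long())                              # tensor([1, 0, 1]) -- flags, not indices
keep = mask.nonzero(as_tuple=False).squeeze(1)  # tensor([0, 2]) -- real row indices
print(torch.index_select(scores, 0, keep))      # tensor([0.9000, 0.5000])
```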
Sorry to say, but I am getting the error below:

```
File "/home/ubuntu_pc/yolact_edge/layers/output_utils.py", line 110, in postprocess
    masks = proto_data @ masks.t()
RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle)
```
This is strange. It worked on my side and I did not face this error.
> This is strange. It worked on my side and I did not face this error.

May I know your TensorRT version?
Sure. Please see here for my environment details.
By the way, I am also facing many issues when using TensorRT and the `torch2trt` module. It seems that the `torch2trt` module has issues (not sure though).
I want to have these kinds of issues all fixed ASAP. If possible, can you share the trained model (the mAP=78 one which you said has issues) and the sample image (I don't need the training set or annotations) to my email lhtliu at ucdavis.edu, as used in the following command (if it is correct)?

`python3 eval.py --trained_model=weights/yolact_edge_7_544_interrupt.pth --score_threshold=0.3 --top_k=100 --image=/home/ubuntu_pc/dataset/val/frame_5.jpg`
Thanks, @haotian-liu and @ravijo, for helping me. The above issue can be solved using the command:

`python eval.py --trained_model=./weights/yolact_edge_road_7_544.pth --image=./data/image_files/frame_5.jpg`

But when I pass the arguments `--score_threshold=0.3 --top_k=100`, it throws the same error as above.
With the command

`python eval.py --trained_model=./weights/yolact_edge_road_7_544.pth --video=./data/video_files/test.mp4`

it gives 70 to 72 FPS on an RTX 2080 GPU (input video resolution 1280x720 px). I am planning to deploy it on a Xavier AGX; I hope it will work and give around 30 FPS.
@ravijo, as I tested, the issue is not with the value of the `top_k` parameter; I have tested random values between 1 and 100. The cause of the error is the `score_threshold` parameter: it works only with the value 0. I tested values from 0.01 to 1 and none of them work.
@kashzade You are right, I just realized it too. For now I have also set `score_threshold` to 0, but it is not a good choice. I have shared my observations here.
@kashzade I set the `score_threshold` to 0.3 and it still works. Can you provide the command you used that throws the error (is it the same image you sent to me?) and maybe also the error message itself.
@kashzade Also, I implemented a TensorRT safe mode (experimental) to securely handle those TensorRT-related issues. Please try out the haotian-dev branch, run the evaluation with `--use_tensorrt_safe_mode`, and see if it helps.
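For example, assuming the flag simply composes with the usual eval.py arguments from the commands above: `python3 eval.py --use_tensorrt_safe_mode --trained_model=weights/yolact_edge_7_544_interrupt.pth --score_threshold=0.3 --top_k=100 --image=/home/ubuntu_pc/dataset/val/frame_5.jpg`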
@haotian-liu and @kashzade Thank you so much. I have not tried `use_tensorrt_safe_mode` in the haotian-dev branch yet.

At present, in my case, when `cc_fast_nms` is used and `score_threshold` is non-zero, I get the same error, i.e., CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle). The complete description is provided here. Please let me know if you find a workaround. Meanwhile, I am going to try `use_tensorrt_safe_mode`.
Using the `--use_tensorrt_safe_mode` argument, the model works at 104 FPS for 1280x720 px input video.
@kashzade Thank you for reporting back. Btw, what device are you working on? 104 FPS seems higher than what we could expect on Jetson devices.
First of all, thanks to all the developers who built this great model.
I have trained YOLACT Edge on a single object. When I run inference using a trained model with around 50 mAP, it starts predicting (the predictions are very bad), but if the model's mAP is around 80 or higher, it throws the error below.
```
Traceback (most recent call last):
  File "eval.py", line 1246, in <module>
    evaluate(net, dataset)
  File "eval.py", line 894, in evaluate
    evalvideo(net, args.video)
  File "eval.py", line 777, in evalvideo
    frame_buffer.put(frame['value'].get())
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "eval.py", line 699, in prep_frame
    return prep_display(preds, frame, None, None, undo_transform=False, class_color=True)
  File "eval.py", line 167, in prep_display
    score_threshold = args.score_threshold)
  File "/home/ubuntu_pc//yolact_edge/layers/output_utils.py", line 103, in postprocess
    masks = proto_data @ masks.t()
RuntimeError: t() expects a tensor with <= 2 dimensions, but self is 3D
```
I changed the tensor from 3D to 2D, but I am getting multiple errors. If anyone has a solution, please share.