Grounding DINO fine-tuned with mmdetection cannot be accelerated with TensorRT

xiyangyang99 commented 8 months ago

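After fine-tuning Grounding DINO on my own local dataset with mmdetection, local PyTorch inference takes 200-300 ms per image. After converting the model to ONNX, inference takes about 5 s per image on CPU and about 2 s per image on GPU. When I try to convert the ONNX model to TensorRT, the official trtexec tool fails to build an FP16 (--fp16) engine. Does mmdetection plan to support accelerating Grounding DINO?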

xiyangyang99 commented 6 months ago

How did you convert ONNX to TensorRT? Did you use a tool? On my side, fine-tuning on my own dataset works.

[Original email: from "open-mmlab/mmdetection", sent Friday, 23 Feb 2024, 10:04; subject: Re: [open-mmlab/mmdetection] Grounding DINO fine-tuned with mmdetection cannot be accelerated with TensorRT (Issue #11342)]

This is my export script, export_openvino.py. Where the script loads the model, I changed the call to model.load_state_dict(clean_state_dict(checkpoint)). The PyTorch model is attached; it is the .pth produced by fine-tuning on my own dataset with mmdetection. caption="pressure gauge ." (the text for my dataset), and the config is GroundingDINO_SwinT_OGC.py.

[Original email: sent Thursday, 11 Jan 2024, 17:33; same subject]

When exporting to ONNX, the input_ids were not changed. At inference time the prompt was also "the runing dog .".

[Original email: sent Thursday, 11 Jan 2024, 16:31; same subject]

The ONNX model I used was exported with the script you shared, and the ONNX-to-TensorRT command is also the one you gave me. The resulting TRT engine fails at inference time with the following log:

    python trt_inference_on_a_image.py
    [01/11/2024-15:58:50] [TRT] [W] TensorRT was linked against cuDNN 8.9.0 but loaded cuDNN 8.4.1
    [01/11/2024-15:58:50] [TRT] [W] TensorRT was linked against cuDNN 8.9.0 but loaded cuDNN 8.4.1
    [01/11/2024-15:58:51] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See "Lazy Loading" section of CUDA documentation https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#lazy-loading
    trt_inference_on_a_image.py:258: DeprecationWarning: Use set_input_shape instead.
      context.set_binding_shape(i, opt)
    trt_inference_on_a_image.py:199: DeprecationWarning: Use get_tensor_shape instead.
      size = abs(trt.volume(context.get_binding_shape(i))) bs
    trt_inference_on_a_image.py:200: DeprecationWarning: Use get_tensor_dtype instead.
      dtype = trt.nptype(engine.get_binding_dtype(binding))
    trt_inference_on_a_image.py:209: DeprecationWarning: Use get_tensor_mode instead.
      if engine.binding_is_input(binding):
    trt_inference_on_a_image.py:220: DeprecationWarning: Use execute_async_v2 instead.
      context.execute_async(batch_size=1, bindings=bindings, stream_handle=stream.handle)
    [01/11/2024-15:58:51] [TRT] [W] The enqueue() method has been deprecated when used with engines built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. Please use enqueueV2() instead.
    [01/11/2024-15:58:51] [TRT] [W] Also, the batchSize argument passed into this function has no effect on changing the input shapes. Please use setBindingDimensions() function to change input shapes instead.
    trt_inference_on_a_image.py:78: RuntimeWarning: overflow encountered in exp
      return 1/(1 + np.exp(-x))
    Traceback (most recent call last):
      File "trt_inference_on_a_image.py", line 274, in <module>
        boxes_filt, pred_phrases = outputs_postprocess(tokenizer, output_data, box_threshold, text_threshold, with_logits=True, token_spans=None)
      File "trt_inference_on_a_image.py", line 143, in outputs_postprocess
        pred_phrase = get_phrases_from_posmap(logit > text_threshold, tokenized, tokenlizer)
      File "/home/liufurui/TensorRT-8.6.1.6/targets/x86_64-linux-gnu/bin/GroundingDINO/groundingdino/util/utils.py", line 607, in get_phrases_from_posmap
        token_ids = [tokenized["input_ids"][i] for i in non_zero_idx]
      File "/home/liufurui/TensorRT-8.6.1.6/targets/x86_64-linux-gnu/bin/GroundingDINO/groundingdino/util/utils.py", line 607, in <listcomp>
        token_ids = [tokenized["input_ids"][i] for i in non_zero_idx]
    IndexError: list index out of range

[Original email: sent Thursday, 11 Jan 2024, 15:45; same subject]

Grounding DINO has a text branch. If the tokenizer cannot be exported to ONNX, how do you run TensorRT inference afterwards? Do you run only the transformer part with TensorRT, split the text part out, and write separate logic that feeds its output to the transformer?

[Original email: sent Thursday, 11 Jan 2024, 15:36; same subject]

You could write a blog post about this, my friend. I would pay for it.

[Original email: from "Chen", sent Wednesday, 10 Jan 2024, 10:03; same subject]

The BERT model can stay inside the exported graph, but the tokenizer cannot. For convenience I simply moved the BERT model out as well.

On my side, with the tokenizer split out of G-DINO, the dynamic-shape FP32 engine matches the PyTorch outputs. With Swin-T, inference takes about 170 ms on an A100, roughly 30% faster than torch inference. The FP16 outputs do not match, though. I am currently debugging this with polygraphy; if there is progress we can discuss it together.

Environment: TensorRT 8.6.1.6, CUDA 11.7

torch to ONNX
Reference: https://github.com/wenyi5608/GroundingDINO/blob/main/demo/export_openvino.py
Dynamic inputs: every dynamic dimension of every input and output must be declared; indices start from 0.

    dynamic_axes={
        "input_ids": {0: "batch_size", 1: "seq_len"},
        "attention_mask": {0: "batch_size", 1: "seq_len"},
        "position_ids": {0: "batch_size", 1: "seq_len"},
        "token_type_ids": {0: "batch_size", 1: "seq_len"},
        "text_token_mask": {0: "batch_size", 1: "seq_len", 2: "seq_len"},
        "img": {0: "batch_size", 2: "height", 3: "width"},
        "logits": {0: "batch_size"},
        "boxes": {0: "batch_size"}
    }
    opset_version: 16

ONNX to TensorRT

    ./trtexec --onnx=/root/GroundingDINO/grounded.onnx --saveEngine=grounded.trt \
      --minShapes=img:1x3x800x1200,input_ids:1x1,attention_mask:1x1,position_ids:1x1,token_type_ids:1x1,text_token_mask:1x1x1 \
      --optShapes=img:1x3x800x1200,input_ids:1x6,attention_mask:1x6,position_ids:1x6,token_type_ids:1x6,text_token_mask:1x6x6 \
      --maxShapes=img:1x3x800x1200,input_ids:1x25,attention_mask:1x25,position_ids:1x25,token_type_ids:1x25,text_token_mask:1x25x25

TensorRT inference: inference_trt.zip

Yes, split the text part out and write separate logic that feeds its output to the transformer.

The length of your input_ids is probably not aligned. Let me see the code you used to export the ONNX model.

Package up the code and send it over; I will run it for you when I have time.

Large attachment sent via QQ Mail: weights.pth (1.98 GB, expires 10 Feb 2024, 17:47). Download page: https://mail.qq.com/cgi-bin/ftnExs_download?k=213961378d4f4db31b37e6251f62001d194d50030a0651064c5b04005f4f570402584c005b06041f050b54050c5b5703040b5752396932450450065f4d111c421551610a&t=exs_ftn_download&code=a9a79b22

Hey, did you manage to convert the mmdet fine-tuned Grounding DINO model to ONNX? I have both ONNX conversion and TensorRT inference working for the official Grounding DINO model, but not yet for the one trained with mmdet.
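The three shape profiles in the trtexec command above differ only in the text length, so they can be generated rather than typed by hand. The following is a minimal sketch, not part of anyone's actual tooling in this thread: the helper names `shape_spec` and `build_trtexec_command` are made up for illustration, while the flags, input names, and shapes are copied from the quoted command.

```python
import shlex

# Values copied from the trtexec command quoted in this thread.
# The image size is fixed; only the text length (seq_len) varies.
IMG_SHAPE = (1, 3, 800, 1200)
TEXT_INPUTS = ("input_ids", "attention_mask", "position_ids", "token_type_ids")


def shape_spec(seq_len: int) -> str:
    """Return the comma-separated name:AxBxC list for one shape profile."""
    parts = ["img:" + "x".join(map(str, IMG_SHAPE))]
    parts += [f"{name}:1x{seq_len}" for name in TEXT_INPUTS]
    parts.append(f"text_token_mask:1x{seq_len}x{seq_len}")
    return ",".join(parts)


def build_trtexec_command(onnx_path, engine_path, min_len=1, opt_len=6, max_len=25):
    """Assemble the argument list (hypothetical helper, not a TensorRT API)."""
    if not (min_len <= opt_len <= max_len):
        raise ValueError("expected min_len <= opt_len <= max_len")
    return [
        "./trtexec",
        f"--onnx={onnx_path}",
        f"--saveEngine={engine_path}",
        f"--minShapes={shape_spec(min_len)}",
        f"--optShapes={shape_spec(opt_len)}",
        f"--maxShapes={shape_spec(max_len)}",
    ]


cmd = build_trtexec_command("/root/GroundingDINO/grounded.onnx", "grounded.trt")
print(shlex.join(cmd))
```

With these profiles, a prompt that tokenizes to more than 25 tokens falls outside the engine's shape range, so `max_len` has to cover the longest caption you expect to run.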

I used the TensorRT Python library, but with the official model that has not been fine-tuned. How do you convert a Grounding DINO model trained with mmdet to ONNX?

Which TensorRT version are you using?
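The traceback quoted earlier in this thread shows two symptoms in post-processing: an overflow in `1/(1 + np.exp(-x))` and an IndexError when positions taken from the thresholded logits are used to index `tokenized["input_ids"]`. Below is a rough sketch of a defensive version, assuming the logits row is wider than the tokenized prompt. The helpers `stable_sigmoid` and `in_range_token_ids` and the word token ids are hypothetical, and this is not the code in trt_inference_on_a_image.py.

```python
import math


def stable_sigmoid(x: float) -> float:
    """Sigmoid that never calls exp() on a large positive argument."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # x < 0, so z lies in [0, 1) and cannot overflow
    return z / (1.0 + z)


def in_range_token_ids(logit_row, input_ids, text_threshold):
    """Collect token ids whose score passes the threshold.

    Columns at or beyond len(input_ids) are ignored, because they do not
    correspond to any token of the prompt (hypothetical guard).
    """
    picked = []
    for pos, raw in enumerate(logit_row):
        if pos >= len(input_ids):
            break
        if stable_sigmoid(raw) > text_threshold:
            picked.append(input_ids[pos])
    return picked


# Made-up example: 5 prompt tokens, 7 logit columns. The word ids 3778 and
# 7633 are placeholders; the last two columns are out-of-range garbage.
input_ids = [101, 3778, 7633, 1012, 102]
logit_row = [-1000.0, 2.0, 3.0, -5.0, -1000.0, 800.0, 900.0]
print(in_range_token_ids(logit_row, input_ids, text_threshold=0.25))
```

A guard like this only avoids the crash. Large values in columns beyond the prompt still suggest that the engine output or the input shapes are wrong, which matches the later suggestion in the thread to check the length of input_ids.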

blacksino commented 5 months ago

Could you share the converted ONNX model? The one I exported myself seems to differ a lot in accuracy from native torch. I also followed the method from https://github.com/wenyi5608/GroundingDINO.git.

formance commented 5 months ago

It is too large to upload here.

blacksino commented 5 months ago

Could you share a Google Drive link, or send it to my email: 469915440@qq.com? Thanks a lot.

formance commented 5 months ago

https://drive.google.com/file/d/1ax6tjareHAXILphOlrDa6f2nWhRv_GvB/view?usp=drive_link

Note that in my version the BERT model is split out of the graph, and the tokenizer preprocessing is done separately.
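When the tokenizer runs outside the exported graph, the caller has to build the extra text inputs named in the export settings quoted earlier, in particular `text_token_mask` and `position_ids`. The sketch below is a plain-Python reimplementation modeled on my understanding of the upstream helper `generate_masks_with_special_tokens_and_transfer_map`, so check it against the repository before relying on it. The function name `build_text_inputs` and the word token ids are placeholders; the four special ids are those of [CLS], [SEP], "." and "?" in the bert-base-uncased vocabulary.

```python
# Ids of [CLS], [SEP], "." and "?" in the bert-base-uncased vocabulary.
SPECIAL_TOKEN_IDS = {101, 102, 1012, 1029}


def build_text_inputs(input_ids):
    """Build text_token_mask and position_ids for one tokenized prompt."""
    n = len(input_ids)
    # Start from the identity: every token attends to itself.
    mask = [[row == col for col in range(n)] for row in range(n)]
    position_ids = [0] * n
    previous = 0  # column of the last special token seen so far
    for col, token in enumerate(input_ids):
        if token not in SPECIAL_TOKEN_IDS:
            continue
        if col not in (0, n - 1):
            # Tokens after the previous separator, up to and including this
            # one, form a phrase block that attends only within itself.
            start = previous + 1
            for r in range(start, col + 1):
                for c in range(start, col + 1):
                    mask[r][c] = True
                position_ids[r] = r - start  # positions restart per phrase
        previous = col
    return mask, position_ids


# Six tokens, like "[CLS] w1 w2 w3 . [SEP]"; 2001-2003 are placeholder ids.
mask, position_ids = build_text_inputs([101, 2001, 2002, 2003, 1012, 102])
print(position_ids)
for row in mask:
    print("".join("1" if v else "0" for v in row))
```

For a six-token prompt this yields a 6x6 mask, which lines up with the `text_token_mask:1x6x6` entry in the opt shapes of the quoted trtexec command.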

xiyangyang99 commented 5 months ago

ok

[Original email: sent Tuesday, 19 Mar 2024, 11:31; subject: Re: [open-mmlab/mmdetection] Grounding DINO fine-tuned with mmdetection cannot be accelerated with TensorRT (Issue #11342)]

Large attachment sent via QQ Mail:

grounded_v6.onnx (656.83 MB, expires 18 Apr 2024, 11:44). Download page: https://mail.qq.com/cgi-bin/ftnExs_download?k=7d65326251f393e5426bb5701130531c401156070355545615565001521d50555c521f060056521e5c00015b5107040b5a570501372061544a0a470c5355056c4e531c0d595e193305&t=exs_ftn_download&code=8e2b70a3

blacksino commented 5 months ago

Thanks. Is this based on Swin-B or Swin-T?

formance commented 5 months ago

Swin-T.

Di-Gu commented 2 months ago

BaseBackendModel and torch2onnx both need slight modifications.

May I ask whether anything needs to be changed inside mmdetection itself?

firework-github commented 2 months ago

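@wxz1996 How is the accuracy of your 40 ms TensorRT conversion? On my side, Swin-B converted to TensorRT in FP32 takes about 110 ms on an A100 and the accuracy matches, but neither FP16 nor INT8 accuracy matches.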

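In FP32, Swin-B converted to ONNX produces outputs that match PyTorch. After converting the ONNX model to TRT, however, the inference results contain many abnormal values. Which parts need to be changed?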
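To make "matches" and "abnormal values" measurable when comparing two backends, it helps to report the number of non-finite values, the largest absolute difference, and the number of elements outside a tolerance. This is a generic sketch on flattened outputs using only the standard library. The function `compare_outputs`, the tolerances, and the sample numbers are all made up, and the reference values are assumed to be finite.

```python
import math


def compare_outputs(reference, candidate, atol=1e-3, rtol=1e-3):
    """Compare two flat lists of floats, e.g. flattened logits or boxes.

    reference: values from the trusted backend (assumed finite).
    candidate: values from the backend under test.
    """
    if len(reference) != len(candidate):
        raise ValueError("outputs have different lengths")
    non_finite = 0
    mismatched = 0
    max_abs_diff = 0.0
    for ref, out in zip(reference, candidate):
        if not math.isfinite(out):
            # NaN or inf counts as a mismatch and is excluded from the max.
            non_finite += 1
            mismatched += 1
            continue
        diff = abs(ref - out)
        max_abs_diff = max(max_abs_diff, diff)
        if diff > atol + rtol * abs(ref):
            mismatched += 1
    return {
        "non_finite": non_finite,
        "mismatched": mismatched,
        "max_abs_diff": max_abs_diff,
    }


# Made-up numbers: two close values, one NaN, one large deviation.
reference = [0.10, -2.50, 3.00, 0.75]
candidate = [0.1002, -2.5001, float("nan"), 12.0]
report = compare_outputs(reference, candidate)
print(report)
```

Running the same check on the PyTorch vs. ONNX pair and then on the ONNX vs. TensorRT pair shows at which conversion step the outputs start to diverge.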

levylll commented 2 months ago

Has there been any progress on this?

open-mmlab / mmdetection

Grounding DINO fine-tuned with mmdetection cannot be accelerated with TensorRT #11342