ZhengPeng7 / BiRefNet

[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation
https://www.birefnet.top
MIT License

How can I run inference at a smaller size instead of 1024? #107

Open juntaosun opened 1 week ago

juntaosun commented 1 week ago

BiRefNet_lite's default input size is 1024x1024. If you run inference on a smaller image (below 1024), it gets resized up to 1024.

Question: does the model support inference at smaller input sizes such as 768x768 or 512x512? How do I change that, and would a smaller size improve speed? Thanks~

ZhengPeng7 commented 1 week ago

Yes, of course it's supported. Just modify the resize preprocessing in the inference data loader.
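For anyone looking for the concrete change, here is a minimal sketch of that preprocessing tweak, assuming a torchvision-style transform pipeline like the one in this repo's inference code (the 512x512 size, file name, and normalization constants are illustrative assumptions):

import torch
from PIL import Image
from torchvision import transforms

# Resize to a smaller inference resolution instead of the default 1024x1024.
transform_image = transforms.Compose([
    transforms.Resize((512, 512)),
    transforms.ToTensor(),
    # ImageNet statistics, commonly used with BiRefNet-style backbones.
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

image = Image.open('example.jpg').convert('RGB')    # hypothetical input file
input_tensor = transform_image(image).unsqueeze(0)  # shape: (1, 3, 512, 512)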

juntaosun commented 1 week ago

@ZhengPeng7 I successfully exported an ONNX model using BiRefNet_pth2onnx.ipynb.

But I ran into a new problem: although the export declares dynamic axes, the input size is still fixed at inference time, and the model cannot adapt to dynamic inputs.


import torch

# net, file_name, and device come from the export notebook.
input = torch.randn(1, 3, 512, 512).to(device)

torch.onnx.export(
    net,
    (input,),
    file_name,
    verbose=False,
    opset_version=17,
    input_names=['input_image'],
    output_names=['output_image'],
    dynamic_axes={  # declare batch and spatial dims as dynamic at export time
        'input_image':  {0: 'b', 2: 'h', 3: 'w'},
        'output_image': {0: 'b', 2: 'h', 3: 'w'},
    },
)

When I run inference at any other size, it raises an error demanding the fixed 512 size (the dynamic_axes have no effect):

[E:onnxruntime:, sequential_executor.cc:516 onnxruntime::ExecuteKernel] Non-zero status code returned while running Split node. Name:'/decoder/Split_64' Status Message: Cannot split using values in 'split' attribute. Axis=-1 Input shape={3,1024,1024} NumOutputs=1 Num entries in 'split' (must equal number of outputs) was 1 Sum of sizes in 'split' (must equal size of selected axis) was 512

I hope this can be fixed so that dynamic input sizes work in ONNX. Thanks~
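For context on why dynamic_axes alone does not help here, the following is a minimal sketch of the suspected failure mode (an assumption based on the error message, not a confirmed diagnosis of BiRefNet's decoder): when torch.split is traced with a split size computed as a plain Python int, that size is baked into the graph as a constant 'split' attribute, so the exported Split node breaks at any other resolution.

import numpy as np
import onnxruntime as ort
import torch

class SplitDemo(torch.nn.Module):
    def forward(self, x):
        # x.shape[-1] // 2 is a plain Python int during tracing, so the
        # ONNX Split node receives a constant 'split' attribute (256 here).
        left, right = torch.split(x, x.shape[-1] // 2, dim=-1)
        return left

torch.onnx.export(
    SplitDemo().eval(),
    (torch.randn(1, 3, 512, 512),),
    'split_demo.onnx',
    opset_version=17,
    input_names=['input_image'],
    output_names=['output_image'],
    dynamic_axes={'input_image': {2: 'h', 3: 'w'},
                  'output_image': {2: 'h', 3: 'w'}},
)

# Running at a different size now fails with the same kind of Split error,
# because the baked-in split sizes no longer sum to the new axis length.
sess = ort.InferenceSession('split_demo.onnx')
sess.run(None, {'input_image': np.random.randn(1, 3, 1024, 1024).astype(np.float32)})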

ZhengPeng7 commented 1 week ago

Thanks for pointing this out and for trying it! But I'm actually not familiar with ONNX either. I just searched, and it seems the approach is indeed what you described; from the error, it looks like a problem with torch.split(). I'll try to fix it when I have time, but please don't hold your breath for now... I'll @ you here if there is any progress.
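Until the graph is made shape-agnostic, a common workaround (a sketch, not part of this repo) is to keep the exported resolution fixed and resize around the model: scale any input to the export size, run the session, then scale the predicted mask back to the original size.

import numpy as np
import onnxruntime as ort
from PIL import Image

EXPORT_SIZE = (512, 512)  # must match the resolution the model was exported at

sess = ort.InferenceSession('birefnet_lite_512.onnx')  # hypothetical file name

image = Image.open('example.jpg').convert('RGB')       # hypothetical input file
orig_size = image.size                                 # (w, h)

# Resize to the fixed export resolution and normalize (ImageNet stats assumed).
x = np.asarray(image.resize(EXPORT_SIZE), dtype=np.float32) / 255.0
x = (x - [0.485, 0.456, 0.406]) / [0.229, 0.224, 0.225]
x = x.transpose(2, 0, 1)[None].astype(np.float32)      # (1, 3, 512, 512)

logits = sess.run(None, {'input_image': x})[0].squeeze()
mask = 1.0 / (1.0 + np.exp(-logits))                   # sigmoid, assuming raw logits

# Scale the mask back to the original image size.
mask_img = Image.fromarray((mask * 255).astype(np.uint8)).resize(orig_size)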

minushuang commented 2 days ago

Yes, of course it's supported. Just modify the resize preprocessing in the inference data loader.

Hello @ZhengPeng7, must the inference input size be the same as the training input size, or can they differ? E.g., 1600x1200 for inference when training used 1024x1024.

ZhengPeng7 commented 2 days ago

Sure. But if you use ONNX, there might be some restrictions when exporting the models.
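For illustration, a minimal sketch of inference at roughly 1600x1200 with the plain PyTorch model, assuming the Hugging Face loading path shown in this repo's README (the model id, the stride rounding, and the output-handling details are assumptions):

import torch
from PIL import Image
from torchvision import transforms
from transformers import AutoModelForImageSegmentation

# Load BiRefNet via Hugging Face, as in the repo's README.
model = AutoModelForImageSegmentation.from_pretrained(
    'ZhengPeng7/BiRefNet', trust_remote_code=True).eval()

# Infer at a size other than the 1024x1024 used in training. Depending on
# the backbone, H and W may need to be multiples of the model stride
# (e.g. 32), so 1200 is rounded up to 1216 here as a precaution.
transform = transforms.Compose([
    transforms.Resize((1216, 1600)),  # (h, w)
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

image = Image.open('example.jpg').convert('RGB')  # hypothetical input file
with torch.no_grad():
    # The README's usage takes the last output and applies a sigmoid.
    pred = model(transform(image).unsqueeze(0))[-1].sigmoid()  # (1, 1, 1216, 1600)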