microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

The input tensor cannot be reshaped to the requested shape. #14237

Open shu0o0 opened 1 year ago

shu0o0 commented 1 year ago

Describe the issue

My model is from https://github.com/OFA-Sys/DAFlow/blob/main/models/sdafnet.py. I found some previous issues reporting a similar error, but none of them solved my problem. Could you please tell me how to fix it? Here is my whole code:

def init_torch_model():
    torch_model = SDAFNet_Tryon(ref_in_channel=6)
    torch_model.load_state_dict(torch.load("ckpt_viton.pt"))
    torch_model.eval()
    return torch_model

model = init_torch_model()

x = torch.randn(1, 6, 128, 128)
y = torch.randn(1, 3, 128, 128)
z = torch.randn(1, 3, 128, 128)

dynamic_axes = {
    'ref_input': {0: 'batch_size', 1: 'channel_x2', 2: 'height', 3: 'width'},
    'cloth_img': {0: 'batch_size', 1: 'channel', 2: 'height', 3: 'width'},
    'img_agnostic': {0: 'batch_size', 1: 'channel', 2: 'height', 3: 'width'},
    'output': {0: 'batch_size', 1: 'channel', 2: 'height', 3: 'width'},
}

with torch.no_grad():
    torch.onnx.export(
        model,
        (x, y, z),
        "srcnn.onnx",
        opset_version=16,
        input_names=['ref_input', 'cloth_img', 'img_agnostic'],
        output_names=['output'],
        dynamic_axes=dynamic_axes)

onnx_model = onnx.load("srcnn.onnx")
try:
    onnx.checker.check_model(onnx_model)
except Exception:
    print("Model incorrect")
else:
    print("Model correct")

def get_opt():
    parser = argparse.ArgumentParser()
    parser.add_argument('--name', type=str, required=True)
    parser.add_argument('--load_height', type=int, default=256)
    parser.add_argument('--load_width', type=int, default=192)
    parser.add_argument('--mode', type=str, default='test')
    parser.add_argument('--dataset_dir', type=str, default='./data')
    parser.add_argument('--dataset_imgpath', type=str, default='VITON/VITON_test')
    parser.add_argument('--dataset_list', type=str, default='VITON/test_unpairs.txt')
    parser.add_argument('--save_dir', type=str, default='./results/')
    parser.add_argument('-b', '--batch_size', type=int, default=4)
    opt = parser.parse_args()
    return opt

def test(opt):
    test_dataset = VITONDataset(opt)
    test_loader = data.DataLoader(test_dataset)
    with torch.no_grad():
        for i, inputs in enumerate(tqdm.tqdm(test_loader)):
            img_names = inputs['img_name']
            cloth_names = inputs['c_name']['paired']
            img = inputs['img']
            img_agnostic = inputs['img_agnostic']  # Masked model image
            pose = inputs['pose']
            cloth_img = inputs['cloth']['paired']
            img = F.interpolate(img, size=(256, 192), mode='bilinear')
            cloth_img = F.interpolate(cloth_img, size=(256, 192), mode='bilinear')

            img_agnostic = F.interpolate(img_agnostic, size=(256, 192), mode='bilinear')
            pose = F.interpolate(pose, size=(256, 192), mode='bilinear')
            ref_input = torch.cat((pose, img_agnostic), dim=1)
            ref_input = np.array(ref_input)
            cloth_img = np.array(cloth_img)
            img_agnostic = np.array(img_agnostic)
            ort_session = onnxruntime.InferenceSession("srcnn.onnx")
            ort_inputs = {'ref_input': ref_input, 'cloth_img': cloth_img,
                          'img_agnostic': img_agnostic}

            ort_output = ort_session.run(['output'], ort_inputs)[0]

if __name__ == '__main__':
    opt = get_opt()
    test(opt)

All other parameters are the same as in the project: https://github.com/OFA-Sys/DAFlow.

Urgency

No response

Target platform

Windows

Build script

.

Error / output

onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Loop node. Name:'Loop_702' Status Message: Non-zero status code returned while running Reshape node. Name:'Reshape_719' Status Message: D:\a_work\1\s\onnxruntime\core\providers\cpu\tensor\reshape_helper.h:36 onnxruntime::ReshapeHelper::ReshapeHelper size != 0 && (input_shape.Size() % size) == 0 was false. The input tensor cannot be reshaped to the requested shape. Input shape:{1,6,256,8,6}, requested shape:{-1,256,256,8}
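
For readers decoding the message: the condition quoted in the error, `size != 0 && (input_shape.Size() % size) == 0`, can be reproduced with plain arithmetic on the two shapes it reports. The sketch below is only an illustration of that check, not ONNX Runtime's code; `can_reshape` is a hypothetical helper name, and it assumes `size` is the product of the requested dimensions other than -1.

```python
from math import prod

def can_reshape(input_shape, requested_shape):
    """Illustrative version of the check quoted in the error message.

    Assumption: `size` is the product of all requested dims except the -1
    placeholder, and the input element count must be divisible by it.
    """
    size = prod(d for d in requested_shape if d != -1)
    total = prod(input_shape)
    return size != 0 and total % size == 0

input_shape = (1, 6, 256, 8, 6)   # "Input shape" from the error
requested = (-1, 256, 256, 8)     # "requested shape" from the error

print(prod(input_shape))                       # 73728 elements available
print(prod(d for d in requested if d != -1))   # 524288 elements per -1 slice
print(can_reshape(input_shape, requested))     # False
```

Since 73728 is not a multiple of 524288, no value of the -1 dimension can satisfy the reshape. One plausible reading, given that the export used 128x128 dummy inputs while inference used 256x192 images, is that some shape values were fixed at export time and no longer match the runtime input.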

Visual Studio Version

No response

GCC / Compiler Version

No response

skottmckay commented 1 year ago

I believe the issue is that the module needs to use a Loop and has dynamic input sizes. Exporting that combination to ONNX requires torch scripting rather than tracing.

https://pytorch.org/docs/stable/onnx.html#tracing-vs-scripting
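
As a minimal sketch of the difference, assuming a recent PyTorch: `RepeatAdd` below is a hypothetical toy module (not part of DAFlow) whose loop count depends on the input's first dimension. Tracing records the loop as it ran for the example input, while scripting keeps the loop itself.

```python
import torch

class RepeatAdd(torch.nn.Module):
    """Toy module: adds x to an accumulator once per row of x."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = torch.zeros_like(x)
        for _ in range(x.shape[0]):  # trip count depends on the input size
            out = out + x
        return out

model = RepeatAdd()

# Tracing runs the module once with a 2-row input, so two additions are
# recorded (PyTorch emits a TracerWarning about the shape-dependent loop).
traced = torch.jit.trace(model, torch.ones(2, 3))

# Scripting compiles the Python source, so the loop stays data dependent.
scripted = torch.jit.script(model)

x = torch.ones(4, 3)
print(traced(x)[0, 0].item())    # 2.0: the loop was unrolled for 2 rows
print(scripted(x)[0, 0].item())  # 4.0: the loop ran once per row
```

The traced module silently gives a different answer for a 4-row input, which is the same kind of mismatch that can surface as a shape error once a traced model is exported and run with different input sizes.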

Unfortunately, scripting this model isn't working well. Some items in a ModuleList require an interface to be specified for torch scripting to work (doable), 'reversed' isn't supported, and neither is converting a dynamically sized list to a tuple.

E.g., for the first issue:

# define interface for items in ModuleList
@torch.jit.interface
class TensorToTensorInterface(torch.nn.Module):
    def forward(self, input: torch.Tensor) -> torch.Tensor:
        pass

# update code to first retrieve item from list and call 'forward'
class RefinePyramid(nn.Module):
    def forward(self, x):
        ....
            # orig: feature = self.adaptive[i](conv_ftr)
            conv2d: TensorToTensorInterface = self.adaptive[i]
            feature = conv2d.forward(conv_ftr)

E.g., for the second issue:

        # torch script doesn't support 'reversed', and can't convert feature_list of dynamic size to tuple
        return tuple(reversed(feature_list))

You could update the model to manually reverse the list and hardcode the length to work around that, but there are multiple places that would require that change.

        assert(num_features == 5)  # only support the default list length, and manually reverse the list into a tuple of fixed length
        return (feature_list[4], feature_list[3], feature_list[2], feature_list[1], feature_list[0])
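
A self-contained sketch of that workaround, assuming a recent PyTorch: `reverse_five` is a hypothetical helper (not taken from the DAFlow code) that turns a list of exactly five tensors into a fixed-length tuple in reverse order, in a form torch script can compile.

```python
from typing import List, Tuple

import torch

# Fixed-length tuple type, so the return type is known when scripting.
FiveTensors = Tuple[torch.Tensor, torch.Tensor, torch.Tensor,
                    torch.Tensor, torch.Tensor]

@torch.jit.script
def reverse_five(feature_list: List[torch.Tensor]) -> FiveTensors:
    # Only the default list length is supported by this hardcoded reversal.
    assert len(feature_list) == 5
    return (feature_list[4], feature_list[3], feature_list[2],
            feature_list[1], feature_list[0])

features = [torch.full((1,), float(i)) for i in range(5)]
reversed_features = reverse_five(features)
print([int(t.item()) for t in reversed_features])  # [4, 3, 2, 1, 0]
```

The trade-off is that the length is baked into the code, so a model configured with a different number of pyramid levels would need a matching edit.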

Given that this appears to be an issue with the model not being compatible with torch script, any follow-up questions about how best to update the model to make it exportable using torch script would be better asked at https://github.com/pytorch/pytorch/issues or directed to the model author.