mlverse / torchvision

R interface to torchvision
https://torchvision.mlverse.org

cannot transform_to_tensor when images in a batch are of different sizes #13

Closed skeydan closed 4 years ago

skeydan commented 4 years ago

The error occurs in torch_stack:

 Error in (function (tensors, dim)  : 
  stack expects each tensor to be equal size, but got [3, 224, 224] at entry 0 and [3, 504, 481] at entry 10 (get_stack_inputs at ../aten/src/ATen/native/TensorShape.cpp:961)

I think the underlying problem may be that, given a list of transforms, we convert the images to tensors first and then operate on the tensors, whereas Python applies transformations such as resizing and cropping first and only then converts to tensors.
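To make the shape problem concrete, here is a minimal base-R sketch. It assumes plain arrays as stand-ins for image tensors in (channels, height, width) layout; `resize_nearest()` and `stack_images()` are hypothetical helpers written for illustration and are not part of torch or torchvision. It only shows that images have to reach a common size before they can be stacked into a batch, whatever the order of the other transforms.

```r
# Plain arrays stand in for image tensors with layout (channels, height, width).
# resize_nearest() and stack_images() are hypothetical helpers, not torchvision API.

resize_nearest <- function(img, size) {
  # Map each output row/column to a source index (nearest-neighbour sampling).
  h <- dim(img)[2]
  w <- dim(img)[3]
  rows <- pmin(h, floor((seq_len(size[1]) - 0.5) * h / size[1]) + 1)
  cols <- pmin(w, floor((seq_len(size[2]) - 0.5) * w / size[2]) + 1)
  img[, rows, cols, drop = FALSE]
}

stack_images <- function(imgs) {
  # Like torch_stack, refuse to combine arrays whose shapes differ.
  shapes <- vapply(imgs, function(x) paste(dim(x), collapse = "x"), character(1))
  if (length(unique(shapes)) > 1) {
    stop("stack expects each array to be equal size, got: ",
         paste(unique(shapes), collapse = ", "))
  }
  # Result has layout (batch, channels, height, width).
  out <- array(0, dim = c(length(imgs), dim(imgs[[1]])))
  for (i in seq_along(imgs)) out[i, , , ] <- imgs[[i]]
  out
}

# Two images with the sizes reported in the error message above.
imgs <- list(array(runif(3 * 224 * 224), dim = c(3, 224, 224)),
             array(runif(3 * 504 * 481), dim = c(3, 504, 481)))

# Stacking the raw images fails because their shapes differ.
failed <- inherits(try(stack_images(imgs), silent = TRUE), "try-error")

# Resizing each image first gives a common shape, so stacking succeeds.
batch <- stack_images(lapply(imgs, resize_nearest, size = c(224, 224)))
dim(batch)
```

In the real pipeline the resize or crop step would play the role of `resize_nearest()`, so every sample would already be 224 x 224 by the time the dataloader stacks the batch.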

I wonder how this is affected by the plan to move away from magick?

Thanks!

skeydan commented 4 years ago

Here is a related problem (related in that I assume it might not appear if the transforms were applied in the opposite order):

Error in (function (self, output_size, align_corners, scales_h, scales_w)  : 
  Input and output sizes should be greater than 0, but got input (H: 210, W: 0) output (H: 224, W: 224) (upsample_2d_shape_check at ../aten/src/ATen/native/UpSample.h:108)
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x6a (0x7f12607baaaa in /home/key/libtorch/lib/libc10.so)
frame #1: <unknown function> + 0xf7a080 (0x7f124b0ee080 in /home/key/libtorch/lib/libtorch_cpu.so)
frame #2: <unknown function> + 0xf7a436 (0x7f124b0ee436 in /home/key/libtorch/lib/libtorch_cpu.so)
frame #3: at::native::upsample_bilinear2d_cpu(at::Tensor const&, c10::ArrayRef<long>, bool, c10::optional<double>, c10::optional<double>) + 0x1cc (0x7f124b0f502c in /home/key/libtorch/lib/libtorch_cpu.so)
frame #4: <unknown function> + 0x119f781 (0x7f124b313781 in /home/key/libtorch/lib/libtorch_cpu.so)
frame #5: <unknown function> + 0x11b2c68 (0x7f124b326c68 in /home/key/libtorch/lib/libtorch_cpu.so)
frame #6: <unknown f 
36.
stop(structure(list(message = "Input and output sizes should be greater than 0, but got input (H: 210, W: 0) output (H: 224, W: 224) (upsample_2d_shape_check at ../aten/src/ATen/native/UpSample.h:108)\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x6a (0x7f12607baaaa in /home/key/libtorch/lib/libc10.so)\nframe #1: <unknown function> + 0xf7a080 (0x7f124b0ee080 in /home/key/libtorch/lib/libtorch_cpu.so)\nframe #2: <unknown function> + 0xf7a436 (0x7f124b0ee436 in /home/key/libtorch/lib/libtorch_cpu.so)\nframe #3: at::native::upsample_bilinear2d_cpu(at::Tensor const&, c10::ArrayRef<long>, bool, c10::optional<double>, c10::optional<double>) + 0x1cc (0x7f124b0f502c in /home/key/libtorch/lib/libtorch_cpu.so)\nframe #4: <unknown function> + 0x119f781 (0x7f124b313781 in /home/key/libtorch/lib/libtorch_cpu.so)\nframe #5: <unknown function> + 0x11b2c68 (0x7f124b326c68 in /home/key/libtorch/lib/libtorch_cpu.so)\nframe #6: <unknown function> + 0x2e8df32 (0x7f124d001f32 in /home/key/libtorch/lib/libtorch_cpu.so)\nframe #7: <unknown function> + 0x11b2c68 (0x7f124b326c68 in /home/key/libtorch/lib/libtorch_cpu.so)\nframe #8: at::Tensor c10::KernelFunction::callUnboxed<at::Tensor, at::Tensor const&, c10::ArrayRef<long>, bool, c10::optional<double>, c10::optional<double> >(c10::OperatorHandle const&, at::Tensor const&, c10::ArrayRef<long>, bool, c10::optional<double>, c10::optional<double>) const + 0x133 (0x7f1260fa9b79 in /home/key/R/x86_64-redhat-linux-gnu-library/4.0/torch/deps/liblantern.so)\nframe #9: at::Tensor c10::Dispatcher::callUnboxedWithDispatchKey<at::Tensor, at::Tensor const&, c10::ArrayRef<long>, bool, c10::optional<double>, c10::optional<double> >(c10::OperatorHandle const&, c10::DispatchKey, at::Tensor const&, c10::ArrayRef<long>, bool, c10::optional<double>, c10::optional<double>) const + 0x118 (0x7f1260f46a16 in 
/home/key/R/x86_64-redhat-linux-gnu-library/4.0/torch/deps/liblantern.so)\nframe #10: at::Tensor c10::Dispatcher::callUnboxed<at::Tensor, at::Tensor const&, c10::ArrayRef<long>, bool, c10::optional<double>, c10::optional<double> >(c10::OperatorHandle const&, at::Tensor const&, c10::ArrayRef<long>, bool, c10::optional<double>, c10::optional<double>) const + 0xfd (0x7f1260f19041 in /home/key/R/x86_64-redhat-linux-gnu-library/4.0/torch/deps/liblantern.so)\nframe #11: at::Tensor c10::OperatorHandle::callUnboxed<at::Tensor, at::Tensor const&, c10::ArrayRef<long>, bool, c10::optional<double>, c10::optional<double> >(at::Tensor const&, c10::ArrayRef<long>, bool, c10::optional<double>, c10::optional<double>) const + 0xd5 (0x7f1260ef7a25 in /home/key/R/x86_64-redhat-linux-gnu-library/4.0/torch/deps/liblantern.so)\nframe #12: <unknown function> + 0x22e099 (0x7f1260c0a099 in /home/key/R/x86_64-redhat-linux-gnu-library/4.0/torch/deps/liblantern.so)\nframe #13: _lantern_upsample_bilinear2d_tensor_intarrayref_bool_double_double + 0x1c8 (0x7f1260e8af3f in /home/key/R/x86_64-redhat-linux-gnu-library/4.0/torch/deps/liblantern.so)\nframe #14: lantern_upsample_bilinear2d_tensor_intarrayref_bool_double_double + 0x47 (0x7f12617d8d97 in /home/key/R/x86_64-redhat-linux-gnu-library/4.0/torch/libs/torchpkg.so)\nframe #15: cpp_torch_namespace_upsample_bilinear2d_self_Tensor_output_size_IntArrayRef_align_corners_bool(Rcpp::XPtr<XPtrTorchTensor, Rcpp::PreserveStorage, &(void Rcpp::standard_delete_finalizer<XPtrTorchTensor>(XPtrTorchTensor*)), false>, std::vector<long, std::allocator<long> >, bool, nullable<double>, nullable<double>) + 0x171 (0x7f12617afb43 in /home/key/R/x86_64-redhat-linux-gnu-library/4.0/torch/libs/torchpkg.so)\nframe #16: _torch_cpp_torch_namespace_upsample_bilinear2d_self_Tensor_output_size_IntArrayRef_align_corners_bool + 0x22f (0x7f126165416a in /home/key/R/x86_64-redhat-linux-gnu-library/4.0/torch/libs/torchpkg.so)\nframe #17: <unknown function> + 0x101558 
(0x7f1291a7b558 in /usr/lib64/R/lib/libR.so)\nframe #18: <unknown function> + 0x101af5 (0x7f1291a7baf5 in /usr/lib64/R/lib/libR.so)\nframe #19: <unknown function> + 0x13dd0d (0x7f1291ab7d0d in /usr/lib64/R/lib/libR.so)\nframe #20: Rf_eval + 0x88 (0x7f1291ad1ab8 in /usr/lib64/R/lib/libR.so)\nframe #21: <unknown function> + 0x15978e (0x7f1291ad378e in /usr/lib64/R/lib/libR.so)\nframe #22: Rf_applyClosure + 0x1a2 (0x7f1291ad45f2 in /usr/lib64/R/lib/libR.so)\nframe #23: Rf_eval + 0x2af (0x7f1291ad1cdf in /usr/lib64/R/lib/libR.so)\nframe #24: <unknown function> + 0xccbaf (0x7f1291a46baf in /usr/lib64/R/lib/libR.so)\nframe #25: <unknown function> + 0x13dd0d (0x7f1291ab7d0d in /usr/lib64/R/lib/libR.so)\nframe #26: Rf_eval + 0x88 (0x7f1291ad1ab8 in /usr/lib64/R/lib/libR.so)\nframe #27: <unknown function> + 0x15978e (0x7f1291ad378e in /usr/lib64/R/lib/libR.so)\nframe #28: Rf_applyClosure + 0x1a2 (0x7f1291ad45f2 in /usr/lib64/R/lib/libR.so)\nframe #29: <unknown function> + 0x146150 (0x7f1291ac0150 in /usr/lib64/R/lib/libR.so)\nframe #30: Rf_eval + 0x88 (0x7f1291ad1ab8 in /usr/lib64/R/lib/libR.so)\nframe #31: <unknown function> + 0x15978e (0x7f1291ad378e in /usr/lib64/R/lib/libR.so)\nframe #32: Rf_applyClosure + 0x1a2 (0x7f1291ad45f2 in /usr/lib64/R/lib/libR.so)\nframe #33: <unknown function> + 0x146150 (0x7f1291ac0150 in /usr/lib64/R/lib/libR.so)\nframe #34: Rf_eval + 0x88 (0x7f1291ad1ab8 in /usr/lib64/R/lib/libR.so)\nframe #35: <unknown function> + 0x15978e (0x7f1291ad378e in /usr/lib64/R/lib/libR.so)\nframe #36: Rf_applyClosure + 0x1a2 (0x7f1291ad45f2 in /usr/lib64/R/lib/libR.so)\nframe #37: <unknown function> + 0x146150 (0x7f1291ac0150 in /usr/lib64/R/lib/libR.so)\nframe #38: Rf_eval + 0x88 (0x7f1291ad1ab8 in /usr/lib64/R/lib/libR.so)\nframe #39: <unknown function> + 0x15978e (0x7f1291ad378e in /usr/lib64/R/lib/libR.so)\nframe #40: Rf_applyClosure + 0x1a2 (0x7f1291ad45f2 in /usr/lib64/R/lib/libR.so)\nframe #41: <unknown function> + 0x146150 (0x7f1291ac0150 in 
/usr/lib64/R/lib/libR.so)\nframe #42: Rf_eval + 0x88 (0x7f1291ad1ab8 in /usr/lib64/R/lib/libR.so)\nframe #43: <unknown function> + 0x15978e (0x7f1291ad378e in /usr/lib64/R/lib/libR.so)\nframe #44: Rf_applyClosure + 0x1a2 (0x7f1291ad45f2 in /usr/lib64/R/lib/libR.so)\nframe #45: <unknown function> + 0x146150 (0x7f1291ac0150 in /usr/lib64/R/lib/libR.so)\nframe #46: Rf_eval + 0x88 (0x7f1291ad1ab8 in /usr/lib64/R/lib/libR.so)\nframe #47: <unknown function> + 0x15978e (0x7f1291ad378e in /usr/lib64/R/lib/libR.so)\nframe #48: Rf_applyClosure + 0x1a2 (0x7f1291ad45f2 in /usr/lib64/R/lib/libR.so)\nframe #49: <unknown function> + 0x19db71 (0x7f1291b17b71 in /usr/lib64/R/lib/libR.so)\nframe #50: <unknown function> + 0x19dfb1 (0x7f1291b17fb1 in /usr/lib64/R/lib/libR.so)\nframe #51: <unknown function> + 0x19e377 (0x7f1291b18377 in /usr/lib64/R/lib/libR.so)\nframe #52: <unknown function> + 0x13ce80 (0x7f1291ab6e80 in /usr/lib64/R/lib/libR.so)\nframe #53: Rf_eval + 0x88 (0x7f1291ad1ab8 in /usr/lib64/R/lib/libR.so)\nframe #54: <unknown function> + 0x15978e (0x7f1291ad378e in /usr/lib64/R/lib/libR.so)\nframe #55: Rf_applyClosure + 0x1a2 (0x7f1291ad45f2 in /usr/lib64/R/lib/libR.so)\nframe #56: <unknown function> + 0x146150 (0x7f1291ac0150 in /usr/lib64/R/lib/libR.so)\nframe #57: Rf_eval + 0x88 (0x7f1291ad1ab8 in /usr/lib64/R/lib/libR.so)\nframe #58: <unknown function> + 0x15978e (0x7f1291ad378e in /usr/lib64/R/lib/libR.so)\nframe #59: Rf_applyClosure + 0x1a2 (0x7f1291ad45f2 in /usr/lib64/R/lib/libR.so)\nframe #60: <unknown function> + 0x19db71 (0x7f1291b17b71 in /usr/lib64/R/lib/libR.so)\nframe #61: <unknown function> + 0x19df47 (0x7f1291b17f47 in /usr/lib64/R/lib/libR.so)\nframe #62: <unknown function> + 0x19e377 (0x7f1291b18377 in /usr/lib64/R/lib/libR.so)\nframe #63: <unknown function> + 0x13ce80 (0x7f1291ab6e80 in /usr/lib64/R/lib/libR.so)\n", 
    call = (function (self, output_size, align_corners, scales_h, 
        scales_w) 
    { ... at RcppExports.R#6609
35.
(function (self, output_size, align_corners, scales_h, scales_w) 
{
    .Call("_torch_cpp_torch_namespace_upsample_bilinear2d_self_Tensor_output_size_IntArrayRef_align_corners_bool", 
        PACKAGE = "torchpkg", self, output_size, align_corners,  ... 
34.
do.call(fun, args) at codegen-utils.R#186
33.
do_call(f, args_t[[1]]) at codegen-utils.R#244
32.
call_c_function(fun_name = "upsample_bilinear2d", args = args, 
    expected_types = expected_types, nd_args = nd_args, return_types = return_types, 
    fun_type = "namespace") at gen-namespace.R#16693
31.
torch_upsample_bilinear2d(input, sze, align_corners, sfl[[1]], 
    sfl[[2]]) at nnf-upsampling.R#169
30.
torch::nnf_interpolate(img, size = c(size_h, size_w), mode = mode, 
    align_corners = align_corners) at transforms-tensor.R#150
29.
transform_resize.torch_tensor(img, size, interpolation) at transforms-generics.R#79
28.
transform_resize(img, size, interpolation) at transforms-defaults.R#506
27.
transform_resized_crop.default(img, params[1], params[2], params[3], 
    params[4], size, interpolation) at transforms-generics.R#474
26.
transform_resized_crop(img, params[1], params[2], params[3], 
    params[4], size, interpolation) at transforms-defaults.R#202
25.
transform_random_resized_crop.default(., size = c(224, 224)) at transforms-generics.R#238
24.
transform_random_resized_crop(., size = c(224, 224)) 
23.
function_list[[i]](value) 
22.
freduce(value, `_function_list`) 
21.
`_fseq`(`_lhs`) 
20.
eval(quote(`_fseq`(`_lhs`)), env, env) 
19.
eval(quote(`_fseq`(`_lhs`)), env, env) 
18.
withVisible(eval(quote(`_fseq`(`_lhs`)), env, env)) 
17.
img %>% transform_to_tensor() %>% transform_random_resized_crop(size = c(224, 
    224)) %>% transform_color_jitter() %>% transform_random_horizontal_flip() %>% 
    transform_normalize(mean = c(0.485, 0.456, 0.406), std = c(0.229, 
        0.224, 0.225)) 
16.
self$transform(sample) at folder-dataset.R#96
15.
x$.getitem(y) at utils-data.R#88
14.
`[.dataset`(dataset, possibly_batched_index[[i]]) at utils-data-fetcher.R#30
13.
dataset[possibly_batched_index[[i]]] at utils-data-fetcher.R#30
12.
self$.dataset_fetcher$fetch(index) at utils-data-dataloader.R#236
11.
self$.next_data() at utils-data-dataloader.R#203
10.
parent.env(x)$.iter$.next() at utils-data-enum.R#16
9.
`[[.enum_env`(b, 1) 
8.
b[[1]] 
7.
mget(x = c("input", "weight", "bias", "stride", "padding", "dilation", 
    "groups")) at gen-namespace.R#5112
6.
torch_conv2d(input = input, weight = weight, bias = bias, stride = stride, 
    padding = padding, dilation = dilation, groups = groups) at nnf-conv.R#47
5.
nnf_conv2d(input, weight, self$bias, self$stride, self$padding, 
    self$dilation, self$groups) at nn-conv.R#327
4.
self$conv_forward_(input, self$weight) at nn-conv.R#332
3.
self$conv1(x) at models-resnet.R#223
2.
model(b[[1]]$to(device = "cuda")) 
1.
find_lr() 

skeydan commented 4 years ago

https://github.com/mlverse/torchvision/pull/14