IcewineChen opened this issue 5 years ago
I have the same issue. Have you solved this problem?
I am also seeing a similar issue in one of my experiments. Can you give me some pointers on how you overcame it, @IcewineChen?
Regards
@uvaidya @IcewineChen can you check if this still reproduces on 1.0.0 stable, or on pytorch nightly build? I believe we fixed this now.
@soumith Sorry to bother you. I have tested it, but the program still gets an error. I'm sure I use cuda() to move the tensor to the GPU, and the module has been moved to the GPU as well. But with both the pytorch-nightly build (dev 11.28) and the PyTorch 1.0 stable release, the error looks like this:
terminate called after throwing an instance of 'c10::Error'
what(): expected type CUDAFloatType but got CPUFloatType (compute_types at /pytorch/aten/src/ATen/native/TensorIterator.cpp:134)
frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7ff6e60f5d31 in /home/chr/action-sdk/libs/libtorch/lib/libc10.so)
frame #1: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x2a (0x7ff6e60f55fa in /home/chr/action-sdk/libs/libtorch/lib/libc10.so)
frame #2: at::TensorIterator::compute_types() + 0x3b5 (0x7ff7081aa055 in /home/chr/action-sdk/libs/libtorch/lib/libcaffe2.so)
frame #3: at::TensorIterator::Builder::build() + 0x46 (0x7ff7081abcc6 in /home/chr/action-sdk/libs/libtorch/lib/libcaffe2.so)
frame #4: at::TensorIterator::binary_op(at::Tensor&, at::Tensor const&, at::Tensor const&) + 0x2c4 (0x7ff7081ac634 in /home/chr/action-sdk/libs/libtorch/lib/libcaffe2.so)
frame #5: at::native::add_out(at::Tensor&, at::Tensor const&, at::Tensor const&, c10::Scalar) + 0x77 (0x7ff708099717 in /home/chr/action-sdk/libs/libtorch/lib/libcaffe2.so)
frame #6: at::TypeDefault::add(at::Tensor&, at::Tensor const&, c10::Scalar) const + 0x68 (0x7ff70839f198 in /home/chr/action-sdk/libs/libtorch/lib/libcaffe2.so)
frame #7: torch::autograd::VariableType::add_(at::Tensor&, at::Tensor const&, c10::Scalar) const + 0x1d6 (0x7ff718e6b7d6 in /home/chr/action-sdk/libs/libtorch/lib/libtorch.so.1)
frame #8:
[1] 15839 abort (core dumped) ./action ~/experiment/video-classification-3d-cnn-pytorch/resnet34-ucf101.pt
And here is my tracing code, written in Python:
import torch

# model is the 3D ResNet-34 network, already constructed and loaded with UCF-101 weights
example = torch.rand(size=(1, 3, 64, 112, 112))
traced_script_module = torch.jit.trace(model, example)
traced_script_module.save("resnet34-ucf101.pt")
Could you give me some advice?
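For anyone hitting the same CUDAFloatType/CPUFloatType mismatch when running a traced file like this from C++, here is a minimal sketch of the loading side, assuming the 1.0-era libtorch API where torch::jit::load returns a shared_ptr. The file name and input shape simply mirror the trace above; the two .to(at::kCUDA) calls are the part that usually resolves this error.

#include <torch/script.h>

#include <iostream>
#include <memory>
#include <vector>

int main() {
  // Load the traced module (1.0-era API: returns a shared_ptr to script::Module).
  std::shared_ptr<torch::jit::script::Module> module =
      torch::jit::load("resnet34-ucf101.pt");

  // Move the module's parameters and buffers to the GPU.
  module->to(at::kCUDA);

  // Build the input on the GPU as well; the shape matches the Python tracing example.
  std::vector<torch::jit::IValue> inputs;
  inputs.push_back(torch::rand({1, 3, 64, 112, 112}).to(at::kCUDA));

  // Forward pass; if either the module or the input is still on the CPU,
  // the add_ in the traced graph raises the CUDAFloatType/CPUFloatType error above.
  at::Tensor output = module->forward(inputs).toTensor();
  std::cout << output.sizes() << std::endl;
  return 0;
}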
I met the same problem. Have you solved it, @IcewineChen?
@Sierkinhane @IcewineChen Have you solved it?
🐛 Bug
Thanks for your team's great work! But while using the C++ API of PyTorch on GPU, I ran into some confusing bugs. When I try to load a .pt file as a module and then run a forward pass, I get an exception.
To Reproduce
Here are my code and the exception. The .pt file is generated by torch.jit.trace(model, example).cuda().
My code:
The state of the variables: I have checked that the tensor pushed into the inputs vector is Variable[CUDAFloatType], and that model.pt was generated on CUDA.
I have read the source code and found that module->forward() only accepts a vector of inputs. But the tensors in that vector don't pass the type check in the tensor library. Could you give me some advice on how to make the contents of the vector pass the ATen check? Thank you very much.
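A small aside on that last point, in case it helps: the std::vector<torch::jit::IValue> signature of forward() is not what ATen is complaining about; the check is against the device of the tensor stored inside each IValue. A rough sketch of a pre-forward check, where input_tensor and module are hypothetical stand-ins for the names in the actual code:

// What ATen's type check sees is the tensor inside each IValue, not the vector itself.
std::cout << input_tensor.device() << std::endl;   // prints e.g. "cuda:0" or "cpu"

std::vector<torch::jit::IValue> inputs;
inputs.push_back(input_tensor.to(at::kCUDA));      // make sure the input lives on the GPU
module->to(at::kCUDA);                             // and that the weights do as well
at::Tensor output = module->forward(inputs).toTensor();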
Environment
How you installed PyTorch (conda, pip, source): conda