[BUG] RuntimeError: Expected object of device type cuda but got device type cpu for argument #1 'self' in call to _thnn_conv2d_forward

Eslsamu commented 3 years ago

Describe the bug

The error occurs when calling "historical_forecasts" or "predict" on a fitted TCN model trained on a GPU. It is raised right after the method is called.

To Reproduce

The error occurs for me both on Ubuntu 18 on an AWS Deep Learning AMI instance with a Tesla T4 GPU and CUDA version 11.0, and with a similar setup on Google Colab. In both cases we are using Python 3.7.10.

Darts was installed using "pip install 'u8darts[torch]'". torch.cuda.is_available() returns True.

This bug occurs with my own dataset as well as with the tutorial: https://unit8co.github.io/darts/examples/06-TCN-examples.html

Full error message:

0% 0/39 [00:00<?, ?it/s]

RuntimeError Traceback (most recent call last)

in
      4 forecast_horizon=6,
      5 retrain=False,
----> 6 verbose=True)
      7
      8 ts.plot(label='actual')

~/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/darts/utils/utils.py in sanitized_method(self, *args, **kwargs)
    150
    151 getattr(self, sanity_check_method)(*only_args.values(), **only_kwargs)
--> 152 return method_to_sanitize(self, *only_args.values(), **only_kwargs)
    153 return sanitized_method
    154 return decorator

~/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/darts/models/forecasting_model.py in historical_forecasts(self, series, covariates, num_samples, start, forecast_horizon, stride, retrain, overlap_end, last_points_only, verbose)
    293 self._fit_wrapper(series=train, covariates=train_cov)
    294
--> 295 forecast = self._predict_wrapper(forecast_horizon, train, covariates, num_samples)
    296
    297 if last_points_only:

~/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/darts/models/forecasting_model.py in _predict_wrapper(self, n, series, covariates, num_samples)
    719 def _predict_wrapper(self, n: int, series: TimeSeries, covariates: Optional[TimeSeries],
    720 num_samples: int) -> TimeSeries:
--> 721 return self.predict(n, series, covariates=covariates, num_samples=num_samples)
    722
    723 def _fit_wrapper(self, series: TimeSeries, covariates: Optional[TimeSeries]):

~/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/darts/utils/torch.py in decorator(self, *args, **kwargs)
     63 with fork_rng():
     64 manual_seed(self._random_instance.randint(0, high=MAX_TORCH_SEED_VALUE))
---> 65 return decorated(self, *args, **kwargs)
     66 return decorator

~/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/darts/models/torch_forecasting_model.py in predict(self, n, series, covariates, batch_size, verbose, n_jobs, roll_size, num_samples)
    519 self.is_recurrent)
    520 predictions = self.predict_from_dataset(n, dataset, verbose=verbose, batch_size=batch_size, n_jobs=n_jobs,
--> 521 roll_size=roll_size, num_samples=num_samples)
    522 return predictions[0] if called_with_single_series else predictions
    523

~/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/darts/models/torch_forecasting_model.py in predict_from_dataset(self, n, input_series_dataset, batch_size, verbose, n_jobs, roll_size, num_samples)
    621 batch_prediction = self._predict_batch_recurrent_model(n, input_series, cov_future)
    622 else:
--> 623 batch_prediction = self._predict_batch_block_model(n, input_series, cov_future, roll_size)
    624
    625 # bring predictions into desired format and drop unnecessary values

~/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/darts/models/torch_forecasting_model.py in _predict_batch_block_model(self, n, input_series, cov_future, roll_size)
    650
    651 batch_prediction = []
--> 652 out = self._produce_predict_output(input_series)[:, self.first_prediction_index:, :]
    653 batch_prediction.append(out[:, :roll_size, :])
    654 prediction_length = roll_size

~/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/darts/utils/torch.py in decorator(self, *args, **kwargs)
     63 with fork_rng():
     64 manual_seed(self._random_instance.randint(0, high=MAX_TORCH_SEED_VALUE))
---> 65 return decorated(self, *args, **kwargs)
     66 return decorator

~/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/darts/models/tcn_model.py in _produce_predict_output(self, input)
    301 return self.likelihood._sample(output)
    302 else:
--> 303 return self.model(input)
    304
    305 @property

~/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    887 result = self._slow_forward(*input, **kwargs)
    888 else:
--> 889 result = self.forward(*input, **kwargs)
    890 for hook in itertools.chain(
    891 _global_forward_hooks.values(),

~/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/darts/models/tcn_model.py in forward(self, x)
    199
    200 for res_block in self.res_blocks_list:
--> 201 x = res_block(x)
    202
    203 x = x.transpose(1, 2)

~/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    887 result = self._slow_forward(*input, **kwargs)
    888 else:
--> 889 result = self.forward(*input, **kwargs)
    890 for hook in itertools.chain(
    891 _global_forward_hooks.values(),

~/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/darts/models/tcn_model.py in forward(self, x)
     94 left_padding = (self.dilation_base ** self.nr_blocks_below) * (self.kernel_size - 1)
     95 x = F.pad(x, (left_padding, 0))
---> 96 x = self.dropout_fn(F.relu(self.conv1(x)))
     97
     98 # second step

~/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    887 result = self._slow_forward(*input, **kwargs)
    888 else:
--> 889 result = self.forward(*input, **kwargs)
    890 for hook in itertools.chain(
    891 _global_forward_hooks.values(),

~/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/torch/nn/modules/conv.py in forward(self, input)
    261
    262 def forward(self, input: Tensor) -> Tensor:
--> 263 return self._conv_forward(input, self.weight, self.bias)
    264
    265

~/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight, bias)
    258 _single(0), self.dilation, self.groups)
    259 return F.conv1d(input, weight, bias, self.stride,
--> 260 self.padding, self.dilation, self.groups)
    261
    262 def forward(self, input: Tensor) -> Tensor:

RuntimeError: Expected object of device type cuda but got device type cpu for argument #1 'self' in call to _thnn_conv2d_forward

Eslsamu commented 3 years ago

The same issue happens with the RNN model (I have not tested others).

Possible fix: move the underlying network to the CPU after fitting and before predicting.

    model = TCNModel(...)
    model.fit(...)
    model.model = model.model.to("cpu")
    model.predict(...)
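Both error messages in this thread report that the weights and the input tensor are on different devices. The sketch below uses plain PyTorch, not the darts internals, to show the opposite direction of the workaround above: the weights stay where they are and the input is moved to them. The helper names `module_device` and `forward_on_module_device` are hypothetical, and the `Conv1d` layer is only a stand-in for a trained network.

```python
import torch
import torch.nn as nn


def module_device(module: nn.Module) -> torch.device:
    """Return the device of the module's first parameter (CPU if it has none)."""
    first_param = next(module.parameters(), None)
    return first_param.device if first_param is not None else torch.device("cpu")


def forward_on_module_device(module: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Move the input to wherever the weights live, then run the forward pass."""
    return module(x.to(module_device(module)))


# Stand-in for a trained network; this is not the darts TCN implementation.
net = nn.Conv1d(in_channels=1, out_channels=4, kernel_size=3)
if torch.cuda.is_available():
    net = net.to("cuda")  # weights on the GPU, as after GPU training

x = torch.randn(8, 1, 32)  # input created on the CPU

with torch.no_grad():
    out = forward_on_module_device(net, x)

print(out.shape, out.device)
```

On a machine without a GPU the same code runs entirely on the CPU, so it can be tried anywhere. Calling `.cpu()` on the output is still needed before converting it to NumPy when the weights are on the GPU.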

Eslsamu commented 3 years ago

I still have not found out how to save and load a model and then use it with CUDA.
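For reference, the usual plain-PyTorch pattern for this is sketched below: save the state dict, load it with `map_location="cpu"`, then move the rebuilt network to the target device. This is an assumption-laden illustration and not the darts save/load API, which this thread does not show; `build_net` is a hypothetical stand-in architecture, and with darts the network being moved would presumably be the one held in `model.model`, as in the workaround above.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Use the GPU when there is one, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


def build_net() -> nn.Module:
    """Hypothetical stand-in architecture; not a darts model."""
    return nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))


trained = build_net().to(device)  # pretend this network was just trained

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "weights.pt")
    torch.save(trained.state_dict(), path)
    # Loading onto the CPU first works whether or not a GPU is present.
    state = torch.load(path, map_location="cpu")

restored = build_net()
restored.load_state_dict(state)
restored = restored.to(device).eval()  # move the weights to the target device

x = torch.randn(5, 16, device=device)  # create the input on the same device
with torch.no_grad():
    y = restored(x)

print(y.shape, y.device)
```

The point of the pattern is that the weights and the input are placed on `device` explicitly after loading, instead of relying on whatever device was recorded in the saved file.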

ghost commented 3 years ago

Similar error when attempting to run the following code.

    pred_series = model_nbeats.historical_forecasts(
        series,
        start=pd.Timestamp('20191010'),
        forecast_horizon=7,
        stride=5,
        retrain=False,
        verbose=True
    )
    display_forecast(pred_series, series['0'], '7 day', start_date=pd.Timestamp('20191010'))

RuntimeError Traceback (most recent call last)

in
----> 1 pred_series = model_nbeats.historical_forecasts(
      2 series,
      3 start=pd.Timestamp('20191010'),
      4 forecast_horizon=7,
      5 stride=5,

~/.local/lib/python3.9/site-packages/darts/utils/utils.py in sanitized_method(self, *args, **kwargs)
    150
    151 getattr(self, sanity_check_method)(*only_args.values(), **only_kwargs)
--> 152 return method_to_sanitize(self, *only_args.values(), **only_kwargs)
    153 return sanitized_method
    154 return decorator

~/.local/lib/python3.9/site-packages/darts/models/forecasting_model.py in historical_forecasts(self, series, covariates, num_samples, start, forecast_horizon, stride, retrain, overlap_end, last_points_only, verbose)
    293 self._fit_wrapper(series=train, covariates=train_cov)
    294
--> 295 forecast = self._predict_wrapper(forecast_horizon, train, covariates, num_samples)
    296
    297 if last_points_only:

~/.local/lib/python3.9/site-packages/darts/models/forecasting_model.py in _predict_wrapper(self, n, series, covariates, num_samples)
    719 def _predict_wrapper(self, n: int, series: TimeSeries, covariates: Optional[TimeSeries],
    720 num_samples: int) -> TimeSeries:
--> 721 return self.predict(n, series, covariates=covariates, num_samples=num_samples)
    722
    723 def _fit_wrapper(self, series: TimeSeries, covariates: Optional[TimeSeries]):

~/.local/lib/python3.9/site-packages/darts/utils/torch.py in decorator(self, *args, **kwargs)
     63 with fork_rng():
     64 manual_seed(self._random_instance.randint(0, high=MAX_TORCH_SEED_VALUE))
---> 65 return decorated(self, *args, **kwargs)
     66 return decorator

~/.local/lib/python3.9/site-packages/darts/models/torch_forecasting_model.py in predict(self, n, series, covariates, batch_size, verbose, n_jobs, roll_size, num_samples)
    518 dataset = SimpleInferenceDataset(series, covariates, n, self.input_chunk_length, self.output_chunk_length,
    519 self.is_recurrent)
--> 520 predictions = self.predict_from_dataset(n, dataset, verbose=verbose, batch_size=batch_size, n_jobs=n_jobs,
    521 roll_size=roll_size, num_samples=num_samples)
    522 return predictions[0] if called_with_single_series else predictions

~/.local/lib/python3.9/site-packages/darts/models/torch_forecasting_model.py in predict_from_dataset(self, n, input_series_dataset, batch_size, verbose, n_jobs, roll_size, num_samples)
    621 batch_prediction = self._predict_batch_recurrent_model(n, input_series, cov_future)
    622 else:
--> 623 batch_prediction = self._predict_batch_block_model(n, input_series, cov_future, roll_size)
    624
    625 # bring predictions into desired format and drop unnecessary values

~/.local/lib/python3.9/site-packages/darts/models/torch_forecasting_model.py in _predict_batch_block_model(self, n, input_series, cov_future, roll_size)
    650
    651 batch_prediction = []
--> 652 out = self._produce_predict_output(input_series)[:, self.first_prediction_index:, :]
    653 batch_prediction.append(out[:, :roll_size, :])
    654 prediction_length = roll_size

~/.local/lib/python3.9/site-packages/darts/models/torch_forecasting_model.py in _produce_predict_output(self, input)
    803
    804 def _produce_predict_output(self, input):
--> 805 return self.model(input)
    806
    807 def _evaluate_validation_loss(self, val_loader: DataLoader):

~/.local/lib/python3.9/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051 return forward_call(*input, **kwargs)
   1052 # Do not call functions when jit is used
   1053 full_backward_hooks, non_full_backward_hooks = [], []

~/.local/lib/python3.9/site-packages/darts/models/nbeats.py in forward(self, x)
    339 for stack in self.stacks_list:
    340 # compute stack output
--> 341 stack_residual, stack_forecast = stack(x)
    342
    343 # add stack forecast to final output

~/.local/lib/python3.9/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051 return forward_call(*input, **kwargs)
   1052 # Do not call functions when jit is used
   1053 full_backward_hooks, non_full_backward_hooks = [], []

~/.local/lib/python3.9/site-packages/darts/models/nbeats.py in forward(self, x)
    213 for block in self.blocks_list:
    214 # pass input through block
--> 215 x_hat, y_hat = block(x)
    216
    217 # add block forecast to stack forecast

~/.local/lib/python3.9/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051 return forward_call(*input, **kwargs)
   1052 # Do not call functions when jit is used
   1053 full_backward_hooks, non_full_backward_hooks = [], []

~/.local/lib/python3.9/site-packages/darts/models/nbeats.py in forward(self, x)
    133 # fully connected layer stack
    134 for layer in self.linear_layer_stack_list:
--> 135 x = self.relu(layer(x))
    136
    137 # forked linear layers producing waveform generator parameters

~/.local/lib/python3.9/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051 return forward_call(*input, **kwargs)
   1052 # Do not call functions when jit is used
   1053 full_backward_hooks, non_full_backward_hooks = [], []

~/.local/lib/python3.9/site-packages/torch/nn/modules/linear.py in forward(self, input)
     94
     95 def forward(self, input: Tensor) -> Tensor:
---> 96 return F.linear(input, self.weight, self.bias)
     97
     98 def extra_repr(self) -> str:

~/.local/lib/python3.9/site-packages/torch/nn/functional.py in linear(input, weight, bias)
   1845 if has_torch_function_variadic(input, weight):
   1846 return handle_torch_function(linear, (input, weight), input, weight, bias=bias)
-> 1847 return torch._C._nn.linear(input, weight, bias)
   1848
   1849

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking arugment for argument mat1 in method wrapper_addmm)

ghost commented 3 years ago

A possible solution from TTS https://github.com/lexkoro/TTS/commit/47d74ced1c0cc462590a51817bbc31617ce97a30

hrzn commented 3 years ago

Could you try to pip install -U darts (updating to 0.9.1) and let us know if the problem persists?

ghost commented 3 years ago

No more errors, thanks!

owoshch commented 3 years ago

I installed the latest darts with all packages from conda and got the same error. I tried to reinstall darts via "pip install -U darts", but it didn't solve the issue. Do you have any recommendations or ready-to-use conda environments where this bug is solved?
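When conda and pip installs are mixed, it can help to confirm which versions the active environment actually sees before retrying. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+, so not the Python 3.7 setup from the original report); the helper name `installed_versions` is hypothetical. The distribution names "darts" and "u8darts" are taken from the pip commands quoted in this thread, and whether either is present in a given environment is exactly what the script checks.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names as they appear in the install commands in this thread.
report = installed_versions(["darts", "u8darts", "torch"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

Running this with the same interpreter that raises the error shows whether the upgrade landed in that environment or in a different one.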

unit8co / darts

[BUG] RuntimeError: Expected object of device type cuda but got device type cpu for argument #1 'self' in call to _thnn_conv2d_forward #381