jdb78 / pytorch-forecasting

Time series forecasting with PyTorch
https://pytorch-forecasting.readthedocs.io/
MIT License

DeepAR ONNX Export error #660

Open owoshch opened 3 years ago

owoshch commented 3 years ago

Expected behavior

I executed code https://github.com/jdb78/pytorch-forecasting/blob/master/examples/ar.py

Then I tried to export the model to ONNX:

filepath = "model.onnx"
input_sample = torch.randn((64,20))
deepar.to_onnx(filepath, input_sample, export_params=True)

I've got the following error:

----> 1 deepar.to_onnx(filepath, input_sample, export_params=True)

~/anaconda3/lib/python3.8/site-packages/torch/autograd/grad_mode.py in decorate_context(*args, **kwargs)
     26     def decorate_context(*args, **kwargs):
     27         with self.__class__():
---> 28             return func(*args, **kwargs)
     29     return cast(F, decorate_context)
     30

~/anaconda3/lib/python3.8/site-packages/pytorch_lightning/core/lightning.py in to_onnx(self, file_path, input_sample, **kwargs)
   1893         if "example_outputs" not in kwargs:
   1894             self.eval()
-> 1895             kwargs["example_outputs"] = self(input_sample)
   1896
   1897         torch.onnx.export(self, input_sample, file_path, **kwargs)

~/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

~/anaconda3/lib/python3.8/site-packages/pytorch_forecasting/models/deepar/__init__.py in forward(self, x, n_samples)
    306         Forward network
    307         """
--> 308         hidden_state = self.encode(x)
    309         # decode
    310         input_vector = self.construct_input_vector(

~/anaconda3/lib/python3.8/site-packages/pytorch_forecasting/models/deepar/__init__.py in encode(self, x)
    228         """
    229         # encode using rnn
--> 230         assert x["encoder_lengths"].min() > 0
    231         encoder_lengths = x["encoder_lengths"] - 1
    232         input_vector = self.construct_input_vector(x["encoder_cat"], x["encoder_cont"])

IndexError: too many indices for tensor of dimension 2

Could you please guide me on how to export the model to ONNX and how to get the size of the input tensor of the model? Thank you!
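
A minimal sketch of how to see the input structure the model expects (assuming the train_dataloader built in examples/ar.py): the forward pass takes the dictionary of tensors produced by TimeSeriesDataSet rather than a plain tensor, so printing one batch shows the required keys and shapes.

# sketch: assumes the train_dataloader from examples/ar.py
x, y = next(iter(train_dataloader))
for name, value in x.items():
    print(name, getattr(value, "shape", value))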

JoMaCaCha commented 1 year ago

I solved this problem with the TFT model, but I think it is similar for other models. I recommend that you create the input sample using the training or validation dataloader as a base:

model.eval()
input_dict = {}
items = next(iter(val_dataloader))[0]
for item in items:
  input_dict[item] = items[item][-1:]
input = (input_dict, {})
#input = (next(iter(val_dataloader))[0], {})
torch.onnx.export(model, input, filepath, verbose=True, export_params=True, operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK)
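
Note: the trailing empty dict in (input_dict, {}) matters because torch.onnx.export treats a final dict in args as keyword arguments, so the empty dict keeps input_dict itself as the model's positional input.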

TaiPhamD commented 1 year ago

> I solved this problem with the TFT model, but I think it is similar for other models. I recommend that you create the input sample using the training or validation dataloader as a base:
>
> model.eval()
> input_dict = {}
> items = next(iter(val_dataloader))[0]
> for item in items:
>   input_dict[item] = items[item][-1:]
> input = (input_dict, {})
> #input = (next(iter(val_dataloader))[0], {})
> torch.onnx.export(model, input, filepath, verbose=True, export_params=True, operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK)

How long does it take you to export the TFT model to ONNX? I've tried copying your code and it seems to take a very long time to run. It's been 8 minutes for me and it's still not done.

Also, have you tried using the exported ONNX model by generating the input data manually with NumPy, without relying on the TimeSeriesDataSet dataloader?

I am trying to export this model to ONNX so I can run inference in C++, but I'm having a hard time understanding what type of data is needed and how to shape it properly for the exported ONNX model.

JoMaCaCha commented 1 year ago

I am sharing with you the code I use to export the model to ONNX format, to help you understand the structure of the input data and how to run an inference. In my case I use a dataset with 129,086 records: 101,856 for training, 20,228 for validation and 7,002 for testing. Exporting the model takes me a few minutes. Maybe you are using a large dataset, or one with a complex structure; that could explain the long export time. Manually generating the input data for a large dataset is very tedious, so I recommend splitting your data into small parts and using the smallest.


!pip install pytorch-forecasting==0.9.2
!pip install pytorch-lightning==1.5.10
!pip install onnxruntime
!pip install onnx

import onnx
import torch
import torch.onnx
import numpy as np
import pandas as pd
import onnxruntime as ort
from pytorch_forecasting.data import TorchNormalizer
from pytorch_forecasting import TemporalFusionTransformer, TimeSeriesDataSet

data = pd.read_csv('path_to_data')

max_prediction_length = 1
max_encoder_length = 9

training_cutoff = int(0.8 * data["Time"].max())
validation_cutoff = training_cutoff + int(0.15 * data["Time"].max())

training = TimeSeriesDataSet(
    data[lambda x: x["Time"] <= training_cutoff],
    time_idx="Time",
    target="Target",
    group_ids=["Group"],
    max_encoder_length=max_encoder_length,
    max_prediction_length=max_prediction_length,
    #time_varying_unknown_reals=["A", "B", "C"],
    time_varying_known_reals=["A", "B", "C"],
    target_normalizer=TorchNormalizer(),
    scalers={"A": TorchNormalizer(), "B": TorchNormalizer(), "C": TorchNormalizer()},
)

validation = TimeSeriesDataSet.from_dataset(
    training,
    data[lambda x: x["Time"] <= validation_cutoff], 
    min_prediction_idx=training_cutoff + 1,
    #predict=True, 
    stop_randomization=True
)

test = TimeSeriesDataSet.from_dataset(
    training, 
    data, 
    min_prediction_idx=validation_cutoff + 1,
    #predict=False, 
    stop_randomization=True
)

batch_size = 96
train_dataloader = training.to_dataloader(train=True, batch_size=batch_size, num_workers=4)
val_dataloader = validation.to_dataloader(train=False, batch_size=batch_size, num_workers=4)
test_dataloader = test.to_dataloader(train=False, batch_size=batch_size, num_workers=4)

model_path = "model_path"
model = TemporalFusionTransformer.load_from_checkpoint(model_path)

print(model.dataset_parameters["scalers"]["A"].get_parameters())
print(model.dataset_parameters["scalers"]["B"].get_parameters())
print(model.dataset_parameters["scalers"]["C"].get_parameters())

model.eval()
input_dict = {}
items = next(iter(val_dataloader))[0]
for item in items:
  input_dict[item] = items[item][-1:]
input = (input_dict, {})
#input = (next(iter(val_dataloader))[0], {})
torch.onnx.export(
    model=model, 
    args=input, 
    f="onnx_filepath", 
    verbose=True, 
    export_params=True,
    #input_names=["encoder_data", "input_encoder_lengths", "decoder_data", "input_decoder_lengths", "target_scale"],
    output_names=["prediction", "attention", "static_variables", "encoder_variables", "decoder_variables", "output_decoder_lengths", "output_encoder_lengths"],
    opset_version=14
    #operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK
)

# Load the ONNX model
onnx_model = onnx.load("onnx_filepath")

# Check that the IR is well formed
onnx.checker.check_model(onnx_model)

# Print a human readable representation of the graph
print(onnx.helper.printable_graph(onnx_model.graph))

ort_session = ort.InferenceSession("onnx_filepath")

print("inputs")
inputs = ort_session.get_inputs()
for input in inputs:
  print(input)

print("")
print("outputs")
outputs = ort_session.get_outputs()
for output in outputs:
  print(output)

input = next(iter(test_dataloader))[0]

#print("")
#print("input")
#print(input)

input["encoder_length"] = np.array(input["encoder_lengths"][0]).astype(np.int64)
input["decoder_length"] = np.array(input["decoder_lengths"][0]).astype(np.int64)
input["target_scale"] = [input["target_scale"].numpy().tolist()[0]]
input["x_cont"] = input["encoder_cont"].numpy()[0]

input1 = {}

for key in input:
  if key in ['x_cont', 'encoder_length', 'decoder_length', 'target_scale']:
    if key == "x_cont":
      input1[inputs[0].name] = [input[key][0:9]]
      input1[inputs[2].name] = [input[key][-1:]]
    elif key == "encoder_length":
      input1[inputs[1].name] = [input[key]]
    elif key == "decoder_length":
      input1[inputs[3].name] = [input[key]]
    else:
      input1[item] = input[key]

print("")
print("input1")
print(input1)

prediction = ort_session.run(["prediction"], input1)

print("prediction =", prediction[0][0][0][3])

TaiPhamD commented 1 year ago

@JoMaCaCha thank you so much for that example. It definitely helps. I wish there was a way to save the preprocessing normalization/encoding step of the TimeSeriesDataSet pipeline as a separate ONNX preprocessing model, like we can do with a scikit-learn transformer pipeline: http://onnx.ai/sklearn-onnx/auto_examples/plot_complex_pipeline.html.

I am going to study the normalization/encoding steps and try to convert them to a scikit-learn column-transformer pipeline, so that I can chain the ONNX preprocessing with the main model ONNX for inference in production (or even try to combine the two ONNX models into one).
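
For reference, a minimal sketch of that idea (assuming the three real-valued columns "A", "B", "C" from the script above and the skl2onnx package; the column indices, fit data and file name are placeholders):

import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

# hypothetical preprocessing: scale the three real-valued feature columns
preprocessor = ColumnTransformer([("scale", StandardScaler(), [0, 1, 2])])
preprocessor.fit(np.random.rand(100, 3).astype(np.float32))  # placeholder training data

# convert the fitted scikit-learn transformer to an ONNX graph
onnx_preprocessor = convert_sklearn(
    preprocessor,
    initial_types=[("input", FloatTensorType([None, 3]))],
)
with open("preprocessing.onnx", "wb") as f:
    f.write(onnx_preprocessor.SerializeToString())

The resulting graph could then, in principle, be chained with the main model's ONNX graph (for example with onnx.compose.merge_models), provided the tensor names and shapes are made to line up.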

vipulsharma94 commented 1 year ago

@TaiPhamD @JoMaCaCha, I am facing issues with converting the NHiTS model to ONNX format. I get the error "RuntimeError: NYI: Named tensors are not supported with the tracer". This is the code I have used:

input_dict = {}
items = next(iter(val_dataloader))[0]
for item in items:
  input_dict[item] = items[item][-1:]
input = (input_dict, {})
input = (next(iter(val_dataloader))[0], {})
torch.onnx.export(model, input, os.path.join(modelSavePath, "model.onnx"), verbose=True, export_params=True, operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK)