[FEAT] Add BiTCN model

Disclaimer: I am the author of this model.

This PR adds BiTCN, a parameter-efficient univariate forecasting model with support for exogenous variables, based on two temporal convolutional networks. Its performance is approximately the same as NHITS, typically with two orders of magnitude fewer parameters. The model may therefore be useful in memory-constrained settings or, for example, when only CPU training is available. It also has only two hyperparameters (hidden_size and dropout), which makes it an easy-to-use model for our users.

Added BiTCN and AutoBiTCN models to NeuralForecast.
Included BiTCN as a second model in the Exogenous example (next to NHITS).

Note: there are some changes with respect to the original implementation / paper, mostly for convenience of use in our library:

No embedding layer.
Direct forecasts instead of autoregressive forecasts, i.e. some dense layers were added at the end.
Removed weight normalization from the convolutions.
Removed kernel_size as a hyperparameter, as it does not really need tuning; dropping it makes the model easier to use.
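
To make the second change concrete, the sketch below contrasts autoregressive and direct forecasting with toy stand-ins for a trained network. The functions here (autoregressive_forecast, direct_forecast, mean_one_step, make_mean_multi_step) are hypothetical helpers written for this illustration and are not part of NeuralForecast or of the BiTCN implementation.

```python
from typing import Callable, List


def autoregressive_forecast(
    history: List[float],
    one_step: Callable[[List[float]], float],
    horizon: int,
) -> List[float]:
    """Roll a one-step model forward, feeding each prediction back as input."""
    window = list(history)
    forecasts = []
    for _ in range(horizon):
        y_hat = one_step(window)
        forecasts.append(y_hat)
        # Slide the window: drop the oldest value, append the new prediction.
        window = window[1:] + [y_hat]
    return forecasts


def direct_forecast(
    history: List[float],
    multi_step: Callable[[List[float]], List[float]],
) -> List[float]:
    """Produce the whole horizon in a single call, with no feedback loop."""
    return multi_step(history)


# Toy stand-ins for trained networks (hypothetical, for illustration only).
def mean_one_step(window: List[float]) -> float:
    return sum(window) / len(window)


def make_mean_multi_step(horizon: int) -> Callable[[List[float]], List[float]]:
    def multi_step(window: List[float]) -> List[float]:
        level = sum(window) / len(window)
        return [level] * horizon
    return multi_step


history = [1.0, 2.0, 3.0]
ar = autoregressive_forecast(history, mean_one_step, horizon=3)
direct = direct_forecast(history, make_mean_multi_step(3))
```

In the autoregressive variant, every step after the first depends on earlier predictions, so errors can compound and inference needs one model call per step. The direct variant maps the input window to all horizon steps at once, which is the role the added dense layers play in this PR.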

Nixtla / neuralforecast

[FEAT] Add BiTCN model #958