y_range argument cannot be passed when creating a multi-step TSForecaster

Samcro5C commented 1 year ago

I am trying to constrain the range of the predictions with the y_range argument, but I cannot find a way to set it through the tsai API, even though many models (e.g. TSTPlus) accept it. Neither TSForecaster nor ts_learner accepts y_range as a parameter, which makes sense since both call build_ts_model, which does not accept y_range either. Passing a partial custom_head (e.g. built from create_mlp_head) is not a workaround: build_ts_model passes d to the custom head when it creates it (line 160 in models/utils.py), so the head function would also have to accept a parameter d.

I think this is a bug, since y_range cannot be set anywhere although it is a parameter of many models.

oguiza commented 1 year ago

Hi @Samcro5C, I'm not sure how you are using y_range. Here are 2 examples of how y_range can be used:

from tsai.basics import *
ts = get_forecasting_time_series("Sunspots").values
X, y = SlidingWindow(60, horizon=1)(ts)
splits = TimeSplitter(235)(y) 
batch_tfms = TSStandardize()
dls = get_ts_dls(X, y, splits=splits, path='models', batch_tfms=batch_tfms, bs=512)
fcst = ts_learner(dls, arch="TSTPlus", y_range=(0, 300), metrics=mae, cbs=ShowGraph())
fcst.fit_one_cycle(50, 1e-3)
fcst.export("fcst.pkl")

from tsai.basics import *
ts = get_forecasting_time_series("Sunspots").values
X, y = SlidingWindow(60, horizon=1)(ts)
splits = TimeSplitter(235)(y) 
batch_tfms = TSStandardize()
fcst = TSForecaster(X, y, splits=splits, path='models', batch_tfms=batch_tfms, bs=512, arch="TSTPlus", arch_config=dict(y_range=(0, 300)), metrics=mae, cbs=ShowGraph())
fcst.fit_one_cycle(50, 1e-3)
fcst.export("fcst.pkl")

In either case you can confirm the y_range is being applied by printing fcst.head.

Samcro5C commented 1 year ago

Thanks for the fast reply. I tried your code snippet. First of all, I could not download the Sunspots dataset (the request returned HTTP 404), but that is a separate issue. I generated some random data to get the code running, and in this example I still do not see any effect of the y_range parameter: I expected the head to include a Sigmoid layer, which it does not. I am on tsai 0.3.1 (the same holds for 0.3.5).

oguiza commented 1 year ago

The datasets are not stored in tsai itself; they are downloaded from different external sources. Occasionally a server is down and the data cannot be downloaded. As for the tsai version, I always recommend using the latest one available. It's strange that you are seeing this issue with tsai 0.3.5. I've tested it with that version, and this is the output I get when printing fcst.head:

Sequential(
  (0): GELU(approximate='none')
  (1): fastai.layers.Flatten(full=False)
  (2): LinBnDrop(
    (0): Linear(in_features=7680, out_features=1, bias=True)
  )
  (3): fastai.layers.SigmoidRange(low=0, high=300)
)

Could you please share the output of running this:

from tsai.imports import my_setup
my_setup()

Samcro5C commented 1 year ago

Okay, I checked again with your example, and it does behave differently from my run with random data. Since I want to create multi-step forecasts, my y has more than one dimension. As a result, dls.d is not None when it is passed to build_ts_model, so build_ts_model creates a custom head (as stated in the original post) and adds it to kwargs. See here:

# %% ../../nbs/030_models.utils.ipynb 13
def build_ts_model(arch, c_in=None, c_out=None, seq_len=None, d=None, dls=None, device=None, verbose=False, 
                   pretrained=False, weights_path=None, exclude_head=True, cut=-1, init=None, arch_config={}, **kwargs):

    device = ifnone(device, default_device())
    if dls is not None:
        c_in = ifnone(c_in, dls.vars)
        c_out = ifnone(c_out, dls.c)
        seq_len = ifnone(seq_len, dls.len)
        d = ifnone(d, dls.d)
    if d and not 'patchtst' in arch.__name__.lower(): 
        if 'custom_head' not in kwargs.keys(): 
            if "rocket" in arch.__name__.lower():
                kwargs['custom_head'] = partial(rocket_nd_head, d=d)
            elif "xresnet1d" in arch.__name__.lower():
                kwargs["custom_head"] = partial(xresnet1d_nd_head, d=d)
            else:
                kwargs['custom_head'] = partial(lin_nd_head, d=d)
        elif not isinstance(kwargs['custom_head'], nn.Module):
            kwargs['custom_head'] = partial(kwargs['custom_head'], d=d)
    if 'ltsf_' in arch.__name__.lower() or 'patchtst' in arch.__name__.lower():
        pv(f'arch: {arch.__name__}(c_in={c_in} c_out={c_out} seq_len={seq_len} pred_dim={d} arch_config={arch_config}, kwargs={kwargs})', verbose)
        model = (arch(c_in=c_in, c_out=c_out, seq_len=seq_len, pred_dim=d, **arch_config, **kwargs)).to(device=device)
    elif sum([1 for v in ['RNN_FCN', 'LSTM_FCN', 'RNNPlus', 'LSTMPlus', 'GRUPlus', 'InceptionTime', 'TSiT', 'Sequencer', 'XceptionTimePlus',
                        'GRU_FCN', 'OmniScaleCNN', 'mWDN', 'TST', 'XCM', 'MLP', 'MiniRocket', 'InceptionRocket', 'ResNetPlus', 
                        'RNNAttention', 'LSTMAttention', 'GRUAttention']
            if v in arch.__name__]):
        pv(f'arch: {arch.__name__}(c_in={c_in} c_out={c_out} seq_len={seq_len} arch_config={arch_config} kwargs={kwargs})', verbose)
        model = arch(c_in, c_out, seq_len=seq_len, **arch_config, **kwargs).to(device=device)

What I want is a y_range applied to every individual forecast value, i.e. to each lead time. I am not sure whether this is deliberately unsupported, but I see no reason why it should not work. The following snippet reproduces a head that has no y_range even though y_range is passed to ts_learner:

from tsai.basics import *
ts = get_forecasting_time_series("Sunspots").values.astype(np.float32)
X, y = SlidingWindow(60, horizon=3)(ts)
splits = TimeSplitter(235)(y) 
batch_tfms = TSStandardize()
dls = get_ts_dls(X, y, splits=splits, path='models', batch_tfms=batch_tfms, bs=512)
fcst = ts_learner(dls, arch="TSTPlus", y_range=(0, 300), metrics=mae, cbs=ShowGraph())
fcst.head
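The control flow described above can be modelled in a few lines of plain Python. This is only a sketch: `default_head`, `nd_head`, `model` and `build` are hypothetical stand-ins, not tsai functions, and it assumes the model applies y_range only when it builds its own default head.

```python
from functools import partial

def default_head(n_in, n_out, seq_len, y_range=None):
    """Stand-in for a model's built-in head, which honours y_range."""
    return {"kind": "default", "y_range": y_range}

def nd_head(n_in, n_out, seq_len, d=None):
    """Stand-in for an n-d head that has no y_range parameter."""
    return {"kind": "nd", "d": d, "y_range": None}

def model(custom_head=None, y_range=None):
    # ASSUMPTION: y_range is only used when the model builds its own head.
    if custom_head is not None:
        return custom_head(128, 1, 60)
    return default_head(128, 1, 60, y_range=y_range)

def build(d=None, **kwargs):
    """Mimics the quoted branch: inject an n-d custom head whenever d is set."""
    if d and "custom_head" not in kwargs:
        kwargs["custom_head"] = partial(nd_head, d=d)
    return model(**kwargs)

single_step = build(y_range=(0, 300))      # d is None: default head is used
multi_step = build(d=3, y_range=(0, 300))  # d is set: n-d head replaces it
print(single_step)
print(multi_step)
```

Under that assumption, the single-step call keeps the range while the multi-step call silently loses it, which matches the behaviour reported here.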

oguiza commented 1 year ago

Hi @Samcro5C, I understand what you're trying to do, and it makes sense. However, the current lin_nd_head doesn't accept y_range. This could be added as an enhancement to lin_nd_head. Are there any additional arguments you foresee might be needed? I'll see how difficult it is to make this work (all the Plus models will need to be updated). In the meantime, you may want to create your own custom head:

from tsai.basics import *
from tsai.models.layers import *

def lin_nd_head_with_y_range(n_in, n_out, seq_len=None, d=None, flatten=False, use_bn=False, fc_dropout=0., y_range=None):
    layers = [lin_nd_head(n_in, n_out, seq_len=seq_len, d=d, flatten=flatten, use_bn=use_bn, fc_dropout=fc_dropout)]
    if y_range is not None:
        layers += [SigmoidRange(*y_range)]
    return nn.Sequential(*layers)

xb = torch.randn(8, 32, 50)
head = lin_nd_head_with_y_range(32, 1, 50, (20, 3), y_range=(-40, 350)) # remember to always leave some margin
output = head(xb)
output.shape, output.min(), output.max()
# output: (torch.Size([8, 20, 3]), tensor(11.8317, grad_fn=<MinBackward1>), tensor(298.8748, grad_fn=<MaxBackward1>))

You could pass this layer this way:

custom_head = partial(lin_nd_head_with_y_range, y_range=(0, 300))
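A partial like this composes with the `elif` branch of the build_ts_model source quoted earlier in the thread, which wraps any non-Module custom_head in a second partial that adds d. The sketch below uses hypothetical stand-in factories (`head_with_d`, `head_without_d`) rather than real tsai heads, and only illustrates how the keyword arguments accumulate and why a factory without a d parameter fails.

```python
from functools import partial

def head_with_d(n_in, n_out, seq_len=None, d=None, y_range=None):
    """Hypothetical head factory that accepts d; it records what it receives."""
    return dict(n_in=n_in, n_out=n_out, seq_len=seq_len, d=d, y_range=y_range)

def head_without_d(n_in, n_out, seq_len=None, y_range=None):
    """Hypothetical head factory with no d parameter."""
    return dict(n_in=n_in, n_out=n_out, seq_len=seq_len, y_range=y_range)

# What the user passes in: a partial that fixes y_range.
user_head = partial(head_with_d, y_range=(0, 300))

# What the quoted branch does to it: partial(kwargs['custom_head'], d=d).
wrapped = partial(user_head, d=3)
result = wrapped(128, 1, 60)  # both y_range and d reach the factory

# A factory that does not accept d cannot be wrapped the same way.
try:
    partial(partial(head_without_d, y_range=(0, 300)), d=3)(128, 1, 60)
    failed = False
except TypeError:
    failed = True

print(result, failed)
```

This is also why the partial of create_mlp_head mentioned in the original post did not work: per that post, the function has no d parameter to receive the injected value.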

Note: any code you use for training must also be available at inference time. If you define lin_nd_head_with_y_range in a training script, you must also define it in the inference script.
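On the "leave some margin" comment in the head example: fastai's SigmoidRange computes sigmoid(x) * (high - low) + low, so the bounds themselves are never reached and targets close to them require very large raw outputs. The following standard-library sketch illustrates this; `sigmoid_range` and `required_logit` are illustrative helper names, not part of tsai or fastai.

```python
import math

def sigmoid_range(x, low, high):
    """Map a raw output x into the open interval (low, high)."""
    return low + (high - low) / (1.0 + math.exp(-x))

def required_logit(target, low, high):
    """Raw output needed for sigmoid_range to return exactly `target`."""
    p = (target - low) / (high - low)
    return math.log(p / (1.0 - p))

# A target near the top of a tight range needs a large raw output...
tight = required_logit(299.0, 0, 300)
# ...while a padded range reaches the same target with a smaller one.
padded = required_logit(299.0, -40, 350)

print(f"tight range:  raw output {tight:.2f}")
print(f"padded range: raw output {padded:.2f}")
```

Padding the range keeps realistic targets away from the saturated ends of the sigmoid, where gradients become very small.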

Samcro5C commented 1 year ago

Thank you so much. Right now I do not see the need for other arguments for my use case, but it may make sense to check which arguments are available for creating heads in the Plus models. Thank you also for the workaround; I will try it out and let you know.

timeseriesAI / tsai

y_range argument cannot be passed when creating a multi-step TSForecaster #726