timeseriesAI / tsai

State-of-the-art Deep Learning library for Time Series and Sequences in Pytorch / fastai
https://timeseriesai.github.io/tsai/

Failing Tutorial notebooks #55

Closed · williamsdoug closed this issue 3 years ago

williamsdoug commented 3 years ago

I've been using the notebooks in tsai/tutorial_nbs/ as a form of tsai installation verification. For each notebook that fails on my local system, I have reproduced the error on Google Colab in both the stable and unstable configurations. Since a number of notebooks fail, I will add a separate reply to this issue for each failing notebook. Note: none of these failing notebooks are blockers for me, but I thought it would be good to have these errors logged for future debugging purposes.

Colab Configuration - Stable: tsai 0.2.14, fastai 2.2.5, fastcore 1.3.19, torch 1.7.0+cu101

Colab Configuration - Unstable (master): tsai 0.2.15, fastai 2.2.5, fastcore 1.3.19, torch 1.7.0+cu101
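For reference, the same version information can be captured with a few lines of plain Python (a generic sketch; it only assumes the four packages are importable in the environment being tested):

```python
# Print the versions of the packages compared in the configurations above.
import tsai, fastai, fastcore, torch

for mod in (tsai, fastai, fastcore, torch):
    print(f'{mod.__name__:<9}: {mod.__version__}')
```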

williamsdoug commented 3 years ago

Failing notebook tsai/tutorial_nbs/00c_Time_Series_data_preparation.ipynb

Fails in cell 43:

Code:

X, y = df2xy(df, feat_col='feature', target_col='target', data_cols=slice(2, -1))
splits = get_splits(y, valid_size=.2, stratify=True, random_state=23, shuffle=True)
tfms = [None, [Categorize()]]
dsets = TSDatasets(X, y, tfms=tfms, splits=splits)
dsets

Error Message:

TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>()
----> 1 X, y = df2xy(df, feat_col='feature', target_col='target', data_cols=slice(2, -1))
      2 splits = get_splits(y, valid_size=.2, stratify=True, random_state=23, shuffle=True)
      3 tfms = [None, [Categorize()]]
      4 dsets = TSDatasets(X, y, tfms=tfms, splits=splits)
      5 dsets

/usr/local/lib/python3.6/dist-packages/tsai/data/preparation.py in df2xy(df, feat_col, target_col, data_cols, to3d, splits)
     22     if data_cols is None: data_cols = [col for col in df.columns if col not in no_data_cols]
     23     n_feats = len(df[feat_col].unique()) if feat_col is not None else 1
---> 24     data = df.loc[:, data_cols].values
     25     _, seq_len = data.shape
     26     if to3d: X = data.reshape(n_feats, -1, seq_len).transpose(1, 0, 2)

[... pandas indexing frames elided ...]

TypeError: cannot do slice indexing on Index with these indexers [2] of type int
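Until the notebook is updated, one possible workaround (a sketch, not the official fix) is to pass explicit column labels instead of an integer slice, since df2xy selects columns with label-based .loc internally. The DataFrame below is a made-up stand-in for the tutorial's df:

```python
import numpy as np
import pandas as pd
from tsai.data.preparation import df2xy  # module path taken from the traceback above

# Hypothetical wide-format frame mirroring the tutorial's layout: a feature id
# column, a target column, and one column per time step.
df = pd.DataFrame({'feature': ['f1'] * 4,
                   'target':  ['a', 'a', 'b', 'b'],
                   **{f't{i}': np.random.rand(4) for i in range(5)}})

# Workaround: pass column labels rather than an integer slice; adjust the
# selection to whatever columns hold the time-series values in the real df.
data_cols = [c for c in df.columns if c not in ('feature', 'target')]
X, y = df2xy(df, feat_col='feature', target_col='target', data_cols=data_cols)
print(X.shape, y.shape)
```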
williamsdoug commented 3 years ago

Failing notebook tsai/tutorial_nbs/02_ROCKET_a_new_SOTA_classifier.ipynb

Error in Cell 19

Code:

n_tests = 10
_acc = []
for i in range(n_tests):
    model = create_mlp_head(20_000, dls.c, fc_dropout=0.)
    model.apply(lin_zero_init)
    learn = Learner(dls, model, metrics=accuracy)
    learn.fit_one_cycle(50, lr_max=1e-3)
    _acc.append(learn.recorder.values[-1][-1])
    if i < n_tests - 1: clear_output()
    else: learn.plot_metrics()
print(f'accuracy: {np.mean(_acc):.5f} +/- {np.std(_acc):.5f}')

Error:

TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>()
      2 _acc = []
      3 for i in range(n_tests):
----> 4     model = create_mlp_head(20_000, dls.c, fc_dropout=0.)
      5     model.apply(lin_zero_init)
      6     learn = Learner(dls, model, metrics=accuracy)

TypeError: create_mlp_head() missing 1 required positional argument: 'seq_len'
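The newer create_mlp_head signature apparently requires seq_len as well. A hedged sketch of the adjusted call (how seq_len should be derived in this notebook is an assumption; X_feat below is only a placeholder name for whatever features the dataloaders were built from):

```python
# Assumption: seq_len is the third positional argument, as the error message
# suggests. The value used here is a placeholder; substitute the sequence
# length the notebook's dataloaders actually carry.
seq_len = X_feat.shape[-1]  # placeholder for the notebook's own sequence length
model = create_mlp_head(20_000, dls.c, seq_len, fc_dropout=0.)
```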
oguiza commented 3 years ago

Thanks @williamsdoug for taking this initiative! I'm afraid some of the tutorial nbs are a bit out of date, so I'll go ahead and fix these issues as soon as I can. I'm ok with this single issue for all tutorial nbs. I hope it doesn't become too cluttered.

williamsdoug commented 3 years ago

Failing notebook tsai/tutorial_nbs/06_TS_to_image_classification_dev.ipynb

Error in cells 9, 24, 31

Cell 9 contents:

model = create_model(xresnet34, dls=dls, pretrained=True)
learn = Learner(dls, model, metrics=accuracy)
start = time.time()
learn.fit_one_cycle(epochs, lr_max=1e-3)
print(f"\ntraining time: {time.strftime('%H:%M:%S', time.gmtime(time.time() - start))}")
learn.plot_metrics()

Error:

AssertionError                            Traceback (most recent call last)
<ipython-input> in <module>()
----> 1 model = create_model(xresnet34, dls=dls, pretrained=True)
      2 learn = Learner(dls, model, metrics=accuracy)
      3 start = time.time()
      4 learn.fit_one_cycle(epochs, lr_max=1e-3)
      5 print(f"\ntraining time: {time.strftime('%H:%M:%S', time.gmtime(time.time() - start))}")

/usr/local/lib/python3.6/dist-packages/tsai/models/utils.py in build_ts_model(arch, c_in, c_out, seq_len, d, dls, device, verbose, pretrained, weights_path, exclude_head, **kwargs)
    126
    127     if pretrained:
--> 128         assert weights_path is not None, "you need to pass a valid weights_path to use a pre-trained model"
    129         transfer_weights(model, weights_path, exclude_head=exclude_head, device=device)
    130     return model

AssertionError: you need to pass a valid weights_path to use a pre-trained model
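Based on the assertion above, build_ts_model (which create_model forwards to) now requires an explicit weights_path whenever pretrained=True. A hedged sketch of the two obvious ways around it within this cell (the checkpoint path is a placeholder):

```python
# Option 1: pass a local checkpoint explicitly (placeholder path).
model = create_model(xresnet34, dls=dls, pretrained=True,
                     weights_path='models/xresnet34_pretrained.pth')

# Option 2: drop pretrained=True and train from scratch.
# model = create_model(xresnet34, dls=dls)
```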
williamsdoug commented 3 years ago

Failing notebook tsai/tutorial_nbs/08_Self_Supervised_TSBERT.ipynb

Error in cells 9, 10

Cell 9 contents:

learn.TSBERT.show_preds(sharey=True)

Error:

NameError                                 Traceback (most recent call last)
<ipython-input> in <module>()
----> 1 learn.TSBERT.show_preds(sharey=True)

/usr/local/lib/python3.6/dist-packages/tsai/callback/TSBERT.py in show_preds(self, max_n, nrows, ncols, figsize, sharex, **kwargs)
    140         bs, nvars, seq_len = xb.shape
    141         self.learn('before_batch')
--> 142         pred = learn.model(*self.learn.xb).detach().cpu().numpy()
    143         mask = self.mask.cpu().numpy()
    144         masked_pred = np.ma.masked_where(mask, pred)

NameError: name 'learn' is not defined
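The callback references the global name learn instead of its own self.learn, so it only works when a variable called learn happens to exist in the calling scope. A likely one-line fix inside tsai/callback/TSBERT.py (at the line 142 shown in the traceback):

```python
# Use the callback's own learner reference rather than a global `learn`.
pred = self.learn.model(*self.learn.xb).detach().cpu().numpy()
```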
williamsdoug commented 3 years ago

tsai/tutorial_nbs/03_Time_Series_Transforms.ipynb: This notebook only fails on my local system but runs correctly on Google Colab, most likely due to some package compatibility issue. I'm including it in case the fix is obvious or others encounter similar errors:

Local configuration (fails): tsai 0.2.14, fastai 2.2.5, fastcore 1.3.19, torch 1.7.0, matplotlib 3.3.2

Colab configuration (works): tsai 0.2.14, fastai 2.2.5, fastcore 1.3.19, torch 1.7.0+cu101, matplotlib 3.2.2

Failing cells: 4, 6, 10

Cell 4 contents:

for i in range(100): plt.plot(TSTimeNoise(.5)(xb, split_idx=0)[0].T, color='gainsboro', alpha=.1)
plt.plot(xb[0].T)
plt.show()

Error Message:

TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>
----> 1 for i in range(100): plt.plot(TSTimeNoise(.5)(xb, split_idx=0)[0].T, color='gainsboro', alpha=.1)
      2 plt.plot(xb[0].T)
      3 plt.show()

[... matplotlib, numpy, torch and fastai frames elided ...]

~/anaconda3/envs/tsai2_ref/lib/python3.7/site-packages/torch/tensor.py in __array__(self, dtype)
    630             return self.numpy()
    631         else:
--> 632             return self.numpy().astype(dtype, copy=False)

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
williamsdoug commented 3 years ago

In the case of notebook tsai/tutorial_nbs/03_Time_Series_Transforms.ipynb, the error can be avoided by inserting xb = xb.cpu() at the start of cell 4.

Revised cell 4 contents:

xb = xb.cpu()
for i in range(100): plt.plot(TSTimeNoise(.5)(xb, split_idx=0)[0].T, color='gainsboro', alpha=.1)
plt.plot(xb[0].T)
plt.show()
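If later cells still expect xb to stay on the GPU, an alternative (a sketch under that assumption) is to move only the tensors handed to matplotlib onto the CPU:

```python
# Keep xb on the GPU and copy only the plotted tensors to host memory.
for i in range(100):
    plt.plot(TSTimeNoise(.5)(xb, split_idx=0)[0].cpu().T, color='gainsboro', alpha=.1)
plt.plot(xb[0].cpu().T)
plt.show()
```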

oguiza commented 3 years ago

I believe all of the issues reported above are now fixed using the following configuration:

Colab Configuration: tsai 0.2.15, fastai 2.2.5, fastcore 1.3.19, torch 1.7.0+cu101

I've released a new version in pip (0.2.15).

oguiza commented 3 years ago

I’ll close this issue now, but feel free to reopen it if necessary.

williamsdoug commented 3 years ago

Reran all tutorial notebooks. 06_TS_to_image_classification still fails, but all the others now work. I will open a separate issue for it.