timeseriesAI / tsai

Time series Timeseries Deep Learning Machine Learning Python Pytorch fastai | State-of-the-art Deep Learning library for Time Series and Sequences in Pytorch / fastai
https://timeseriesai.github.io/tsai/
Apache License 2.0

fix for issue #847, TSMultiLabelClassification model head shape auto … #855

Closed cversek closed 7 months ago

cversek commented 8 months ago

…configuration

@oguiza This fixes the bug introduced in commit 9caff8f by reverting to the previous code.

This bug was found by using git bisect to hunt down the offending commit between tags 0.3.5 (good) and 0.3.6 (bad).

Since I don't understand the subtleties of why you made this change in the first place, I'm submitting this as a draft PR.
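
For readers unfamiliar with the bisect step mentioned above, here is a minimal Python sketch of the binary search that `git bisect` performs between a known-good and a known-bad revision. It illustrates the idea and is not git's implementation. The function name `first_bad_commit`, the `is_bad` predicate, and the placeholder hashes in `history` are assumptions made for illustration; only the tags 0.3.5 and 0.3.6 and commit 9caff8f come from this thread.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is True.

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and every commit after the first bad one is
    also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the first bad commit is at mid or earlier
        else:
            lo = mid  # the first bad commit is after mid
    return commits[hi]


# Hypothetical history: the hashes other than 9caff8f are placeholders.
history = ["0.3.5", "aaaaaaa", "bbbbbbb", "9caff8f", "ccccccc", "0.3.6"]
bad_from = history.index("9caff8f")

# In practice is_bad would check out the commit and run the failing test.
culprit = first_bad_commit(history, lambda c: history.index(c) >= bad_from)
print(culprit)  # 9caff8f
```

Each iteration halves the range, so the number of test runs grows with the logarithm of the number of commits between the two tags.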

cversek commented 7 months ago

As far as I can tell so far, this PR breaks other tests that are run by nbdev_prepare. Specifically:

AssertionError in /home/cversek/gitwork/cversek/tsai/nbs/006_data.core.ipynb:
===========================================================================

While Executing Cell #104:
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[1], line 23
     21 dls.decoder(yb)
     22 dls.decoder(yb[0])
---> 23 test_eq((dls.cat, dls.c), (True, 5))
     24 test_ne(dls.cws.cpu().numpy(), None)

File ~/mambaforge/envs/neurovep_data/lib/python3.11/site-packages/fastcore/test.py:37, in test_eq(a, b)
     35 def test_eq(a,b):
     36     "`test` that `a==b`"
---> 37     test(a,b,equals, cname='==')

File ~/mambaforge/envs/neurovep_data/lib/python3.11/site-packages/fastcore/test.py:27, in test(a, b, cmp, cname)
     25 "`assert` that `cmp(a,b)`; display inputs and `cname or cmp.__name__` if it fails"
     26 if cname is None: cname=cmp.__name__
---> 27 assert cmp(a,b),f"{cname}:\n{a}\n{b}"

AssertionError: ==:
(True, 1)
(True, 5)

/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [0,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [1,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [2,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [3,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [4,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [5,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [6,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [7,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [8,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [9,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [12,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [13,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [16,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [17,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [18,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [19,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [20,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [21,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [22,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [23,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [24,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [27,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [29,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1682343995622/work/aten/src/ATen/native/cuda/Loss.cu:240: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [30,0,0] Assertion `t >= 0 && t < n_classes` failed.

and

AssertionError in /home/cversek/gitwork/cversek/tsai/nbs/057_models.MINIROCKETPlus_Pytorch.ipynb:
===========================================================================

While Executing Cell #15:
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[1], line 10
      8 model = MiniRocketPlus(dls.vars, dls.c, dls.len, custom_head=custom_head)
      9 xb,yb = dls.one_batch()
---> 10 test_eq(model.to(xb.device)(xb).shape[1:], y.shape[1:]+(4,))

File ~/mambaforge/envs/neurovep_data/lib/python3.11/site-packages/fastcore/test.py:37, in test_eq(a, b)
     35 def test_eq(a,b):
     36     "`test` that `a==b`"
---> 37     test(a,b,equals, cname='==')

File ~/mambaforge/envs/neurovep_data/lib/python3.11/site-packages/fastcore/test.py:27, in test(a, b, cmp, cname)
     25 "`assert` that `cmp(a,b)`; display inputs and `cname or cmp.__name__` if it fails"
     26 if cname is None: cname=cmp.__name__
---> 27 assert cmp(a,b),f"{cname}:\n{a}\n{b}"

AssertionError: ==:
torch.Size([1, 100])
(1, 100, 4)

...and a few others that I assume fail for the same reason. So I will go back to commit 9caff8f and see if I can craft some logic that passes these tests while also fixing the TSMultiLabelClassification model.
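
One plausible reading of the CUDA assertions in the first failure, sketched in plain Python: the kernel requires every integer target `t` to satisfy `0 <= t < n_classes`, and the failing test shows `dls.c` reported as 1 where 5 was expected. The helper `out_of_range_targets` and the label values are hypothetical; only the condition and the numbers 1 and 5 come from the logs above.

```python
def out_of_range_targets(targets, n_classes):
    """Return the targets that violate 0 <= t < n_classes."""
    return [t for t in targets if not (0 <= t < n_classes)]


# Hypothetical integer labels for a 5-class problem.
targets = [0, 1, 2, 3, 4]

# With the expected class count, every label is a valid index.
print(out_of_range_targets(targets, n_classes=5))  # []

# With the class count the failing test reported, most labels are invalid.
print(out_of_range_targets(targets, n_classes=1))  # [1, 2, 3, 4]
```

If this reading is right, the loss assertions are a downstream symptom of the wrong class count rather than a separate bug.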

cversek commented 7 months ago

OK, so we had a false start. While changing the code around commit 9caff8f did get TSMultiLabelClassification to build a model with a compatible shape, the change broke a lot of other things, so it was reverted.

cversek commented 7 months ago

I'm closing this pull request because I don't yet have a satisfactory solution to issue #847; see the ongoing discussion there.