In the tutorial notebook linked below, when I set cpn='CpnU22' in conf = cd.Config(...), the first epoch of training completes, but partway through the second epoch I get the following error:
Epoch 2/100 - loss 12.061: 56%|███████████████████████████████████████████████████▍ | 286/512 [02:23<01:54, 1.98it/s]
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [55,0,0], thread: [64,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
...
...
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [85,0,0], thread: [94,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [85,0,0], thread: [95,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
...
...
...
Traceback (most recent call last):
  File "train2.py", line 231, in <module>
    train_epoch(model, train_loader, conf.device, optimizer, f'Epoch {epoch}/{conf.epochs}', scaler, scheduler)
  File "train2.py", line 212, in train_epoch
    outputs: dict = model(batch['inputs'], targets=batch)
  File "/home/ppriyank/anaconda3/envs/pathak/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ppriyank/covid_cell/celldetection/celldetection/models/cpn.py", line 441, in forward
    buckets = resolve_refinement_buckets(sampling, self.core.refinement_buckets)
  File "/home/ppriyank/covid_cell/celldetection/celldetection/ops/cpn.py", line 203, in resolve_refinement_buckets
    (a % num_buckets, refinement_bucket_weight(a, base_index)),
  File "/home/ppriyank/covid_cell/celldetection/celldetection/ops/cpn.py", line 193, in refinement_bucket_weight
    dist[sel] = 0
RuntimeError: CUDA error: device-side assert triggered
https://github.com/FZJ-INM1-BDA/celldetection/blob/main/demos/Cell%20Detection%20with%20Contour%20Proposal%20Networks.ipynb
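A note on reading the error: CUDA kernels run asynchronously, so the Python line shown in the traceback may not be the call that actually failed. Setting the environment variable CUDA_LAUNCH_BLOCKING=1 makes kernel launches synchronous, and running the same batch on the CPU usually turns the device-side assert into an ordinary Python exception. Below is a minimal sketch of how the assertion's condition (-size <= index < size) could be checked by hand. The helper `out_of_bounds_positions` and the values of `num_buckets` and `bucket_index` are hypothetical, made up for illustration. They are not taken from celldetection and are not meant to suggest what the actual cause is.

```python
import torch


def out_of_bounds_positions(index: torch.Tensor, size: int) -> torch.Tensor:
    """Return the positions in `index` whose values are invalid for a
    dimension of length `size`.

    Uses the same condition as the CUDA assertion: -size <= index < size.
    """
    index = index.detach().cpu()
    invalid = (index < -size) | (index >= size)
    return torch.nonzero(invalid.flatten()).flatten()


# Hypothetical values, chosen only to illustrate the check.
num_buckets = 6
bucket_index = torch.tensor([0, 5, 6, -7, -6])

bad = out_of_bounds_positions(bucket_index, num_buckets)
print(bad.tolist())  # positions of the invalid entries (here the values 6 and -7)

# On the CPU the same mistake raises a readable Python exception
# instead of a device-side assert.
values = torch.zeros(num_buckets)
try:
    values[bucket_index] = 1.0
except (IndexError, RuntimeError) as err:
    print(f'{type(err).__name__}: {err}')
```

A check like this could be placed just before the failing indexing operation to see which index values fall outside the tensor's size.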