train wrong - Githubissues

zhangyunming commented 5 years ago

When I train on my own data, the following happens:

os@os-l3:/disk3t-2/zym/BiSeNet-PyTorch$ python train.py
epoch 0, lr 0.001000:   0%|          | 0/4963 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 157, in <module>
    main(params)
  File "train.py", line 141, in main
    train(args, model, optimizer, dataloader_train, dataloader_val, csv_path)
  File "train.py", line 56, in train
    output = model(data)
  File "/home/os/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/os/.local/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/os/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/disk3t-2/zym/BiSeNet-PyTorch/model/build_BiSeNet.py", line 97, in forward
    cx1 = self.attention_refinement_module1(cx1)
  File "/home/os/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/disk3t-2/zym/BiSeNet-PyTorch/model/build_BiSeNet.py", line 40, in forward
    assert self.in_channels == x.size(1), 'in_channels and out_channels should all be {}'.format(x.size(1))
AssertionError: in_channels and out_channels should all be 256

hubutui commented 5 years ago

Could you use Markdown to format your output? See https://guides.github.com/features/mastering-markdown/ for details.

You could check your data: does it have 3 channels or 1 channel? Also set a breakpoint at "/disk3t-2/zym/BiSeNet-PyTorch/model/build_BiSeNet.py", line 40, and check your tensor's channel dimension.
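As a rough illustration of that check, here is a minimal sketch that reports the channel dimension of an (N, C, H, W) shape. The helper name `describe_nchw` and the example shapes are hypothetical and not part of the repository; at the breakpoint you would pass something like `tuple(x.size())` together with the module's `in_channels`.

```python
def describe_nchw(shape, expected_channels):
    """Return a readable report for a tensor shape given as (N, C, H, W)."""
    if len(shape) != 4:
        raise ValueError("expected a 4-D (N, C, H, W) shape, got {!r}".format(shape))
    n, c, h, w = shape
    # The assertion in build_BiSeNet.py compares in_channels with x.size(1),
    # which is the second entry (C) of the shape.
    status = "ok" if c == expected_channels else "MISMATCH"
    return "batch={} channels={} (expected {}) size={}x{} -> {}".format(
        n, c, expected_channels, h, w, status
    )


# Hypothetical shapes, for illustration only.
print(describe_nchw((2, 3, 640, 640), 3))      # an RGB input batch
print(describe_nchw((2, 256, 40, 40), 1024))   # a feature map with 256 channels
```

If the input batch already has the expected number of channels, the mismatch comes from the feature maps inside the model rather than from the data.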

hubutui commented 5 years ago

It is better to discuss this here in English so that others can benefit from it. As I mentioned before, you could set a breakpoint at the line just before the error is raised, and change the channels.

JunjieZhouwust commented 5 years ago

I am running into this problem now. Do you know how to resolve it? It may be related to the number of GPUs.

JunjieZhouwust commented 5 years ago

You have to make sure that context_path is 'resnet101', because the 1024 and 2048 channels correspond to resnet101.
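To make the channel reasoning concrete, here is a small sketch that compares a backbone's last two stage widths with the 1024 and 2048 channels mentioned above. The table lists the layer3 and layer4 output channels of the standard torchvision ResNet definitions; the names `RESNET_STAGE_CHANNELS` and `arm_channels_match` are hypothetical and do not come from the repository.

```python
# Output channels of the last two residual stages (layer3, layer4) in the
# standard torchvision ResNet definitions.
RESNET_STAGE_CHANNELS = {
    "resnet18": (256, 512),
    "resnet34": (256, 512),
    "resnet50": (1024, 2048),
    "resnet101": (1024, 2048),
    "resnet152": (1024, 2048),
}


def arm_channels_match(context_path, arm_in_channels=(1024, 2048)):
    """Check whether a backbone's stage widths match the expected input channels.

    arm_in_channels defaults to the 1024/2048 pair mentioned in this thread;
    treat it as an assumption about how the model is configured.
    """
    return RESNET_STAGE_CHANNELS[context_path] == tuple(arm_in_channels)


for name in ("resnet18", "resnet101"):
    print(name, RESNET_STAGE_CHANNELS[name], arm_channels_match(name))
```

A backbone whose layer3 emits 256 channels, such as resnet18, would be consistent with the "should all be 256" message in the traceback.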

ooooverflow / BiSeNet

train wrong #3