ZM-J / ET-Net

A PyTorch implementation of ET-Net (aka ETNet)

How to train without `weights/epoch_300.pth`? #4

Closed zhou-rui1 closed 1 year ago

zhou-rui1 commented 3 years ago

Hi, the code is awesome, but how can I start training without `weights/epoch_300.pth`? I have changed the config to `'weight': None,`, but it does not seem to work: `FileNotFoundError: [Errno 2] No such file or directory: 'weights/epoch_300.pth'`. With regards, any help would be greatly appreciated!

ZM-J commented 3 years ago

Please check readme.md and run utils/get_pretrained_weight_address.py to get the address of the pretrained weights for the encoder (ResNet-50), then download them. You may also need to change `encoder_weight` in args.py to the downloaded path.
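For illustration, the relevant args.py entries might then look something like this (the key names come from this thread; the resnet50 path is a hypothetical placeholder):

```python
# args.py (illustrative excerpt)
ARGS = {
    'weight': None,                            # no full-model checkpoint: train from scratch
    'encoder_weight': 'weights/resnet50.pth',  # hypothetical path to the downloaded ResNet-50 weights
    'dataset': 'DRIVE',
}
```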

zhou-rui1 commented 3 years ago

Thanks a lot, but I have already made that change in args.py and downloaded the weights: `'weight': None,` `'dataset': 'DRIVE',`. Something still seems wrong with `self.net.load_encoder_weight()`, and it still gives me `FileNotFoundError: [Errno 2] No such file or directory: 'weights/epoch_300.pth'`. I think I just need the pretrained resnet50.pth as the encoder, and do not need this one?

ZM-J commented 3 years ago

I guess you might need to comment out the line after `'weight': None`? This might be because the `weight` key in args.py has been assigned a value twice, e.g.

```python
a = {1: 1, 1: 2}
a
```

it will give you `{1: 2}`: when a dict literal repeats a key, the last assignment silently wins.
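Applied to args.py, a minimal sketch of that failure mode (the duplicate line shown here is hypothetical, but it would explain the FileNotFoundError above):

```python
# Hypothetical args.py excerpt: a leftover duplicate 'weight' entry
# silently overrides the None above it.
ARGS = {
    'weight': None,                     # intended: train from scratch
    'weight': 'weights/epoch_300.pth',  # later duplicate wins
    'dataset': 'DRIVE',
}
print(ARGS['weight'])  # 'weights/epoch_300.pth' -> FileNotFoundError at load time
```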

zhou-rui1 commented 3 years ago

Thanks, I tried again and it works, but why did I get this?

```
Traceback (most recent call last):
  File "train.py", line 161, in <module>
    tv = TrainValProcess()
  File "train.py", line 32, in __init__
    self.lr_scheduler = LambdaLR(self.optimizer, lr_lambda=lambda iter: (1 - iter / total_iters) ** ARGS['scheduler_power'])
  File "/usr/local/lib/python3.6/dist-packages/torch/optim/lr_scheduler.py", line 205, in __init__
    super(LambdaLR, self).__init__(optimizer, last_epoch, verbose)
  File "/usr/local/lib/python3.6/dist-packages/torch/optim/lr_scheduler.py", line 79, in __init__
    self.step()
  File "/usr/local/lib/python3.6/dist-packages/torch/optim/lr_scheduler.py", line 154, in step
    values = self.get_lr()
  File "/usr/local/lib/python3.6/dist-packages/torch/optim/lr_scheduler.py", line 251, in get_lr
    for lmbda, base_lr in zip(self.lr_lambdas, self.base_lrs)]
  File "/usr/local/lib/python3.6/dist-packages/torch/optim/lr_scheduler.py", line 251, in <listcomp>
    for lmbda, base_lr in zip(self.lr_lambdas, self.base_lrs)]
  File "train.py", line 32, in <lambda>
    self.lr_scheduler = LambdaLR(self.optimizer, lr_lambda=lambda iter: (1 - iter / total_iters) ** ARGS['scheduler_power'])
ZeroDivisionError: division by zero
```

ZM-J commented 3 years ago

It may be because you set `ARGS['batch_size'] > len(dataset)`.
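For context, a minimal sketch of how that can produce this exact error (the loader setup and `total_iters` computation are assumptions modeled on the traceback, not the repository's exact code): with `drop_last` enabled, a batch size larger than the dataset yields zero batches per epoch, so `total_iters` is 0 and `LambdaLR` divides by zero the moment it is constructed.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.optim.lr_scheduler import LambdaLR

dataset = TensorDataset(torch.randn(8, 3, 32, 32))   # only 8 samples
# batch_size > len(dataset) with drop_last=True -> len(loader) == 0
loader = DataLoader(dataset, batch_size=16, drop_last=True)

total_iters = 300 * len(loader)                      # 300 epochs * 0 batches == 0

optimizer = torch.optim.SGD([torch.nn.Parameter(torch.zeros(1))], lr=0.01)
# LambdaLR evaluates the lambda once during __init__ (iter == 0),
# so iter / total_iters is 0 / 0 and raises ZeroDivisionError immediately.
scheduler = LambdaLR(optimizer, lr_lambda=lambda it: (1 - it / total_iters) ** 0.9)
```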

zhou-rui1 commented 3 years ago

I just set `'batch_size': 2,` with images (val=8, train=128). Does that mean my dataset has not loaded in? With regards, so grateful for your help!

ZM-J commented 3 years ago

I think the problem is that `total_iters` is unexpectedly zero, though I have no idea why. Your dataset has possibly not been loaded in, just as you said. Please print `len(self.train_dataset)` around line 32 of train.py to check whether it is 0. If it is, please fix the dataset code and believe in Chun Ge.
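A quick way to add that check (a sketch; `total_iters` and `self.train_dataset` appear in this thread, while `self.train_loader` is an assumed name for the DataLoader built in `__init__`):

```python
# In TrainValProcess.__init__, just before the scheduler is built:
print(len(self.train_dataset))  # expected 128 for the DRIVE training split
print(len(self.train_loader))   # batches per epoch; 0 here means total_iters == 0
assert len(self.train_dataset) > 0, "dataset not loaded: check the dataset path in args.py"
```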

ZM-J commented 3 years ago

If you have any further issues with this, please tell me. If not, please consider closing this issue.