Errors when running eval and train

athena913 commented 3 years ago

Hi, thank you for making your code public.

1) I tried running eval.py with your uploaded models (all_100000.pth and all_50000.pth) from the Google Drive link listed under pretrained models (Section 3), but I get the following errors. It looks like the uploaded models are not correct. Could you please provide the link to the correct models? I am not using the Docker image.

python eval.py --n_sample 100 --dist pretrained_models/

Traceback (most recent call last):
  File "eval.py", line 68, in
    net_ig.load_state_dict(checkpoint['g'])

RuntimeError: Error(s) in loading state_dict for Generator: Missing key(s) in state_dict: "feat_1024.1.weight_orig", "feat_1024.1.weight", "feat_1024.1.weight_u", "feat_1024.1.weight_orig", "feat_1024.1.weight_u", "feat_1024.1.weight_v", "feat_1024.2.weight", "feat_1024.2.bias", "feat_1024.2.running_mean", "feat_1024.2.running_var".

size mismatch for to_big.weight_orig: copying a param with shape torch.Size([3, 16, 3, 3]) from checkpoint, the shape in current model is torch.Size([3, 8, 3, 3]).
size mismatch for to_big.weight_v: copying a param with shape torch.Size([144]) from checkpoint, the shape in current model is torch.Size([72]).

2) I also tried to train the model, but again encountered errors. Which PyTorch version are you using?

File "FastGAN-pytorch/operation.py", line 16, in InfiniteSampler
  yield order[i]
IndexError: index -1 is out of bounds for axis 0 with size 0

Thank you for your help.

odegeasslbc commented 3 years ago

When you run a pre-trained model, please use the code that comes with it. Every model is configured differently (number of conv layers, model structure, etc.), so the default code cannot load all of the pre-trained models. The uploaded models have been tested by many people, so they are unlikely to be the problem.
The training error occurs because your image path is wrong: it finds 0 images in the provided folder. Please make sure the path you pass is the folder that directly contains the images, and that the images are in '.jpg' or '.png' format. I'm using PyTorch versions 1.6 to 1.8; this error is unlikely to be related to the PyTorch version.
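A quick way to confirm the "0 images" diagnosis before starting training is to count the image files that sit directly in the folder. The sketch below is not part of the repository: `list_images` and `check_image_folder` are hypothetical helper names, and the case-insensitive extension match is an assumption rather than the repository's actual filtering rule.

```python
import os

# Assumption: only the two formats named in the reply above are accepted.
IMAGE_EXTENSIONS = (".jpg", ".png")


def list_images(folder):
    """Return sorted names of image files located directly in `folder`.

    Subfolders are deliberately not searched, mirroring the advice that the
    path must point at the folder that directly contains the images.
    """
    return sorted(
        name
        for name in os.listdir(folder)
        if os.path.isfile(os.path.join(folder, name))
        and name.lower().endswith(IMAGE_EXTENSIONS)
    )


def check_image_folder(folder):
    """Raise a readable error if `folder` holds no usable images."""
    images = list_images(folder)
    if not images:
        raise ValueError(f"No .jpg/.png images found directly in {folder!r}")
    return len(images)
```

Calling `check_image_folder` on the training path and getting a `ValueError` would point to the same cause as the `IndexError ... size 0` above, for example when the images live one level deeper in a subfolder.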

athena913 commented 3 years ago

Thank you for your response.

1) I am using the code that comes with the respective models. I downloaded good_art_1k_512.zip, which contains models/all_50000.pth, and good_ffhq_full_512.zip, which contains models/all_100000.pth. The eval.py in each folder loops over checkpoints to evaluate the models, as shown below. Since the models/ folder contains only one model in each case, I removed the loop, pointed the checkpoint at the model in the respective models folder, and ran eval.py. I still get the key mismatch error reported above. It appears that the checkpoint I downloaded is not consistent with the model architecture. Are good_art_1k_512.zip and good_ffhq_full_512.zip the correct pretrained code/models to use for evaluation?

    for epoch in [10000*i for i in range(args.start_iter, args.end_iter+1)]:
        ckpt = './models/%d.pth'%(epoch)
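For reference, a loop of this form builds one checkpoint path per 10,000-iteration step. The sketch below reproduces only that path arithmetic; `checkpoint_paths` and its `pattern` parameter are hypothetical names introduced for illustration, and the `all_%d.pth` pattern is an assumption based on the file names in the downloaded archives.

```python
def checkpoint_paths(start_iter, end_iter, pattern="./models/%d.pth"):
    """List the checkpoint paths a loop of the quoted form would visit."""
    return [pattern % (10000 * i) for i in range(start_iter, end_iter + 1)]


# With start_iter=5 and end_iter=5 the loop visits a single file.
print(checkpoint_paths(5, 5))  # ['./models/50000.pth']

# Hypothetical pattern matching the archive's file name all_50000.pth.
print(checkpoint_paths(5, 5, pattern="./models/all_%d.pth"))  # ['./models/all_50000.pth']
```

Since the default pattern yields `50000.pth` while the archive ships `all_50000.pth`, pointing the checkpoint at the file directly, as described above, sidesteps the naming difference.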

odegeasslbc commented 3 years ago

I see. I think the problem is in eval.py: the model is not properly defined and initialized. Look for lines like this:

net_ig = Generator( ngf=64, nz=noise_dim, nc=3, im_size=args.im_size)#, big=args.big )
net_ig.to(device)

Could you try setting im_size=512? The models were trained at 512 resolution.
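To see this kind of mismatch without reading the full RuntimeError, it may help to compare parameter names and shapes from the checkpoint against those of the model that was built. The sketch below works on plain dictionaries so it runs without PyTorch; `diff_state_shapes` is a hypothetical helper, not part of the repository. The `to_big` shapes are taken from the error message in this thread, while the `feat_1024.2.bias` shape is a placeholder chosen only for illustration.

```python
def diff_state_shapes(checkpoint_shapes, model_shapes):
    """Compare two {parameter_name: shape} mappings.

    Returns (missing, unexpected, mismatched):
      missing    - names the model expects but the checkpoint lacks
      unexpected - names in the checkpoint that the model does not define
      mismatched - {name: (checkpoint_shape, model_shape)} for differing shapes
    """
    missing = sorted(k for k in model_shapes if k not in checkpoint_shapes)
    unexpected = sorted(k for k in checkpoint_shapes if k not in model_shapes)
    mismatched = {
        k: (tuple(checkpoint_shapes[k]), tuple(model_shapes[k]))
        for k in checkpoint_shapes
        if k in model_shapes
        and tuple(checkpoint_shapes[k]) != tuple(model_shapes[k])
    }
    return missing, unexpected, mismatched


# With PyTorch, such a mapping could be built from a state dict as
# {k: tuple(v.shape) for k, v in state_dict.items()}.
checkpoint = {"to_big.weight_orig": (3, 16, 3, 3), "to_big.weight_v": (144,)}
model = {
    "to_big.weight_orig": (3, 8, 3, 3),
    "to_big.weight_v": (72,),
    "feat_1024.2.bias": (8,),  # placeholder shape, not taken from the thread
}

missing, unexpected, mismatched = diff_state_shapes(checkpoint, model)
print(missing)      # ['feat_1024.2.bias']
print(unexpected)   # []
print(mismatched["to_big.weight_v"])  # ((144,), (72,))
```

Missing keys that all share a `feat_1024` prefix, together with smaller `to_big` shapes in the model, are consistent with a generator built for 1024 resolution being asked to load a 512-resolution checkpoint.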

athena913 commented 3 years ago

Yes, after changing im_size from the default value of 1024 to 512, I am now able to run eval.py. Thank you.

odegeasslbc / FastGAN-pytorch

Errors when running eval and train #12