NVlabs / imaginaire

NVIDIA's Deep Imagination Team's PyTorch Library

Inference error we cannot resolve - 'Adam' object has no attribute '_step_count' #112

Closed: grewe closed this issue 2 years ago

grewe commented 2 years ago

We have trained and are now trying to run inference (using the new corrected config file found in issue #106). However, we are getting the following error. We have traced through the code and cannot see anything we can do to resolve this issue. Can you please advise?

Using random seed 0
cudnn benchmark: True
cudnn deterministic: False
LMDB ROOT ['dataset/val/']
Creating metadata
['images', 'poses-openpose']
Data file extensions: {'images': 'jpg', 'poses-openpose': 'json'}
Searching in dir: images
Found 40 sequences
Found 1524 files
Folder at dataset/val/images opened.
Folder at dataset/val/poses-openpose opened.
Num datasets: 1
Num sequences: 40
Max sequence length: 40
Epoch length: 40
Using random seed 0
Concatenate images:
    ext: jpg
    num_channels: 3
    normalize: True for input.
    Num. of channels in the input image: 3
Concatenate images:
    ext: jpg
    num_channels: 3
    normalize: True for input.
Concatenate poses-openpose:
    ext: json
    num_channels: 3
    interpolator: None
    normalize: False
    pre_aug_ops: decode_json, convert::imaginaire.utils.visualization.pose::openpose_to_npy
    post_aug_ops: vis::imaginaire.utils.visualization.pose::draw_openpose_npy for input.
    Num. of channels in the input label: 3
Concatenate images:
    ext: jpg
    num_channels: 3
    normalize: True for input.
    Num. of channels in the input image: 3
Concatenate images:
    ext: jpg
    num_channels: 3
    normalize: True for input.
    Num. of channels in the input image: 3
Concatenate images:
    ext: jpg
    num_channels: 3
    normalize: True for input.
    Num. of channels in the input image: 3
Initialized temporal embedding network with the reference one.
Concatenate images:
    ext: jpg
    num_channels: 3
    normalize: True for input.
Concatenate poses-openpose:
    ext: json
    num_channels: 3
    interpolator: None
    normalize: False
    pre_aug_ops: decode_json, convert::imaginaire.utils.visualization.pose::openpose_to_npy
    post_aug_ops: vis::imaginaire.utils.visualization.pose::draw_openpose_npy for input.
    Num. of channels in the input label: 3
Concatenate images:
    ext: jpg
    num_channels: 3
    normalize: True for input.
    Num. of channels in the input image: 3
Initialize net_G and net_D weights using type: xavier gain: 0.02
Using random seed 0
net_G parameter count: 91,147,294
net_D parameter count: 5,598,018
Use custom initialization for the generator.
Opt D Adam (
Parameter Group 0
    amsgrad: False
    betas: (0.0, 0.999)
    eps: 1e-08
    initial_lr: 0.0004
    lr: 0.0004
    weight_decay: 0
)
Setup trainer.
Using automatic mixed precision training.
Augmentation policy: 
GAN mode: hinge
Perceptual loss:
    Mode: vgg19
Loss GAN                  Weight 1.0
Loss FeatureMatching      Weight 10.0
Loss Perceptual           Weight 10.0
Loss Flow                 Weight 10.0
Loss Flow_L1              Weight 10.0
Loss Flow_Warp            Weight 10.0
Loss Flow_Mask            Weight 10.0
Done with loading the checkpoint.
configs for inference
finetune: True
finetune_iter: 100
few_shot_seq_index: 0
few_shot_frame_index: 0
driving_seq_index: 1
output dir:  projects/fs_vid2vid/output/face_forensics
Epoch length: 40
  0% 0/40 [00:00<?, ?it/s]person dict [160.594, 118.24, 0.251877, 192.575, 144.063, 0.824015, 225.618, 146.141, 0.722625, 228.763, 187.411, 0.577983, 222.522, 197.745, 0.0691224, 156.441, 141.995, 0.65878, 136.839, 176.029, 0.816474, 122.349, 203.979, 0.806893, 194.681, 207.035, 0.582138, 219.419, 204.992, 0.580149, 214.285, 253.546, 0.750408, 233.867, 303.093, 0.767982, 172.951, 209.085, 0.521009, 161.562, 260.727, 0.755334, 152.304, 317.576, 0.783057, 0, 0, 0, 161.64, 113.054, 0.485235, 201.904, 119.295, 0.144664, 175.014, 114.108, 0.830086, 109.928, 324.766, 0.654595, 114.095, 326.869, 0.640277, 160.523, 325.834, 0.651386, 214.269, 302.072, 0.6639, 218.372, 302.011, 0.653834, 237.02, 310.333, 0.700539]
person dict [167.8, 86.1641, 0.864743, 120.294, 97.5681, 0.558493, 94.4901, 93.4588, 0.498503, 148.18, 96.528, 0.296182, 163.661, 101.676, 0.0693238, 145.079, 102.756, 0.410478, 200.818, 102.709, 0.455387, 277.273, 103.817, 0.660783, 117.187, 174.002, 0.430313, 99.6447, 175.023, 0.37583, 111.004, 234.891, 0.673355, 102.742, 301.006, 0.74067, 132.683, 174.004, 0.437687, 136.82, 226.63, 0.563235, 128.554, 281.403, 0.770815, 161.654, 76.9165, 0.851908, 169.845, 79.0123, 0.305025, 134.727, 70.7234, 0.842317, 0, 0, 0, 154.368, 287.597, 0.292605, 152.313, 284.509, 0.335124, 121.323, 286.564, 0.595389, 147.154, 313.39, 0.593413, 138.878, 316.507, 0.520831, 94.4743, 307.206, 0.651808]
Training layers:  ['conv_img', 'up', 'weight_generator.fc']
  0% 0/40 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "/content/drive/My Drive/imaginaire/inference.py", line 99, in <module>
    main()
  File "/content/drive/My Drive/imaginaire/inference.py", line 95, in main
    trainer.test(test_data_loader, args.output_dir, cfg.inference_args)
  File "/content/drive/My Drive/imaginaire/imaginaire/trainers/fs_vid2vid.py", line 168, in test
    output = self.test_single(data, output_dir, inference_args)
  File "/content/drive/My Drive/imaginaire/imaginaire/trainers/vid2vid.py", line 376, in test_single
    self.finetune(data, inference_args)
  File "/content/drive/My Drive/imaginaire/imaginaire/trainers/fs_vid2vid.py", line 287, in finetune
    self.gen_update(data_finetune)
  File "/content/drive/My Drive/imaginaire/imaginaire/trainers/vid2vid.py", line 265, in gen_update
    self.get_dis_losses(net_D_output)
  File "/content/drive/My Drive/imaginaire/imaginaire/trainers/vid2vid.py", line 642, in get_dis_losses
    if self.last_step_count_D == self.opt_D._step_count:
**AttributeError: 'Adam' object has no attribute '_step_count'**
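A note on what seems to be going on (our reading of the PyTorch source, not a confirmed fix): `torch.optim.Adam` itself has no `_step_count` attribute; the LR schedulers in `torch.optim.lr_scheduler` attach that counter to the optimizer when a scheduler is constructed. If the inference finetuning path never builds a scheduler for `opt_D`, the attribute simply never exists. The sketch below uses a stand-in class (so it runs without torch) to show the failure and a hypothetical `getattr` guard; the guard is our workaround idea, not the official imaginaire fix.

```python
class BareAdam:
    """Stand-in for torch.optim.Adam with no LR scheduler attached."""
    pass

opt_D = BareAdam()

# Failing pattern from vid2vid.py get_dis_losses():
try:
    _ = opt_D._step_count
except AttributeError as e:
    print(e)  # 'BareAdam' object has no attribute '_step_count'

# Hypothetical guard: fall back to 0 when no scheduler has patched step().
last_step_count_D = 0
if last_step_count_D == getattr(opt_D, "_step_count", 0):
    print("step counts match")
```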
grewe commented 2 years ago

@mingyuliutw ---any tips?

grewe commented 2 years ago

@arunmallya still stuck on this -- any help much appreciated. We have been through the code and can't see anything missing on our end, and the test.yaml file you gave us in issue #106 does not explicitly list a step_count parameter

grewe commented 2 years ago

We have tried changing the vid2vid.py file to test against self.opt_D.state['step'] but then get a new set of errors. We are using PyTorch 1.10.
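That particular change probably cannot work as written: in `torch.optim.Optimizer`, `state` is a defaultdict keyed by the *parameter tensor*, and for Adam each per-parameter entry holds its own `'step'`, `'exp_avg'`, and `'exp_avg_sq'`. Indexing by the string `'step'` silently creates and returns an empty dict instead of a global counter. A stand-in sketch (illustrative only, not torch or imaginaire code):

```python
from collections import defaultdict

# Mimic the shape of torch.optim.Optimizer.state for Adam:
#   state[param] -> {'step': ..., 'exp_avg': ..., 'exp_avg_sq': ...}
state = defaultdict(dict)
fake_param = object()  # stands in for a weight tensor
state[fake_param] = {"step": 7, "exp_avg": None, "exp_avg_sq": None}

print(state["step"])              # {} -- not the counter you want
print(state[fake_param]["step"])  # 7 -- the real counters live per parameter
```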

SURABHI-GUPTA commented 2 years ago

@grewe I tried running only the inference part using their pre-trained model on Colab, there is no such issue. The error you reported is during training the model, right?

grewe commented 2 years ago

No, it is during inference. We are using PyTorch 1.10. Do you mind sharing the config file and telling us what version you are running?

Lynne


SURABHI-GUPTA commented 2 years ago

@grewe The version is the same: 1.10.0+cu111.

grewe commented 2 years ago

Were you running the pose or face model? Can you share the config file?

Lynne


SURABHI-GUPTA commented 2 years ago

@grewe I tried testing the face model with their pre-trained. Link to colab: https://colab.research.google.com/drive/1ct5zqfNt_MhG-sV2ZxCSw6TKKYNkb5Ts?usp=sharing Let me know if it helps.

grewe commented 2 years ago

@SURABHI-GUPTA Unfortunately that is a different code path and not directly the same -- hoping someone from NVIDIA will respond

grewe commented 2 years ago

@mingyuliutw @arunmallya does anyone from your team have a moment to look at this issue? Trying to help a grad student move forward toward semester completion....

arunmallya commented 2 years ago

We are busy with CVPR until Nov 23 (https://cvpr2022.thecvf.com/submission-timeline). We can only take a look after that.

grewe commented 2 years ago

> We are busy with CVPR until Nov 23 (https://cvpr2022.thecvf.com/submission-timeline). We can only take a look after that.

@arunmallya will await your feedback. If you can keep Dikshant and this issue in mind when you free up from CVPR, we would appreciate it. Looking forward to seeing your next paper(s)

grewe commented 2 years ago

@arunmallya Can anyone help us now that Thanksgiving is over and the CVPR Nov. 23 deadline has passed?

mingyuliutw commented 2 years ago

@grewe We will not be able to help you debug the issue on your side.

grewe commented 2 years ago

Why, if the issue is in your code, are you not able to debug this? It is not our issue.

Lynne Grewe | Professor Computer Science, Director iLab | California State University East Bay


grewe commented 2 years ago

We have contacted others with a similar problem and they gave up and decided not to use NVIDIA Imaginaire. Is this the suggestion?

digvijayad commented 2 years ago

@grewe Were you able to solve this?

I also had the same issue. As I'm short on time, I disabled finetuning in the config file, which skips the error and runs the inference successfully.
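Concretely, the change amounted to something like the following. The field names come from the "configs for inference" log printed above; the exact nesting in your YAML is an assumption and may differ:

```yaml
# Hypothetical sketch: flip the finetune flag printed under
# "configs for inference" to skip the code path that hits _step_count.
inference_args:
  finetune: False       # was True; disables the finetuning pass
  finetune_iter: 100
  few_shot_seq_index: 0
  few_shot_frame_index: 0
  driving_seq_index: 1
```

Note this trades output quality (no few-shot finetuning on the reference frames) for a working inference run.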