yoyo-nb / Thin-Plate-Spline-Motion-Model

[CVPR 2022] Thin-Plate Spline Motion Model for Image Animation.
MIT License

error when trying to train #40

Closed (surfingnirvana closed this issue 1 year ago)

surfingnirvana commented 1 year ago

\Thin-Plate-Spline-Motion-Model\train.py", line 93, in train
    logger.log_epoch(epoch, model_save, inp=x, out=generated)
UnboundLocalError: local variable 'x' referenced before assignment

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\NEURAL\Thin-Plate-Spline-Motion-Model\run.py", line 83, in <module>
    train(config, inpainting, kp_detector, bg_predictor, dense_motion_network, opt.checkpoint, log_dir, dataset)
  File "C:\NEURAL\Thin-Plate-Spline-Motion-Model\train.py", line 93, in train
    logger.log_epoch(epoch, model_save, inp=x, out=generated)
TypeError: __exit__() takes 1 positional argument but 4 were given

yoyo-nb commented 1 year ago

The dataset may have failed to load, in which case this for loop never runs and x is never assigned.
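This failure mode can be reproduced without the repository. The sketch below is not the project's train.py; `run_epoch` and `describe` are hypothetical names used only for illustration. It shows that a variable assigned inside a for loop stays unbound when the iterable yields nothing, which is what an empty dataloader would cause.

```python
def run_epoch(dataloader):
    """Mimic a training loop that uses its loop variable after the loop ends."""
    for x in dataloader:
        generated = x * 2  # stand-in for the forward pass
    # If dataloader yielded nothing, neither x nor generated was ever assigned.
    return x, generated


def describe(dataloader):
    """Return the result of run_epoch, or the name of the exception it raised."""
    try:
        return run_epoch(dataloader)
    except UnboundLocalError as err:
        return type(err).__name__


print(describe([1, 2, 3]))  # loop ran, so the last batch is returned
print(describe([]))         # empty loader, so x is unbound
```

If the second call reports `UnboundLocalError`, the real question becomes why the loader produced no batches, not the logging call where the error surfaces.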

surfingnirvana commented 1 year ago

The videos are in mp4 format. I had the same error with the articulated-animation repository (https://github.com/snap-research/articulated-animation/). Training articulated-animation in standard mode worked with both PNG and mp4 files. In avd mode, mp4 files did not work, but PNGs worked fine. Now, with the thin-plate model, neither PNG nor mp4 works.

I had to change this line in run.py because of an error:

with open(opt.config) as f:
    config = yaml.load(f)

to

with open(opt.config) as f:
    config = yaml.load(f, Loader=yaml.FullLoader)

yoyo-nb commented 1 year ago

  1. Check that the dataset path in the config file is correct.
  2. I use the same dataset loading script as MRAA, so the code should be fine.
  3. MRAA's avd mode uses a larger batch size (256). mp4 mode loads a large amount of video into memory, which can exhaust it, while PNG mode loads only two frames per sample, so PNG mode is the better choice.

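To act on the first point, a quick check of the dataset directory before training may help. This is a minimal sketch, not part of the repository: `count_samples` is a hypothetical helper, and the layout it assumes (a `train` subfolder under the dataset root, with one entry per video, either an .mp4 file or a folder of PNG frames) is an assumption based on the MRAA-style loader mentioned above. Verify it against your own config.

```python
import os
import tempfile


def count_samples(root_dir, split="train"):
    """Count entries under root_dir/split; return 0 if the folder is missing.

    Assumed layout (hypothetical, verify against your own config):
    root_dir/train/<video>.mp4 or root_dir/train/<video>/<frame>.png
    """
    split_dir = os.path.join(root_dir, split)
    if not os.path.isdir(split_dir):
        return 0
    return len(os.listdir(split_dir))


# Demonstration on a throwaway directory instead of a real dataset.
with tempfile.TemporaryDirectory() as root:
    print(count_samples(root))  # no train/ folder yet
    os.makedirs(os.path.join(root, "train", "video_0001"))
    os.makedirs(os.path.join(root, "train", "video_0002"))
    print(count_samples(root))  # two video folders
```

A count of zero for the path taken from the config file would point to a wrong or empty dataset directory.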
surfingnirvana commented 1 year ago

The error appears when I change num_repeats from 150 to 2 in the config file:

UnboundLocalError: local variable 'x' referenced before assignment

yoyo-nb commented 1 year ago

Are you using your own dataset with a small sample size? If so, change this line to drop_last=False.
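Why `drop_last` matters here follows from simple arithmetic. The sketch below assumes the epoch length is the number of videos times `num_repeats`, as in the repeated-dataset wrapper this code base shares with MRAA (an assumption worth checking in your copy). `batches_per_epoch` is a hypothetical helper and the numbers are illustrative only. With `drop_last=True`, a PyTorch DataLoader discards the final incomplete batch, so an epoch with fewer samples than `batch_size` yields no batches at all.

```python
import math


def batches_per_epoch(num_videos, num_repeats, batch_size, drop_last):
    """Number of batches a DataLoader yields for one epoch."""
    num_samples = num_videos * num_repeats  # assumed epoch length
    if drop_last:
        return num_samples // batch_size  # incomplete final batch is discarded
    return math.ceil(num_samples / batch_size)  # incomplete final batch is kept


# Illustrative numbers: 3 videos, batch size 7.
print(batches_per_epoch(3, 150, 7, drop_last=True))   # 64
print(batches_per_epoch(3, 2, 7, drop_last=True))     # 0, the loop body never runs
print(batches_per_epoch(3, 2, 7, drop_last=False))    # 1
```

This matches the report above: lowering `num_repeats` on a small dataset can leave fewer samples than one full batch, so the training loop never assigns `x`.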

surfingnirvana commented 1 year ago

OK, this is fixed. I am using a small sample size. But it still uses all available memory. Is there any way I could limit this behaviour? For standard training I set the batch size to 7 for ~10 GB VRAM and num_repeats = 2. With your fix it's running OK.

thhung commented 9 months ago

@surfingnirvana Did you end up with good results from your training?