mit-han-lab / data-efficient-gans

[NeurIPS 2020] Differentiable Augmentation for Data-Efficient GAN Training
https://arxiv.org/abs/2006.10738
BSD 2-Clause "Simplified" License

Training won't start #20

Closed Alex-Github-Account closed 4 years ago

Alex-Github-Account commented 4 years ago

Good day. After building .tfrecords from an image set (1024x1024), I am trying to start training (TensorFlow version: 1.15.2; GPU: Tesla P100-PCIE-16GB):

!python3 run_ffhq.py \
--num-gpus=1 --resolution=1024 --latent-size 512 --DiffAugment="" \
--total-kimg 50 --mirror-augment=true \
--result-dir="/path/to/dir/" --dataset="/path/to/images/with/tfrecords/aligned/" \
--resume="/path/to/stylegan2-ffhq-config-f.pkl" --fmap-base=8192

Training fails to start with the following error:

dnnlib: Running training.training_loop.training_loop() on localhost...
Streaming data using training.dataset.TFRecordDataset...
Dataset shape = [3, 1024, 1024]
Dynamic range = [0, 255]
Label size    = 0
Constructing networks...
Setting up TensorFlow plugin "fused_bias_act.cu": Preprocessing... Loading... Done.
Setting up TensorFlow plugin "upfirdn_2d.cu": Preprocessing... Loading... Done.
Traceback (most recent call last):
  File "run_ffhq.py", line 171, in <module>
    main()
  File "run_ffhq.py", line 165, in main
    run(**vars(args))
  File "run_ffhq.py", line 94, in run
    dnnlib.submit_run(**kwargs)
  File "/content/data-efficient-gans/DiffAugment-stylegan2/dnnlib/submission/submit.py", line 343, in submit_run
    return farm.submit(submit_config, host_run_dir)
  File "/content/data-efficient-gans/DiffAugment-stylegan2/dnnlib/submission/internal/local.py", line 22, in submit
    return run_wrapper(submit_config)
  File "/content/data-efficient-gans/DiffAugment-stylegan2/dnnlib/submission/submit.py", line 280, in run_wrapper
    run_func_obj(**submit_config.run_func_kwargs)
  File "/content/data-efficient-gans/DiffAugment-stylegan2/training/training_loop.py", line 162, in training_loop
    G.copy_vars_from(rG)
  File "/content/data-efficient-gans/DiffAugment-stylegan2/dnnlib/tflib/network.py", line 324, in copy_vars_from
    tfutil.set_vars(tfutil.run({self.vars[name]: src_net.vars[name] for name in names}))
  File "/content/data-efficient-gans/DiffAugment-stylegan2/dnnlib/tflib/tfutil.py", line 217, in set_vars
    run(ops, feed_dict)
  File "/content/data-efficient-gans/DiffAugment-stylegan2/dnnlib/tflib/tfutil.py", line 31, in run
    return tf.get_default_session().run(*args, **kwargs)
  File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 1156, in _run
    (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (3, 3, 512, 512) for Tensor 'G_synthesis/64x64/Conv0_up/weight/new_value:0', which has shape '(3, 3, 512, 256)'

I've tried different resolutions (256, 512) and get the same error. The dataset has been verified to be correct (unmodified StyleGAN2 also runs with it and was fine-tuned successfully), and the FFHQ .pkl is from the original source.

Can you please address the issue?

zsyzzsoft commented 4 years ago

It looks like the fmap-base value is inconsistent. Please try --fmap-base=16384. Note, however, that fine-tuning from the original source is not officially supported.
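
To see why the two shapes in the error differ, here is a minimal sketch of how the feature-map count per layer depends on fmap-base. It assumes the channel rule used by the StyleGAN2 generator, nf(stage) = clip(fmap_base / 2^stage, fmap_min, fmap_max), with fmap_decay=1.0, fmap_min=1 and fmap_max=512 taken as defaults; those values and the helper name conv0_up_shape are assumptions for illustration, not code from this repository.

```python
import math

def nf(stage, fmap_base, fmap_decay=1.0, fmap_min=1, fmap_max=512):
    # Assumed StyleGAN2-style rule: channels halve per stage, clipped to [fmap_min, fmap_max].
    return max(fmap_min, min(int(fmap_base / (2.0 ** (stage * fmap_decay))), fmap_max))

def conv0_up_shape(resolution, fmap_base):
    # Hypothetical helper: weight shape (kh, kw, in_channels, out_channels) of the
    # Conv0_up layer in the synthesis block for the given resolution.
    res_log2 = int(math.log2(resolution))
    return (3, 3, nf(res_log2 - 2, fmap_base), nf(res_log2 - 1, fmap_base))

# Network built with --fmap-base=8192: matches the tensor shape in the error.
print(conv0_up_shape(64, 8192))   # (3, 3, 512, 256)
# Network built with --fmap-base=16384: matches the value fed from the checkpoint.
print(conv0_up_shape(64, 16384))  # (3, 3, 512, 512)
```

Under these assumptions, a network built with --fmap-base=8192 expects 256 output channels at 64x64, while the weights stored in the checkpoint have 512, which is consistent with the ValueError above and with the suggested --fmap-base=16384.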