microsoft / CNTK

Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit
https://docs.microsoft.com/cognitive-toolkit/

Error on SRGAN Tutorial #3202

Open grenoblien opened 6 years ago

grenoblien commented 6 years ago

I am following the SRGAN tutorial in an IPython notebook.

I have no problem training SRResNet, but when I run the train_GAN function, I get the following error message.

RuntimeError                              Traceback (most recent call last)
<ipython-input-24-c49cdeac2702> in <module>()
----> 1 SRGAN_model = train_GAN(SRResNet_model, real_X, real_X_scaled, real_Y, real_Y_scaled)

<ipython-input-23-5ff0ac64780d> in train_GAN(SRResNet_model, real_X, real_X_scaled, real_Y, real_Y_scaled)
     34         D_trainer_loss = D_trainer.previous_minibatch_loss_average
     35 
---> 36         G_trainer.train_minibatch(batch_inps_X_Y)
     37         pp_G.update_with_trainer(G_trainer)
     38         G_trainer_loss = G_trainer.previous_minibatch_loss_average

~\Anaconda3\lib\site-packages\cntk\train\trainer.py in train_minibatch(self, arguments, outputs, device, is_sweep_end)
    182             else:
    183                 updated = super(Trainer, self).train_minibatch(arguments, is_sweep_end,
--> 184                     device)
    185 
    186         return updated

~\Anaconda3\lib\site-packages\cntk\cntk_py.py in train_minibatch(self, *args)
   3025 
   3026     def train_minibatch(self, *args):
-> 3027         return _cntk_py.Trainer_train_minibatch(self, *args)
   3028 
   3029     def save_checkpoint(self, *args):

RuntimeError: AddNodeToNet: Duplicated name for Parameter540 LearnableParameter operation.

[CALL STACK]
    > std::enable_shared_from_this<Microsoft::MSR::CNTK::MatrixBase>::  shared_from_this
    - std::enable_shared_from_this<Microsoft::MSR::CNTK::MatrixBase>::  shared_from_this (x3)
    - CNTK::Internal::  UseSparseGradientAggregationInDataParallelSGD (x12) 

In addition, I am trying to run train_GAN with a pre-trained SRResNet model, without running:

    SRResNet_model, real_X, real_X_scaled, real_Y, real_Y_scaled = train(SRResNet, LR_IMAGE_DIMS,
                                                                          HR_IMAGE_DIMS, build_SRResNet_graph)

To do so, I ran the following code.

    SRResNet_model = C.load_model(os.path.join(models_dir, "SRResNet.model"))
    (real_X, real_Y, genG, real_X_scaled, real_Y_scaled, G_optim, G_G_trainer) = build_SRResNet_graph(LR_IMAGE_DIMS, HR_IMAGE_DIMS, SRResNet)

    SRGAN_model = train_GAN(SRResNet_model, real_X, real_X_scaled, real_Y, real_Y_scaled)

But when I run this code, I get the following error message.

Traceback (most recent call last):
  File "SRGAN.py", line 492, in <module>
    SRGAN_model = train_GAN(SRResNet_model, real_X, real_X_scaled, real_Y, real_Y_scaled)
  File "SRGAN.py", line 396, in train_GAN
    D_trainer.train_minibatch(batch_inps_X_Y)
  File "C:\Users\NST\Anaconda3\envs\tf-gpu\lib\site-packages\cntk\train\trainer.py", line 184, in train_minibatch
    device)
  File "C:\Users\NST\Anaconda3\envs\tf-gpu\lib\site-packages\cntk\cntk_py.py", line 3027, in train_minibatch
    return _cntk_py.Trainer_train_minibatch(self, *args)
ValueError: Values for 1 required arguments 'Input('real_X', [#], [3 x 112 x 112])', that the requested output(s) 'Output('aggregateLoss', [], []), Output('Plus29114_Output_0', [#], [1])' depend on, have not been provided.

[CALL STACK]
    > CNTK::NDMask::  MaskedCount
    - CNTK::Internal::  UseSparseGradientAggregationInDataParallelSGD
    - CNTK::Function::  Forward
    - CNTK::  CreateTrainer
    - CNTK::Trainer::  TotalNumberOfUnitsSeen
    - CNTK::Trainer::  TrainMinibatch (x2)
    - PyInit__cntk_py (x2)
    - PyCFunction_Call
    - PyObject_Init
    - PyEval_EvalFrameEx
    - PyObject_Call
    - PyObject_GenericGetAttrWithDict
    - PyEval_EvalFrameEx
    - PyObject_Call

Please Help!!

Thank you in advance

ke1337 commented 6 years ago

Please note that a model's arguments are different after cloning. You need to feed data to the right model.arguments. The second error above shows a mismatch between the feed map passed to train_minibatch and model.arguments: no value was provided for the required input 'real_X'.
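
One way to avoid this kind of mismatch is to key the feed map on the argument objects the function actually reports, looked up by name, rather than on variables created elsewhere. The sketch below is only an illustration under assumptions: `build_feed_map` and `StandInVariable` are hypothetical names, the stand-in class replaces real CNTK input variables so the snippet runs without CNTK, `real_X` is taken from the error message, and `real_Y` as an argument name is assumed. It also assumes every argument has a unique, non-empty name.

```python
class StandInVariable:
    """Hypothetical stand-in for a CNTK input variable; only `.name` is used."""

    def __init__(self, name):
        self.name = name


def build_feed_map(model_arguments, data_by_name):
    """Map each of the model's own argument objects to the data sharing its name."""
    missing = [arg.name for arg in model_arguments if arg.name not in data_by_name]
    if missing:
        # Fail early, before the trainer reports unprovided arguments.
        raise KeyError("no data provided for arguments: " + ", ".join(missing))
    return {arg: data_by_name[arg.name] for arg in model_arguments}


# Stand-ins for the arguments of a loaded or cloned model.
arguments = (StandInVariable("real_X"), StandInVariable("real_Y"))

# The strings are placeholders for the actual minibatch arrays.
feed = build_feed_map(arguments, {"real_X": "lr_batch", "real_Y": "hr_batch"})
```

With CNTK, `model_arguments` would presumably be the `arguments` tuple of whichever function the trainer evaluates, and the resulting dictionary would be what gets passed to `train_minibatch`.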