Johnson-yue opened this issue 5 years ago
Hi,
Sorry for the delay, I was on holidays.
If I'm correct you're using the DCGAN mode? In that case the BCE isn't computed on G_loss + gdpp_loss, but on the score given by D.
To sum up you have:
G_loss = BCE(D(G), True) + gdpp_loss (+ others depending on your config)
D_loss = BCE(D(G), False) + BCE(D(ground truth images), True) (+ others depending on your config)
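In PyTorch terms the layout is roughly the following (toy modules and a stand-in gdpp term, not the actual code of models/base_GAN.py):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Minimal sketch of the loss layout described above (toy networks,
# placeholder names; not the exact code from this repo).
netG = nn.Linear(16, 32)           # toy "generator"
netD = nn.Linear(32, 1)            # toy "discriminator"

noise = torch.randn(8, 16)
real_images = torch.randn(8, 32)
gdpp_loss = torch.tensor(-0.3)     # stand-in for a (possibly negative) GDPP term

fake = netG(noise)
d_fake = torch.sigmoid(netD(fake))                 # D's score in (0, 1)
d_real = torch.sigmoid(netD(real_images))

# BCE is applied to D's score only, never to G_loss + gdpp_loss,
# so a negative GDPP term cannot push BCE out of its valid input range.
g_loss = F.binary_cross_entropy(d_fake, torch.ones_like(d_fake)) + gdpp_loss
d_loss = F.binary_cross_entropy(d_fake.detach(), torch.zeros_like(d_fake)) \
       + F.binary_cross_entropy(d_real, torch.ones_like(d_real))

g_loss.backward()                                  # works even if g_loss < 0
```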
So having negative values shouldn't make your model crash. Can you show me your configuration file?
Yes, I'm using DCGAN to check that it works, but it failed. I will check my configuration and reply to you.
@Molugan I am just following the README step by step.
python datasets.py cifar10 $PATH_TO_CIFAR10 -o $OUTPUT_DATASET
python train.py PGAN -c config_cifar10.json --restart -n cifar10 --GDPP true
I also set _C.GDPP = True in models/trainer/standard_configurations/dcgan_config.py.
But when I enable the --GDPP configuration, I get a runtime error saying there is no compute graph and to use retain_graph=True on the first backward.
I think the error occurs in models/base_GAN.py: when G_loss is backwarded the computation graph gets released, so when the GDPP config is True and the GDPP loss backwards through the tensor phiGFake, the graph has already been freed!
Can you add the option retain_graph=True ?
No, the source code does not have retain_graph=True. I just set the GDPP configuration to True to test whether GDPP works or not.
@Molugan How should the gdpp loss be used? If I want to use the gdpp loss in DCGAN:
step 1: compute G_loss
step 2: G_loss.backward(retain_graph=True)
step 3: compute gdpp_loss
step 4: gdpp_loss.backward()
Is that right?
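In other words, something like this minimal sketch (toy modules, just to show the backward pattern I mean):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy modules standing in for G and D; only the backward pattern matters here.
netG, netD = nn.Linear(16, 32), nn.Linear(32, 1)
fake = netG(torch.randn(8, 16))
d_fake = torch.sigmoid(netD(fake))

# step 1: compute G_loss
g_loss = F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
# step 2: first backward keeps the graph alive for the second one
g_loss.backward(retain_graph=True)
# step 3: compute the gdpp-style term on features of the fake batch
gdpp_loss = -fake.var()            # stand-in for the real GDPP computation
# step 4: second backward reuses (part of) the same graph
gdpp_loss.backward()
```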
Why do you update G_loss and gdpp_loss separately? In the paper and the original TensorFlow code, G_loss = G_loss + gdpp_loss and backward is called only once on G_loss.
Doing the backward in two steps does not change the result, though it can affect the execution time. In this case it allows a more modular architecture.
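For example, on a toy variable the two schemes accumulate identical gradients (nothing repo-specific here):

```python
import torch

w = torch.tensor(2.0, requires_grad=True)
a, b = w * 3.0, w * w          # two loss terms depending on w

# Combined backward, as in the paper / TF code.
(a + b).backward(retain_graph=True)
combined = w.grad.clone()

# Separate backwards, as in this repo.
w.grad = None
a.backward(retain_graph=True)
b.backward()
print(torch.allclose(combined, w.grad))   # True: gradients are identical
```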
Please do not post the same message in different issues.
@Molugan Oh, sorry. But when I try to train with the gdpp loss there are many problems. Did you manage to train with the gdpp loss successfully, and did it improve performance?
Sorry for the delay, busy weeks with many deadlines.
I had several successful trainings with gdpp: it should improve the SWD score.
Can you show some logs of training with the gdpp loss? I cannot use it to train any model. Error in epoch 19:
RuntimeError: reduce failed to synchronize: cudaErrorAssert: device-side assert triggered
I only trained DCGAN with MNIST, but it failed.
/pytorch/aten/src/THCUNN/BCECriterion.cu:42: Acctype bce_functor<Dtype, Acctype>::operator()(Tuple) [with Tuple = thrust::detail::tuple_of_iterator_references<thrust::device_reference<float>, thrust::device_reference<float>, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type>, Dtype = float, Acctype = float]: block: [0,0,0], thread: [81,0,0] Assertion `input >= 0. && input <= 1.` failed.
(the same assertion is repeated for threads [82,0,0] through [95,0,0])
Traceback (most recent call last):
File "main.py", line 166, in <module>
main()
File "main.py", line 158, in main
gan.train()
File "/media/yue/Backup_Data/home_DeepLearning/Zi2Zi/pytorch-generative-model-collections/GAN_GDPP.py", line 198, in train
D_fake_loss = self.BCE_loss(D_fake, self.y_fake_)
File "/home/yue/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/home/yue/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 498, in forward
return F.binary_cross_entropy(input, target, weight=self.weight, reduction=self.reduction)
File "/home/yue/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/functional.py", line 2065, in binary_cross_entropy
input, target, weight, reduction_enum)
RuntimeError: reduce failed to synchronize: cudaErrorAssert: device-side assert triggered
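The assertion means the input given to binary_cross_entropy was outside [0, 1], typically a raw score or a NaN coming out of the discriminator. A generic workaround, independent of this repo, is to squash the score or use the logits form of the loss:

```python
import torch
import torch.nn.functional as F

d_out = torch.randn(8, 1) * 10          # raw discriminator scores (logits)
target = torch.ones_like(d_out)

# BCE expects probabilities in [0, 1]; raw scores trip the CUDA assert above.
# loss = F.binary_cross_entropy(d_out, target)          # would fail

# Either squash the scores first...
loss_a = F.binary_cross_entropy(torch.sigmoid(d_out), target)
# ...or, more numerically stable, feed the logits directly:
loss_b = F.binary_cross_entropy_with_logits(d_out, target)
```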
Then, my code is:
Hi, I was trying your gdpp loss with the PyTorch version, but when I add the gdpp loss to G_loss, the training process crashes, because G_loss (~1.0) + gdpp_loss is negative. So it can't compute the BCE loss.
My question is: why is the gdpp loss always negative, and does that make sense?