Evolving-AI-Lab / ppgn

Code for paper "Plug and Play Generative Networks"

Not getting good results while using my own Condition Network #11

Closed asadabbas09 closed 5 years ago

asadabbas09 commented 6 years ago

I'm trying to use my own condition network and visualize some neurons in the conv5_2 layer. The network has different layer names, so I changed self.fc_layers and self.conv_layers in sampling_class.py and updated 3_hidden_conditional_sampling.sh accordingly.
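For reference, the change in sampling_class.py looks roughly like this (the layer names below are the standard VGG-Face ones and are just an example; use whatever names appear in your prototxt):

# Point the sampler at the condition network's actual layer names
# (illustrative values taken from the VGG-Face deploy prototxt).
self.fc_layers = ["fc6", "fc7", "fc8", "prob"]
self.conv_layers = ["conv5_1", "conv5_2", "conv5_3"]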

I tried sweeping over the epsilon1, epsilon2, epsilon3, and learning-rate parameters and ran 5000 iterations, but the network fails to generate good images with high output probabilities.

I'm not sure whether I'm sweeping over the right ranges; some of the values I have tried are listed below, followed by a sketch of the sweep loop:

lr=(0.0005 0.005 0.05 1) 
epsilon1=(5 1 1e-1 1e-3 1e-7 1e-11 1e-15)
epsilon2=(0.00001 0.0001 0.05 0.5 1 2)
epsilon3=(5 1 1e-1 1e-3 1e-7 1e-11)
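
The driver for the sweep is roughly the following (a hypothetical sketch: it assumes 3_hidden_conditional_sampling.sh takes the four values as positional arguments, which may not match how your copy of the script reads them):

import itertools
import subprocess

# Run one full sampling job for every combination of the values above.
lr       = [0.0005, 0.005, 0.05, 1]
epsilon1 = [5, 1, 1e-1, 1e-3, 1e-7, 1e-11, 1e-15]
epsilon2 = [1e-5, 1e-4, 0.05, 0.5, 1, 2]
epsilon3 = [5, 1, 1e-1, 1e-3, 1e-7, 1e-11]

for l, e1, e2, e3 in itertools.product(lr, epsilon1, epsilon2, epsilon3):
    subprocess.run(["./3_hidden_conditional_sampling.sh",
                    str(l), str(e1), str(e2), str(e3)], check=True)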

Is there anything else I need to change to get this working with my own condition network, or how would you recommend proceeding in this case?

anguyen8 commented 6 years ago

@asadabbas09 : Thanks for your question! What is the architecture of your condition network? (e.g. GoogLeNet or something?)

asadabbas09 commented 6 years ago

It's a VGG16 network trained for face recognition.

anguyen8 commented 6 years ago

@asadabbas09 : Usually it's very easy to make hidden neurons reach high activations / probabilities. Are you able to get high probabilities at all with e3=0?
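
For context, the sampling update in the paper has the form h_{t+1} = h_t + e1 * (prior gradient) + e2 * (condition gradient) + noise scaled by e3, so e3=0 simply turns the noise term off. A minimal sketch (illustrative names, not the repo's actual variables):

import numpy as np

def sampler_step(h, grad_prior, grad_cond, e1, e2, e3):
    # h_{t+1} = h_t + e1 * dlogp(h)/dh + e2 * dlogp(y|h)/dh + N(0, e3^2)
    noise = e3 * np.random.standard_normal(h.shape)  # e3 = 0 disables this
    return h + e1 * grad_prior + e2 * grad_cond + noise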

asadabbas09 commented 6 years ago

@anguyen8 I've tried e3=0 as well, but the condition_unit probability still doesn't get high. Most of the time best_unit repeats a pattern, and if I set condition_unit to that best_unit (248 in the case below) I do get some high activations, but in general there's no convergence.

For condition_unit 55:

step: 0479   max:   19 [0.49]    obj:   55 [0.00000000]  norm: [0.00]
step: 0480   max:   19 [0.52]    obj:   55 [0.00000000]  norm: [0.00]
step: 0481   max:   19 [0.58]    obj:   55 [0.00000000]  norm: [0.00]
step: 0482   max:   21 [0.54]    obj:   55 [0.00000000]  norm: [0.00]
step: 0483   max:   21 [0.78]    obj:   55 [0.00000000]  norm: [0.00]
step: 0484   max:   21 [0.82]    obj:   55 [0.00000000]  norm: [0.00]
step: 0485   max:   21 [0.88]    obj:   55 [0.00000000]  norm: [0.00]
step: 0486   max:   21 [0.81]    obj:   55 [0.00000000]  norm: [0.00]
step: 0487   max:   21 [0.54]    obj:   55 [0.00000000]  norm: [0.00]
step: 0488   max:  248 [0.40]    obj:   55 [0.00000000]  norm: [0.00]
step: 0489   max:  248 [0.51]    obj:   55 [0.00000000]  norm: [0.00]
step: 0490   max:  248 [0.69]    obj:   55 [0.00000000]  norm: [0.00]
step: 0491   max:  248 [0.83]    obj:   55 [0.00000000]  norm: [0.00]
step: 0492   max:  248 [0.84]    obj:   55 [0.00000000]  norm: [0.00]
step: 0493   max:  248 [0.85]    obj:   55 [0.00000000]  norm: [0.00]
step: 0494   max:  248 [0.89]    obj:   55 [0.00000000]  norm: [0.00]
step: 0495   max:  248 [0.91]    obj:   55 [0.00000000]  norm: [0.00]
step: 0496   max:  248 [0.91]    obj:   55 [0.00000000]  norm: [0.00]
step: 0497   max:  248 [0.91]    obj:   55 [0.00000000]  norm: [0.00]
step: 0498   max:  248 [0.89]    obj:   55 [0.00000000]  norm: [0.00]
step: 0499   max:  248 [0.88]    obj:   55 [0.00000000]  norm: [0.00]
step: 0500   max:  248 [0.87]    obj:   55 [0.00000000]  norm: [0.00]

If I set condition_unit to 248, I get a similar pattern again, but with some samples for the 248th neuron at the end:

step: 0479   max:   19 [0.49]    obj:  248 [0.00000011]  norm: [0.00]
step: 0480   max:   19 [0.52]    obj:  248 [0.00000035]  norm: [0.00]
step: 0481   max:   19 [0.58]    obj:  248 [0.00000137]  norm: [0.00]
step: 0482   max:   21 [0.54]    obj:  248 [0.00001180]  norm: [0.00]
step: 0483   max:   21 [0.78]    obj:  248 [0.00012804]  norm: [0.00]
step: 0484   max:   21 [0.82]    obj:  248 [0.00143779]  norm: [0.00]
step: 0485   max:   21 [0.88]    obj:  248 [0.02325740]  norm: [0.00]
step: 0486   max:   21 [0.81]    obj:  248 [0.11571622]  norm: [0.00]
step: 0487   max:   21 [0.54]    obj:  248 [0.29134911]  norm: [0.00]
step: 0488   max:  248 [0.40]    obj:  248 [0.40160143]  norm: [0.00]
step: 0489   max:  248 [0.51]    obj:  248 [0.50645238]  norm: [0.00]
step: 0490   max:  248 [0.69]    obj:  248 [0.68800771]  norm: [0.00]
step: 0491   max:  248 [0.83]    obj:  248 [0.82597047]  norm: [0.00]
step: 0492   max:  248 [0.84]    obj:  248 [0.84389746]  norm: [0.00]
step: 0493   max:  248 [0.85]    obj:  248 [0.85142034]  norm: [0.00]
step: 0494   max:  248 [0.89]    obj:  248 [0.89233971]  norm: [0.00]
step: 0495   max:  248 [0.91]    obj:  248 [0.90517473]  norm: [0.00]
step: 0496   max:  248 [0.91]    obj:  248 [0.90765899]  norm: [0.00]
step: 0497   max:  248 [0.91]    obj:  248 [0.90551907]  norm: [0.00]
step: 0498   max:  248 [0.89]    obj:  248 [0.89001757]  norm: [0.00]
step: 0499   max:  248 [0.88]    obj:  248 [0.87751538]  norm: [0.00]
step: 0500   max:  248 [0.87]    obj:  248 [0.87093037]  norm: [0.00]

I've also tried a couple of other caffe models; only one of them (with two output classes) converges, and only for output neurons. I must be missing something, because the places205 and bvlc_reference_caffenet models work for many different values of e1, e2, e3, and lr.

anguyen8 commented 6 years ago

@asadabbas09 : One known problem is that optimization becomes less effective when a neuron sits in a very deep layer (e.g. in ResNet); it's harder to get such a neuron highly activated. However, VGG is not that deep, so I guess it must be something specific to the model you're using.
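
One sanity check worth doing (a rough sketch; the file names below are placeholders for your model and data): run the condition network by itself on a real image of the target identity and confirm that the expected unit is the argmax. That rules out a unit-indexing or preprocessing mismatch before tuning the sampler any further.

import caffe
import numpy as np

# Forward one preprocessed image (mean-subtracted, C x H x W) through the
# condition network alone and report which output unit fires strongest.
net = caffe.Net("vgg16_face_deploy.prototxt", "vgg16_face.caffemodel", caffe.TEST)
img = np.load("preprocessed_face.npy")  # placeholder for your input
net.blobs["data"].data[0] = img
net.forward()
probs = net.blobs["prob"].data[0]
print("argmax unit:", probs.argmax(), "prob:", probs.max())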