staer-tan closed 5 years ago
Hi! Thank you for your reply. I have another question: do you have any suggestions for creating two models of the same DSRG architecture while the program is running?
Do you mean two identical models with different parameters? It seems that `tf.Graph()` can help you with that configuration, but I'm not familiar with this usage.
Thank you very much for your reply. Sorry to bother you again. After reading through your code, I think using the EMA (Exponential Moving Average) method might improve the performance of the model. But I found that almost all functions in DSRG.py are defined in a class and access parameters through `self.`. So, if I want to use the EMA method on your model, how can I modify the code?
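For readers landing here, a minimal sketch of the `tf.Graph()` idea: each model is built in its own graph and run in its own session, so the two copies keep separate parameters even though the variable names are identical. `build_tiny_model` is a hypothetical stand-in for the DSRG class, not code from this repository, and the sketch assumes a TensorFlow version that provides `tf.compat.v1`.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # TF1-style API; assumes TensorFlow >= 1.13 or 2.x is installed


def build_tiny_model(init_value):
    """Hypothetical stand-in for the DSRG network: one weight, one output."""
    graph = tf.Graph()
    with graph.as_default():
        # Everything created inside this block belongs to `graph` only.
        x = tf1.placeholder(tf.float32, shape=[], name="x")
        w = tf1.get_variable(
            "w", shape=[], initializer=tf1.constant_initializer(init_value)
        )
        y = x * w
        init = tf1.global_variables_initializer()
    sess = tf1.Session(graph=graph)  # one session per graph
    sess.run(init)
    return sess, x, y


# Two models with the same architecture and variable names, different parameters.
sess_a, x_a, y_a = build_tiny_model(2.0)
sess_b, x_b, y_b = build_tiny_model(5.0)

out_a = sess_a.run(y_a, feed_dict={x_a: 3.0})
out_b = sess_b.run(y_b, feed_dict={x_b: 3.0})
print(out_a, out_b)

sess_a.close()
sess_b.close()
```

Tensors from one graph cannot be fed to or fetched from the other session, so each model needs its own placeholders, savers, and `sess.run()` calls.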
Maybe you can use the EMA as follows:
```python
self.net["iteration"] = tf.placeholder(tf.float32)
poly_rate = tf.pow(1 - self.net["iteration"] / self.config["max_iterations"], 0.9)
self.net["lr"] = base_lr * poly_rate
opt = tf.train.MomentumOptimizer(self.net["lr"], momentum)
```
and pass the current iteration through `feed_dict` in each `sess.run()` call.
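Note that the snippet above computes a polynomial ("poly") decay of the learning rate rather than a moving average of the weights. A plain-Python sketch of the same formula may make the schedule easier to inspect; the function name and the numeric values are illustrative assumptions, not taken from the DSRG configuration.

```python
def poly_learning_rate(base_lr, iteration, max_iterations, power=0.9):
    """Polynomial decay: base_lr * (1 - iteration / max_iterations) ** power."""
    if not 0 <= iteration <= max_iterations:
        raise ValueError("iteration must lie in [0, max_iterations]")
    return base_lr * (1.0 - iteration / max_iterations) ** power


# Illustrative values only.
base_lr = 1e-3
max_iterations = 10000
schedule = [
    poly_learning_rate(base_lr, it, max_iterations) for it in (0, 5000, 10000)
]
print(schedule)
```

The rate starts at `base_lr`, shrinks monotonically, and reaches zero at `max_iterations`.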
Thank you very much for your timely reply. Your method seems to use the exponential moving average to change the learning rate. What if I only want to apply the exponential moving average to the network parameters (weights, biases) at each iteration? How can I modify the code?
Oh, I have no experience with applying the EMA policy to the weights. You may refer to `tf.train.ExponentialMovingAverage` for detailed information.
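As a rough illustration of what an EMA over the weights does: a "shadow" copy of each parameter is kept and, after every training step, updated as `shadow = decay * shadow + (1 - decay) * value`, which is the update rule `tf.train.ExponentialMovingAverage` applies. The sketch below is plain Python under stated assumptions: the `WeightEMA` class and the dict-of-floats weights are hypothetical stand-ins, not part of DSRG or TensorFlow.

```python
class WeightEMA:
    """Keeps an exponential moving average ("shadow" copy) of named weights."""

    def __init__(self, weights, decay=0.999):
        if not 0.0 <= decay < 1.0:
            raise ValueError("decay must lie in [0, 1)")
        self.decay = decay
        self.shadow = dict(weights)  # start from the current weight values

    def update(self, weights):
        """Call once after every optimizer step."""
        for name, value in weights.items():
            self.shadow[name] = (
                self.decay * self.shadow[name] + (1.0 - self.decay) * value
            )
        return self.shadow


# Toy parameters standing in for the network's weights and biases.
weights = {"w": 1.0, "b": 0.0}
ema = WeightEMA(weights, decay=0.9)

weights["w"] = 2.0  # pretend one optimizer step changed the weight
ema.update(weights)
print(ema.shadow)
```

In TensorFlow 1.x the usual pattern is to call `ema.apply(tf.trainable_variables())` under `tf.control_dependencies([train_op])`, then evaluate with the shadow values, for example by restoring them through `tf.train.Saver(ema.variables_to_restore())`.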
Hi @xtudbxk, thank you for your reply. The previous question has been solved. But now I have another problem: what should I do if I want to use the retrain step?
Oh, it's wonderful news that you solved all the problems with applying the EMA policy. I'm curious whether the EMA policy helps your experiment. For the retrain step, you may need to extract the segmentation results for all the images and then train the baseline network without DSRG, using those segmentation results as supervision. You can find the detailed information here.
@xtudbxk, sorry for the late reply. The most obvious effect of using the EMA method is that the mIoU on the validation set reaches 56.5 at the 16th epoch. But the final result is still 56.8.
Thanks~
Hi @xtudbxk, have you tried retraining the model with TensorFlow? I have been building my retrain model these days, but it has not been successful so far. If you can, could you provide the retrain code? Thanks~
Hi, there are some questions I'd like to ask you