Closed Moran232 closed 5 years ago
It depends on which learner you are using. For instance, for WeightSparseLearner, the state is computed in learners/weight_sparsification/rl_helper.py.
Basically, we follow the AMC paper (He et al., ECCV 2018) with a few modifications in defining the state vector.
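For readers unfamiliar with AMC-style state vectors: a minimal sketch of building one per-layer state, with 5 features, is shown below. The function name and exact feature choice here are illustrative assumptions, not the repository's actual definition.

```python
import numpy as np

def build_layer_state(layer_idx, nb_layers, var_shape, prune_ratio_prev):
    """Build an AMC-style per-layer state vector (illustrative only).

    Features: normalized layer index, spatial kernel size, # of input
    channels, # of output channels, and the previous layer's action.
    """
    kh, kw, c_in, c_out = var_shape
    return np.array([
        layer_idx / nb_layers,   # where we are in the network
        kh * kw,                 # spatial kernel size
        c_in,                    # number of input channels
        c_out,                   # number of output channels
        prune_ratio_prev,        # pruning ratio chosen for the previous layer
    ], dtype=np.float32)

# state for the 3rd of 10 layers, a 3x3 conv with 64 -> 128 channels
state = build_layer_state(2, 10, (3, 3, 64, 128), 0.5)
```

In practice each feature would also be normalized to a comparable range before being fed to the DDPG agent.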
Thanks. One more thing: can you explain what maskable_vars (the list of maskable variables) is? In learners/weight_sparsification/rl_helper.py, I guess it refers to the layers?
Which line are you referring to (or, can you post the corresponding code block)?
```python
import numpy as np
import tensorflow as tf

class RLHelper(object):
  """Reinforcement learning helper for the weight sparsification learner."""

  def __init__(self, sess, maskable_vars, skip_head_n_tail):
    """Constructor function.

    Args:
    * sess: TensorFlow session
    * maskable_vars: list of maskable variables
    * skip_head_n_tail: whether to skip the head & tail layers
    """

    # obtain the shape & # of parameters of each maskable variable
    nb_vars = len(maskable_vars)
    var_shapes = []
    self.prune_ratios = np.zeros(nb_vars)
    self.nb_params_full = np.zeros(nb_vars)
    for idx, var in enumerate(maskable_vars):
      var_shape = sess.run(tf.shape(var))
      assert var_shape.size in [2, 4], '# of variable dimensions is %d (invalid)' % var_shape.size
      if var_shape.size == 2:
        var_shape = np.hstack((np.ones(2), var_shape))
      var_shapes += [var_shape]
      self.nb_params_full[idx] = np.prod(var_shape)
```
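The 2-D branch in the constructor pads fully-connected weight shapes to 4-D so they can be handled uniformly with convolution kernels. A small standalone check of that padding step (pure NumPy, no session needed):

```python
import numpy as np

# a fully-connected layer's weight matrix, e.g. 512 inputs x 10 outputs
var_shape = np.array([512, 10])

# pad with two leading 1s so it matches a conv kernel's 4-D layout
if var_shape.size == 2:
    var_shape = np.hstack((np.ones(2), var_shape))

# var_shape is now [1, 1, 512, 10]; the parameter count is unchanged
nb_params = np.prod(var_shape)  # 5120
```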
maskable_vars in weight sparsification
For WeightSparseLearner, maskable_vars refers to the list of weight variables to which a pruning mask can be applied, i.e. the prunable layers' weights.
Hi, I noticed that you calculate the prune_ratios in one roll-out in calc_rlout_actions(self), then pass them to calc_optimal_prune_ratios(self) to calculate the reward.
Does this mean the agent can only get the reward after it has finished pruning all the layers?
The functions above are from weight_sparsification/pr_optimizer.py.
Yes, the DDPG agent can only get the reward after finishing pruning all the layers. The reward depends on the classification accuracy of the model with all layers pruned with certain pruning ratios.
Thanks for the reply. Since the DDPG agent can only get the reward after finishing pruning all the layers, what is the reward before finishing? Is it set to zero?
Reward is not needed during the roll-out. Here, a roll-out refers to the process of determining pruning ratios of all layers.
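To illustrate this episode structure, here is a minimal sketch of a roll-out where only a single terminal reward exists. All names (`agent`, `helper.calc_state`, `evaluate_fn`) are hypothetical stand-ins, not PocketFlow's actual API:

```python
def run_rollout(agent, helper, nb_layers, evaluate_fn):
    """One roll-out: pick a pruning ratio per layer, then score once.

    No intermediate reward is computed; the only reward comes from
    evaluating the model after ALL layers have been assigned a ratio.
    """
    prune_ratios = []
    for idx in range(nb_layers):
        state = helper.calc_state(idx)         # state vector for layer idx
        prune_ratios.append(agent.act(state))  # DDPG action = pruning ratio
    # the single terminal reward, e.g. accuracy of the fully-pruned model
    reward = evaluate_fn(prune_ratios)
    return prune_ratios, reward
```

The DDPG agent's transitions can then all be credited with this one terminal reward when the episode is stored for training.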
Regarding the states of DDPG you mentioned: which part of your code defines the state vector and passes it to the DDPG agent? And why did you choose these 5 factors as the state?