lezhang-thu / HelloWorld


tensorflow control logic notes #1

Open lezhang-thu opened 7 years ago

lezhang-thu commented 7 years ago

import tensorflow as tf

def cond(x):
    return tf.less(x,10)

def body(x):
    # reuse the variables created outside the loop body
    tf.get_variable_scope().reuse_variables()
    y=tf.get_variable('y')
    z=tf.get_variable('z')
    # append a 0 to z on every iteration; validate_shape=False lets the
    # variable's shape grow across iterations
    assign_op=tf.assign(z,tf.concat([z,tf.constant([0])],0),validate_shape=False)

    add_op=tf.assign_add(y,1)
    with tf.control_dependencies([add_op,assign_op]):
        return x+1

x=tf.get_variable('x',[],initializer=tf.constant_initializer(0))
y=tf.get_variable('y',[],initializer=tf.constant_initializer(0))
z=tf.get_variable('z',[1],initializer=tf.constant_initializer(0),dtype=tf.int32)
loop=tf.while_loop(cond,body,[x])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(y))
    # add=tf.assign_add(x,1)
    # sess.run(add)
    sess.run(loop)
    print(sess.run(y))
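
For comparison, a minimal sketch (not from the notes above) of the same counting done with loop variables only; tf.while_loop threads the loop variables through body and returns their final values, so no assign ops or control_dependencies are needed:

import tensorflow as tf

def cond(x, y):
    return tf.less(x, 10)

def body(x, y):
    # both the counter and the accumulator are plain loop variables
    return x + 1, y + 1

x_final, y_final = tf.while_loop(cond, body, [tf.constant(0), tf.constant(0)])

with tf.Session() as sess:
    print(sess.run([x_final, y_final]))  # both loop variables reach 10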
lezhang-thu commented 7 years ago

from __future__ import print_function

import tensorflow as tf

def cond(sequence_len, step):
    return tf.less(step,sequence_len)

def body(sequence_len, step): 

    begin = tf.get_variable("begin",[3],dtype=tf.int32,initializer=tf.constant_initializer(0))
    begin = tf.scatter_update(begin,1,step,use_locking=None)

    tf.get_variable_scope().reuse_variables()

    with tf.control_dependencies([begin]):
        return (sequence_len, step+1)

with tf.Graph().as_default():

    sess = tf.Session()
    step = tf.constant(0)
    sequence_len = tf.constant(10)
    _, step = tf.while_loop(cond,
                    body,
                    [sequence_len, step], 
                    parallel_iterations=10, 
                    back_prop=True, 
                    swap_memory=False, 
                    name=None)

    # reuse was enabled inside body, so this retrieves the variable created there
    begin = tf.get_variable("begin",[3],dtype=tf.int32)

    init = tf.global_variables_initializer()
    sess.run(init)

    print(sess.run([begin,step]))
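
The same per-step writes can also be done without a variable at all, using a TensorArray; a minimal sketch (names made up, not from the linked answers):

import tensorflow as tf

sequence_len = 10

def cond(step, ta):
    return tf.less(step, sequence_len)

def body(step, ta):
    # TensorArray ops are functional: write() returns the updated array,
    # which is threaded through the loop as an ordinary loop variable
    ta = ta.write(step, step)
    return step + 1, ta

ta0 = tf.TensorArray(dtype=tf.int32, size=sequence_len)
_, ta_final = tf.while_loop(cond, body, [tf.constant(0), ta0])
result = ta_final.stack()  # a [10] tensor holding 0..9

with tf.Session() as sess:
    print(sess.run(result))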
lezhang-thu commented 7 years ago

import tensorflow as tf

a=tf.placeholder(tf.int32,shape=[None])
b=tf.placeholder(tf.int32,shape=[None])

b_shape=tf.shape(b)

def cond(x):
    return tf.less(x,b_shape[0])

def body(x):
    # reuse the variables created outside the loop body
    tf.get_variable_scope().reuse_variables()
    index=tf.get_variable('index',dtype=tf.int32)
    c=tf.get_variable('result',dtype=tf.int32)
    # length of the current segment of a, read from b[x]
    length=tf.reshape(tf.slice(b,[x],[1]),[])
    # sum the segment a[index:index+length] and append it to the result;
    # validate_shape=False lets the 'result' variable grow
    segment_sum=tf.reduce_sum(tf.slice(a,[index],[length]))
    concat_op=tf.assign(c,tf.concat([c,tf.reshape(segment_sum,[1])],0),validate_shape=False)
    with tf.control_dependencies([concat_op]):
        add_op=tf.assign_add(index,length)

    with tf.control_dependencies([add_op]):
        return x+1

x=tf.get_variable('x',[],dtype=tf.int32,initializer=tf.constant_initializer(0))
index=tf.get_variable('index',[],dtype=tf.int32,initializer=tf.constant_initializer(0))
c=tf.get_variable('result',[1],dtype=tf.int32,initializer=tf.constant_initializer(0))

loop=tf.while_loop(cond,body,[x])

tf.get_variable_scope().reuse_variables()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(c))
    sess.run(loop,feed_dict={a:[0,1,2,3,4,5,6,7,8,9],b:[2,3,5]})
    print(sess.run(c))
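
A sketch (not from the notes) of the same segment sums computed with loop variables only, using shape_invariants so the accumulated tensor is allowed to grow across iterations:

import tensorflow as tf

a = tf.placeholder(tf.int32, shape=[None])
b = tf.placeholder(tf.int32, shape=[None])

def cond(x, index, acc):
    return tf.less(x, tf.shape(b)[0])

def body(x, index, acc):
    length = b[x]
    seg_sum = tf.reduce_sum(a[index:index + length])
    # the accumulator changes shape, so it needs a [None] shape invariant below
    acc = tf.concat([acc, tf.reshape(seg_sum, [1])], 0)
    return x + 1, index + length, acc

x0, index0 = tf.constant(0), tf.constant(0)
acc0 = tf.zeros([0], dtype=tf.int32)
_, _, sums = tf.while_loop(
    cond, body, [x0, index0, acc0],
    shape_invariants=[x0.get_shape(), index0.get_shape(), tf.TensorShape([None])])

with tf.Session() as sess:
    print(sess.run(sums, feed_dict={a: [0,1,2,3,4,5,6,7,8,9], b: [2,3,5]}))
    # e.g. [1 9 35] for these feeds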
lezhang-thu commented 7 years ago

https://stackoverflow.com/questions/37063081/how-does-the-tf-scatter-update-work-inside-the-while-loop

https://stackoverflow.com/questions/39859516/how-to-update-a-subset-of-2d-tensor-in-tensorflow?rq=1

https://github.com/tensorflow/tensorflow/issues/4638

https://stackoverflow.com/questions/37441140/how-to-use-tf-while-loop-in-tensorflow

lezhang-thu commented 7 years ago

It’s easier to think about LSTM as a function that maps a pair of vectors (state, input) to a pair of vectors (state, output), and those vectors generally have the same dimensionality. What actually happens inside depends on the parameters learned by the cell.

When you have several cells stacked, it’s the same thing as a sequential application of several functions of the same type but with different parameters. It’s not so different from a multi-layer perceptron. The purpose of using multilayer RNN cells is to learn more sophisticated conditional distributions (such as in neural machine translation (Bahdanau et al. 2014)).

In a single-layer RNN, the output is produced by passing it through a single hidden state, which fails to capture the hierarchical (think temporal) structure of a sequence. A multi-layered RNN captures such structure, which results in better performance.

Compare RNNs with a deep neural network (such as a CNN) for image recognition. From visualization research, we know that each layer in the network captures structure. For example, the initial layers find edges in an image or identify colors. The later layers build on this for more complex structure, such as intersections of edges or shades of color. The final layer then brings all of this together to identify the object in the image.

In a single-layered RNN, you have one hidden state doing all the work, so it is overwhelmed. If you are modeling a sequence such as text, then the parameters learn, say, that ‘a’ is more likely to follow ‘c’ than ‘o’. By introducing multiple layers, however, you let the RNN capture structure at different levels. The first layer might learn that some characters are vowels and others are consonants; the second layer would build on this to learn that a vowel is more likely to follow a consonant.
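
A minimal sketch of the stacking idea in TF1 (made-up sizes; assuming the tf.nn.rnn_cell API): each layer's output is the next layer's input, the joint state is one LSTMStateTuple per layer, and calling the stacked cell looks exactly like calling a single cell:

import tensorflow as tf

batch_size, input_dim, hidden_size, num_layers = 4, 8, 16, 2  # made-up sizes

cells = [tf.nn.rnn_cell.BasicLSTMCell(hidden_size) for _ in range(num_layers)]
stacked = tf.nn.rnn_cell.MultiRNNCell(cells)

inputs = tf.placeholder(tf.float32, [batch_size, input_dim])
state = stacked.zero_state(batch_size, tf.float32)

# one step: (input, state) -> (output, state), same interface as a single cell;
# internally layer 0 feeds layer 1, and `state` is a tuple of LSTMStateTuple(c, h)
output, state = stacked(inputs, state)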


state = cell.zero_state(...)
outputs = []
for input_ in inputs:
  output, state = cell(input_, state)
  outputs.append(output)
return (outputs, state)

The snippet above explains why `outputs` carries the plural ‘s’: note the repeated `append` calls.

Note that muupan's GitHub repo is referenced in the paper Adversarial Attacks on Neural Network Policies: github.com/muupan/async-rl. For DQN, the repo referenced is github.com/spragunr/deep_q_rl, but muupan cites miyosuda's GitHub, and the latter is the UNREAL reproducer.


    # Simplified version of models/tutorials/rnn/rnn.py's rnn().
    # This builds an unrolled LSTM for tutorial purposes only.
    # In general, use the rnn() or state_saving_rnn() from rnn.py.
    #
    # The alternative version of the code below is:
    #
    # inputs = tf.unstack(inputs, num=num_steps, axis=1)
    # outputs, state = tf.contrib.rnn.static_rnn(
    #     cell, inputs, initial_state=self._initial_state)
    outputs = []
    state = self._initial_state
    with tf.variable_scope("RNN"):
      for time_step in range(num_steps):
        if time_step > 0: tf.get_variable_scope().reuse_variables()
        (cell_output, state) = cell(inputs[:, time_step, :], state)
        outputs.append(cell_output)

Note: the code above shows concretely how the RNN unrolling is implemented. Good! Compare it with the implementation in tensorpack:

        def get_v(n):
            return tf.get_variable(n, [BATCH, HIDDEN_SIZE],
                                   trainable=False,
                                   initializer=tf.constant_initializer())
        self.state = state_var = \
            (rnn.LSTMStateTuple(get_v('c0'), get_v('h0')),
             rnn.LSTMStateTuple(get_v('c1'), get_v('h1')))

Note the [BATCH, HIDDEN_SIZE] shape in the code above; that makes it clear!
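
A small sketch of why those shapes make sense (assuming the tf.nn.rnn_cell API and made-up constants): an LSTM cell's state is an LSTMStateTuple of c and h, each with one row per batch element:

import tensorflow as tf

BATCH, HIDDEN_SIZE = 32, 256  # made-up constants

cell = tf.nn.rnn_cell.BasicLSTMCell(HIDDEN_SIZE)
print(cell.state_size)  # LSTMStateTuple(c=256, h=256)

# so persisting the state for a whole batch takes two [BATCH, HIDDEN_SIZE]
# variables per layer, which is exactly what get_v('c0'), get_v('h0'), ... create
state = cell.zero_state(BATCH, tf.float32)
print(state.c.get_shape(), state.h.get_shape())  # (32, 256) (32, 256)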