cuixue commented 7 years ago

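Right now both CNN and RNN inputs are fed one batch_size at a time. The problem is that the last batch of the dataset may contain fewer than batch_size samples. What should I do? In MatConvNet the last batch is allowed to be smaller than batch_size. As far as I can see, the tutorial simply skips the last batch. What about at test time, is it skipped there too? The tutorial example is not very good on this point. Maybe I just haven't read much TensorFlow code, but for LSTMs in particular the batch size has to be fixed in advance: state_init_R = tf.tile(init_R, [batch_size, 1]) requires a concrete batch_size. I asked around and was told Theano is more flexible here. I see you have used both Theano and TensorFlow, so you probably understand this well. This has been bothering me for a while and I haven't found a good solution. What do you think? Thanks!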

white127 commented 7 years ago

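I suggest reading up on stochastic gradient descent, which is what batch_size relates to. The batch size does not need to stay the same on every step, and you can also build each batch by random sampling.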

cuixue commented 7 years ago

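I understand SGD. What I want to know is whether, as far as you know, TF can handle the case where the last batch has fewer than batch_size samples.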

460130107 commented 7 years ago

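Yes, it can. batch_size only sets how many training samples x are read per training step; it does not affect training the LSTM model. You do of course need a suitable batch_size and learning_rate for the model to converge quickly and reliably.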

white127 commented 7 years ago

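If the last batch is short, you can pad it with other data, and then it can be processed. If you really need to handle batches of different sizes, that is possible too; it depends on whether your code supports a dynamic batch.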

white127 commented 7 years ago

I don't remember whether the TF framework can handle a dynamic batch; I have always used a fixed size.

cuixue commented 7 years ago

This is part of my code. I found some material saying the batch size can follow the input dimension, but it has to be initialized in a way that TF accepts. For example, this is my code:

    input_data = tf.placeholder(tf.int32, [None, num_steps])
    with tf.variable_scope('forward'):
        cellL = tf.nn.rnn_cell.BasicLSTMCell(hidden_size, forget_bias=1.0)
        state_init_L = tf.get_variable("init_L", initializer=cellL.zero_state(tf.shape(input_data)[0], tf.float32))

The error is: ValueError: initial_value must have a shape specified: Tensor("model/init_variable_L/zeros:0", shape=(?, 100), dtype=float32, device=/device:GPU:0)

Thanks~

white127 commented 7 years ago

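"initial_value must have a shape specified": doesn't that mean the initial value must have a concrete shape? Using variable-length shapes in TF seemed like quite a hassle to me. I think I tried it before, found it hard to get working, and didn't look into it further.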

xiaorongfan commented 7 years ago

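I have been using TensorFlow recently too, so let me add a few words. I haven't experimented with variable batch sizes, so I can't comment on that. I have read articles saying it works, though, and I'll report back once I try it. As for your earlier question, how to make sure all the data is used at test time, there are two solutions. The first is the one mentioned above: when feeding data, if you reach the tail of the dataset, pad it with enough data to match batch size and num steps, for example by reusing data from the beginning. The second is what I do: at test time I set batch size to 1 and use a variable-length num steps, which guarantees that all the data is covered. My model is a variable-length sequence model, of course. Hope this helps a little.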
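The first option above (padding the tail by reusing data from the beginning) could be sketched roughly as follows. This is a minimal plain-Python illustration, not code from the repository; the helper name fixed_size_batches and the returned n_real count are assumptions made for the example.

```python
def fixed_size_batches(samples, batch_size):
    """Yield (batch, n_real) pairs where every batch has exactly batch_size items.

    Hypothetical helper: a short final batch is filled by cycling through
    the data from the beginning. n_real is the number of items in the batch
    that are not padding.
    """
    if not samples:
        return
    for start in range(0, len(samples), batch_size):
        batch = list(samples[start:start + batch_size])
        n_real = len(batch)
        i = 0
        # Fill the shortfall with samples taken from the start of the data.
        while len(batch) < batch_size:
            batch.append(samples[i % len(samples)])
            i += 1
        yield batch, n_real


batches = list(fixed_size_batches(list(range(10)), batch_size=4))
# The last batch holds the real items 8 and 9 plus the reused items 0 and 1.
```

Keeping n_real alongside each batch makes it possible to ignore the reused samples later, for instance when computing test metrics.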

cuixue commented 7 years ago

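Hmm. In my experiments I found that not shuffling the data each epoch gives better results than shuffling each epoch. I don't know whether you have seen this too. So I decided not to shuffle, but then, if the network cannot vary the batch size, some training data never gets used. At test time, padding with a bit of data is fine. But what I want now is to solve the problem rather than work around it. I asked around, and variable batch sizes seem to work well in Theano. I also think the backend implementation shouldn't be very hard; the gradient computation just has to take the batch size into account on each step. I looked at the TensorFlow documentation, and it says this can be handled. So, as with the problem I hit in my code, I think there should be a way to solve it. Or maybe written another way it would pass in TensorFlow; I just don't know enough yet to solve it.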

xiaorongfan commented 7 years ago

If you want a variable batch_size, I have tried the statement below and it works. You just need to feed a value for it on every feed: self.batch_size = tf.placeholder(tf.int32, [])

xyzhang16 commented 6 years ago

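@xiaorongfan This approach works! But in some cases batch_size still has to be a concrete value, otherwise you get an error. For example:

```python
decoder = tf.contrib.seq2seq.BasicDecoder(decoder_cell, 
                                          helper, 
                                          initial_state=decoder_cell.zero_state(batch_size, tf.float32).clone(cell_state=encoder_state), 
                                          output_layer=projection_layer)
```
If batch_size is a placeholder, this raises an error.
Personally I think the best way to handle a variable batch_size is to define batch_size as a hyperparameter, and then at inference time pad any batch that falls short of a full batch.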
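A rough sketch of the pad-at-inference idea: pad the short final batch up to the fixed size, run the model, then discard the outputs that belong to padding. Everything here is an assumption for illustration: predict_all, pad_value and toy_model are hypothetical names, and toy_model merely stands in for a graph built with a fixed batch size.

```python
def predict_all(predict_batch, samples, batch_size, pad_value=0):
    """Run a fixed-batch-size predictor over every sample.

    Hypothetical helper: predict_batch is any callable that only accepts
    exactly batch_size inputs and returns one output per input.
    """
    outputs = []
    for start in range(0, len(samples), batch_size):
        chunk = list(samples[start:start + batch_size])
        n_real = len(chunk)
        # Pad the short final batch up to the size the model expects.
        chunk += [pad_value] * (batch_size - n_real)
        preds = predict_batch(chunk)
        # Keep only the predictions that belong to real samples.
        outputs.extend(preds[:n_real])
    return outputs


def toy_model(batch):
    # Stand-in for a model that rejects any batch that is not size 4.
    assert len(batch) == 4
    return [x * 2 for x in batch]


result = predict_all(toy_model, [1, 2, 3, 4, 5, 6], batch_size=4)
print(result)  # [2, 4, 6, 8, 10, 12]
```

With this pattern no test sample is dropped, and the padded rows never reach the final predictions.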

bringtree commented 6 years ago

But what do you do at prediction time (not training)? You can't just throw away the leftover samples.

bringtree commented 6 years ago

I wonder whether setting it to [None] would raise an error. I'll try it tonight.

wwwzrb commented 6 years ago

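I ran into the same problem. What if I want to predict a single label at prediction time? If the batch is too small, Theano throws an exception because of its optimizations!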

bringtree commented 6 years ago

@wwwzrb In TF, setting it to [None] works.

huige555551 commented 6 years ago

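@cuixue You can set it up like this in TF:

    # Features and Labels
    features = tf.placeholder(tf.float32, [None, n_input])
    labels = tf.placeholder(tf.float32, [None, n_classes])

The None stands for whatever batch size you feed in each time, such as 128 or 256; it is a dimension of the tensor received through tf.placeholder(). If the last batch has fewer than 128 samples, say 104, it can be fed into that None dimension as well.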
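On the data side, feeding a shorter final batch only requires that the batching code does not drop or pad the tail. A minimal plain-Python sketch, with no TensorFlow involved; iter_batches and the toy data are hypothetical, and 360 samples are chosen so that the last batch has the 104 items mentioned above:

```python
def iter_batches(features, labels, batch_size):
    """Yield (features, labels) slices; the final slice may be shorter."""
    assert len(features) == len(labels)
    for start in range(0, len(features), batch_size):
        end = start + batch_size
        yield features[start:end], labels[start:end]


# Toy data: 360 samples, so batches of 128 leave a final batch of 104.
toy_features = [[float(i)] for i in range(360)]
toy_labels = [i % 2 for i in range(360)]

sizes = [len(x) for x, _ in iter_batches(toy_features, toy_labels, 128)]
print(sizes)  # [128, 128, 104]
```

Each yielded pair would then be passed to placeholders whose first dimension is None, as described in the comment above.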

LiuQL2 commented 5 years ago

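When initializing the state in a TF RNN you have to specify the batch size, but the batch size should differ between training and testing, and during training the last batch is not necessarily batch_size in size either, so TF raises an error. I studied this for a while and still haven't solved it. Several people above said batch_size can be made a placeholder; I tested that and it doesn't seem to work. I'm still waiting for an expert's method here.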

bringtree commented 5 years ago

@LiuQL2 Are you just ignoring what I said? - - Just set the variable to None directly, without a placeholder.

bringtree commented 5 years ago

@LiuQL2 Are you working on LDA?

mynewstart commented 4 years ago

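> If you want a variable batch_size, I have tried the statement below and it works. You just need to feed a value for it on every feed: self.batch_size = tf.placeholder(tf.int32, [])

Yes, I tried this too, and it works.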

JianhaoLuo commented 4 years ago

I also hit an error when doing text classification with Keras because the data size is not divisible by batch_size. How can I solve this?

white127 / QA-deep-learning

Hello, a question about batch_size #5
