about conv1d - Githubissues

wuxiyu commented 6 years ago

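[batch, in_width, in_channels]: a question about this. Is the embedding dimension being used directly as the number of channels here? And what if conv2d were used instead?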

gaussic commented 6 years ago

You could really find the answer to this by experimenting yourself, but I'll lay out a few things here anyway:

import tensorflow as tf
input_x = tf.placeholder(tf.int32, [None, 400], name='input_x')

At this point, our input is a document of length 400 (that is, 400 characters).

with tf.device('/cpu:0'):
    embedding = tf.get_variable('embedding', [5000, 50])
    embedding_inputs = tf.nn.embedding_lookup(embedding, input_x)

print(embedding_inputs)

The vocabulary size is 5000 and the embedding dimension is 50. The output is:

Tensor("embedding_lookup:0", shape=(?, 400, 50), dtype=float32, device=/device:CPU:0)

As you can see, embedding_inputs has shape (batch, seq_len, embedding_size).

Now add a conv1d layer:

conv = tf.layers.conv1d(embedding_inputs, filters=256, kernel_size=5, name='conv1')
print(conv)

With 256 filters of kernel size 5, the shape of the output conv is:

Tensor("conv1/BiasAdd:0", shape=(?, 396, 256), dtype=float32)

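From this you can see that seq_len went from 400 to 396 (400 - 5 + 1, because the default padding is 'valid'), and the last dimension went from embedding_size 50 to 256, the number of filters.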

In other words, conv1d convolves along the time axis: seq_len is the dimension being convolved over, while embedding_size serves as the input channels.
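The shape arithmetic above can be reproduced without TensorFlow. The following is a minimal sketch, assuming stride 1, 'valid' padding, and a bias term (the tf.layers.conv1d defaults); `conv1d_shapes` is a hypothetical helper written for this illustration, not a TensorFlow function.

```python
def conv1d_shapes(input_shape, filters, kernel_size):
    """Shapes for a stride-1, unpadded ('valid') 1D convolution.

    input_shape is (batch, seq_len, in_channels); batch may be None.
    Hypothetical helper for illustration only.
    """
    batch, seq_len, in_channels = input_shape
    # The kernel slides along seq_len only, so that axis shrinks by kernel_size - 1.
    output_shape = (batch, seq_len - kernel_size + 1, filters)
    # Each filter spans kernel_size time steps and ALL input channels.
    kernel_shape = (kernel_size, in_channels, filters)
    # Weights plus one bias per filter.
    n_params = kernel_size * in_channels * filters + filters
    return output_shape, kernel_shape, n_params


output_shape, kernel_shape, n_params = conv1d_shapes(
    (None, 400, 50), filters=256, kernel_size=5
)
print(output_shape)  # (None, 396, 256)
print(kernel_shape)  # (5, 50, 256)
print(n_params)      # 64256
```

The kernel shape makes the point of this comment explicit: every filter covers the full embedding dimension at once, which is why the embedding axis disappears into the channel sum and only seq_len is convolved.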

Now, how would you map this back to conv2d? Please consult the official API docs to find out. That is part of learning and improving yourself.

wuxiyu commented 6 years ago

First of all, thanks for the answer. The thing is, I only ran into this question after reading the API docs. For tf.layers.conv1d they say:

inputs: Tensor input.

So I turned to the similar function tf.nn.conv1d:

Internally, this op reshapes the input tensors and invokes tf.nn.conv2d

a tensor of shape [batch, in_width, in_channels] is reshaped to [batch, 1, in_width, in_channels]

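In other words, it does a single reshape. The input format of the latter (tf.nn.conv2d) is [batch, in_height, in_width, in_channels], which is where my original question came from: the embedding dimension is treated as the number of channels (or else it is a bug in how the author passed the arguments). What I'm wondering is whether there are ways of convolving over sequential text other than the one in Convolutional Neural Networks for Sentence Classification, for example ones that use the channel dimension. That paper treats seq_len as in_height and the embedding dimension as in_width, with the number of channels set to 1.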

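I'm not sure whether my understanding above is correct. Thanks again to the author for such a detailed answer. PS: blowing this up into a statement about how Chinese developers write programs seems a bit unfair to me.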

gaussic commented 6 years ago

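By your reading, if we follow the paper's description strictly, we cannot use conv1d, because tf.nn.conv1d converts its input from NWC to NHWC, whereas the correct approach would be to go from NHW to NHWC. That is, to follow the paper to the letter we would have to use tf.nn.conv2d, and conv1d here would be wrong. Seen from that angle, I don't dispute your point; in fact it deserves a lot of credit, because I also learned something from digging through the material.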

However, in README.md I wrote:

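This is a simplified TensorFlow implementation on a Chinese dataset. It uses character-level CNN and RNN models to classify Chinese text and achieves fairly good results.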

In other words, this repo is not a strict implementation of the paper. It differs in the following ways:

It works at the character level, not the word level used in the paper.
It uses conv1d, with embedding_dim as the channels.
The paper uses several kernel sizes; this project uses only one, 5.

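Although it doesn't use the paper's architecture, the experimental results show that conv1d still does a lot of the work. A similar example using conv1d is imdb_cnn.py from Keras.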

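Also, I think the official TensorFlow documentation is a bit off here: when it talks about 1D temporal convolution, the term "channels" should be replaced. Keras's description of Conv1D is probably more apt: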

1D convolution layer (e.g. temporal convolution).

Input shape

3D tensor with shape: (batch_size, steps, input_dim)

Output shape

3D tensor with shape: (batch_size, new_steps, filters) steps value might have changed due to padding or strides.
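How new_steps depends on padding and strides can be sketched as follows. This assumes a dilation rate of 1 and steps >= kernel_size; `conv1d_new_steps` is a hypothetical helper for illustration, not part of Keras.

```python
import math


def conv1d_new_steps(steps, kernel_size, strides=1, padding="valid"):
    """Length of the time axis after a 1D convolution (dilation rate 1 assumed).

    Hypothetical helper for illustration only.
    """
    if padding == "valid":
        # No padding: only windows that fit entirely inside the input are used.
        return (steps - kernel_size) // strides + 1
    if padding in ("same", "causal"):
        # The input is padded so that, at stride 1, the length is unchanged.
        return math.ceil(steps / strides)
    raise ValueError("unsupported padding: %r" % (padding,))


print(conv1d_new_steps(400, 5))                             # 396
print(conv1d_new_steps(400, 5, padding="same"))             # 400
print(conv1d_new_steps(400, 5, strides=2))                  # 198
print(conv1d_new_steps(400, 5, strides=2, padding="same"))  # 200
```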

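PS: If you feel I was making too much of it, I won't deny that entirely. But the phenomenon is widespread; you can find a few examples just by opening the Closed issues. For any project, what I'd rather hear is constructive suggestions, not an endless stream of requests for extra features.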

wuxiyu commented 6 years ago

Thanks for the answer (and for the Keras example). I came across a similar discussion, Implementation Comparison on conv1d and conv2d for SM and MP-CNN, where the author wrote:

Great observation. I have one question, in NLP can each dimension in the sentence embedding be referred to as a channel? If so does this imply we can use Conv1d? However, I'm aware many tutorials by reputable people use Conv2d for text CNNs.

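Yes, that is roughly what I meant: whether each embedding dimension can be treated as a channel, with the convolution running along the time axis. It looks like this should be fine in theory.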

The author also wrote:

As a simple example, this figure, the filter is in 2d but in the code we have it in 1D.

(The figure link is dead, though.)

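PS: This issue wasn't asking for extra features, though; if anything, I see it as a discussion. One last PS: the paper itself also considers using just one kernel size, in section 4.1, Multichannel vs. Single Channel Models.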

34127chi commented 6 years ago

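@wuxiyu conv1d first reshapes the input to [batch, 1, in_width, embedding_size] and then calls conv2d. That doesn't seem any different from applying conv2d directly to [batch, in_width, embedding_size, 1]; perhaps the only difference is which axis is called the channel, as discussed above. Shouldn't the filter parameters learned in the end be the same either way? I'm not sure whether I've got this right and would appreciate a reply.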

Mywayking commented 5 years ago

calling conv1d (from tensorflow.python.ops.nn_ops) with data_format=NHWC is deprecated and will be removed in a future version.
Instructions for updating:
`NHWC` for data_format is deprecated, use `NWC` instead

The message above appears when I run conv1d. How should I change the code to get rid of it?

a6840231 commented 5 years ago

calling conv1d (from tensorflow.python.ops.nn_ops) with data_format=NHWC is deprecated and will be removed in a future version.
Instructions for updating:
`NHWC` for data_format is deprecated, use `NWC` instead

The message above appears when I run conv1d. How should I change the code to get rid of it?

NWC means it wants you to switch to the [batch, width, channel] shape format.

andiac commented 3 years ago

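@wuxiyu conv1d first reshapes the input to [batch, 1, in_width, embedding_size] and then calls conv2d. That doesn't seem any different from applying conv2d directly to [batch, in_width, embedding_size, 1]; perhaps the only difference is which axis is called the channel, as discussed above. Shouldn't the filter parameters learned in the end be the same either way? I'm not sure whether I've got this right and would appreciate a reply.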

You're right.
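The equivalence can be checked numerically on a tiny example. The sketch below is pure Python rather than TensorFlow, and `conv1d_valid` and `conv2d_valid` are hypothetical reference implementations written for this illustration (stride 1, no padding, no bias). It assumes that in the paper-style layout the 2D kernel spans the full embedding width; with a narrower kernel the two layouts would no longer compute the same thing.

```python
import random


def conv1d_valid(x, w):
    """x: [width][in_channels], w: [k][in_channels][filters] -> [width-k+1][filters]."""
    k, in_ch, n_filters = len(w), len(w[0]), len(w[0][0])
    out = []
    for t in range(len(x) - k + 1):
        out.append([
            sum(x[t + i][c] * w[i][c][f] for i in range(k) for c in range(in_ch))
            for f in range(n_filters)
        ])
    return out


def conv2d_valid(x, w):
    """x: [H][W][C], w: [kh][kw][C][filters] -> [H-kh+1][W-kw+1][filters]."""
    kh, kw, in_ch, n_filters = len(w), len(w[0]), len(w[0][0]), len(w[0][0][0])
    out = []
    for r in range(len(x) - kh + 1):
        row = []
        for q in range(len(x[0]) - kw + 1):
            row.append([
                sum(x[r + i][q + j][c] * w[i][j][c][f]
                    for i in range(kh) for j in range(kw) for c in range(in_ch))
                for f in range(n_filters)
            ])
        out.append(row)
    return out


random.seed(0)
seq_len, emb, k, n_filters = 7, 4, 3, 2  # tiny stand-ins for 400, 50, 5, 256
x = [[random.uniform(-1, 1) for _ in range(emb)] for _ in range(seq_len)]
w = [[[random.uniform(-1, 1) for _ in range(n_filters)]
      for _ in range(emb)] for _ in range(k)]

out_1d = conv1d_valid(x, w)

# Layout A (the tf.nn.conv1d reshape): input [1, seq_len, emb], kernel [1, k, emb, filters].
out_a = conv2d_valid([x], [w])[0]

# Layout B (the paper's view): input [seq_len, emb, 1], kernel [k, emb, 1, filters].
x_b = [[[v] for v in row] for row in x]
w_b = [[[w[i][c]] for c in range(emb)] for i in range(k)]
out_b = [row[0] for row in conv2d_valid(x_b, w_b)]

max_diff = max(
    max(abs(a - b), abs(a - c))
    for r1, ra, rb in zip(out_1d, out_a, out_b)
    for a, b, c in zip(r1, ra, rb)
)
print(len(out_1d), len(out_1d[0]), max_diff)
```

Both layouts use the same k * emb * filters weights and produce the same outputs; they differ only in whether the embedding axis is labelled "channels" or "width".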

gaussic / text-classification-cnn-rnn

about conv1d #18
