tensorflow / skflow

Simplified interface for TensorFlow (mimicking Scikit Learn) for Deep Learning
Apache License 2.0
3.18k stars 439 forks

How could I adapt the CNN model for 1 - dimension input datasets? #51

Closed vinhqdang closed 8 years ago

vinhqdang commented 8 years ago

Hi,

I have a dataset with 24 inputs and 1 categorical output, so I am trying to adapt the example https://github.com/google/skflow/blob/master/examples/text_classification_character_cnn.py to my case.

However, the example contains

byte_list = tf.reshape(skflow.ops.one_hot_matrix(X, 256), 
        [-1, MAX_DOCUMENT_LENGTH, 256, 1])

and I do not know how to adapt this to my data. Could you please help?

My data looks like:

input1  input2  ...  input_n  output
2       1.2     ...  -0.44    "b"
1       0.2     ...  3.2      "f"
3       1       ...  2.1      "a"

ilblackdragon commented 8 years ago

The easiest way to proceed is to start with a simple TensorFlowLinearClassifier. Then you don't need to do anything with your data; just pass your features as they are.
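
As a rough sketch of what "pass your features as they are" could look like for a table like the one in the question: the numeric columns are kept untouched and only the label column is split off. The step that maps string labels such as "b" to integer class ids is an assumption on my part (classifiers of this kind usually expect integer targets), and the toy rows and the names `rows`, `class_to_id`, `X` and `y` are made up for illustration.

```python
# Hypothetical toy rows shaped like the table in the question:
# numeric inputs followed by a string label.
rows = [
    (2.0, 1.2, -0.44, "b"),
    (1.0, 0.2, 3.2, "f"),
    (3.0, 1.0, 2.1, "a"),
]

# Features stay as they are; only the label column is split off.
X = [list(r[:-1]) for r in rows]
labels = [r[-1] for r in rows]

# Assumption: the classifier wants integer class ids, so map each
# distinct label to an id (sorted to keep the mapping stable).
classes = sorted(set(labels))
class_to_id = {c: i for i, c in enumerate(classes)}
y = [class_to_id[c] for c in labels]

print(X)             # numeric features, unchanged
print(y)             # [1, 2, 0]
print(len(classes))  # number of classes to tell the classifier about
```

X and y would then be what goes into fit, with the number of distinct labels as the class count.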

If you really want to apply convolutional networks (though in this case it doesn't seem very meaningful), you should just skip these conversion steps and directly apply conv+pool with logistic regression on top:

import tensorflow as tf
import skflow

def my_conv_model(X, y):
    X = tf.reshape(X, [-1, N_FEATURES, 1, 1])  # form a 4-D tensor of shape batch_size x n_features x 1 x 1
    features = skflow.ops.conv2d(X, N_FILTERS, [WINDOW_SIZE, 1], padding='VALID')  # sliding-window convolution of size WINDOW_SIZE x 1
    pool = tf.squeeze(tf.reduce_max(features, 1), squeeze_dims=[1])  # max over window positions, then drop the leftover size-1 axis
    return skflow.models.logistic_regression(pool, y)

But I think you really should just concentrate on TensorFlowLinearClassifier or TensorFlowDNNClassifier if you only have a bunch of input variables that don't have any inherent structure. Convolutions and RNNs are for extracting signal from structured data (like images or text).
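
To make the conv+pool step less abstract, here is a pure-Python sketch of what it does to a single row of 24 features. It does not use TensorFlow or skflow; `conv1d_valid`, `max_pool`, the toy row and the two fixed filters are all hypothetical stand-ins (a real model learns its filter weights). With VALID padding, each filter yields N_FEATURES - WINDOW_SIZE + 1 responses, and the max over positions leaves one number per filter for the logistic regression.

```python
# Pure-Python sketch of conv + max-pool on one row of features.
# All names and numbers here are illustrative, not skflow code.

def conv1d_valid(row, kernel):
    """Slide `kernel` over `row` with VALID padding (no zero padding)."""
    window = len(kernel)
    return [
        sum(row[i + j] * kernel[j] for j in range(window))
        for i in range(len(row) - window + 1)
    ]

def max_pool(activations):
    """Keep only the strongest response of one filter over all positions."""
    return max(activations)

N_FEATURES = 24
WINDOW_SIZE = 3
row = [float(i % 5) for i in range(N_FEATURES)]  # toy input row
filters = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]    # N_FILTERS = 2, toy weights

conv_out = [conv1d_valid(row, k) for k in filters]
pooled = [max_pool(a) for a in conv_out]

print(len(conv_out[0]))  # 22 positions per filter (24 - 3 + 1)
print(pooled)            # one value per filter: [3.0, 4.5]
```

Note that the sliding window treats neighbouring columns as related, so the result depends on the order of the input columns. That is why this only makes sense when the features have some inherent ordering.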

vinhqdang commented 8 years ago

Thanks a lot,

With LinearClassifier, I achieved an accuracy of 35% (without any tuning, just following the example on the homepage).

With DNNClassifier I achieved an accuracy of 40%, and with the CNN you suggested I reached 50%. So I think the CNN should be good.

However, I hope to improve the accuracy a little more. For the CNN, which parameters should I tune to see a difference?

Thanks a lot,

ilblackdragon commented 8 years ago

I should write more about tuning the models, but here are a few tips: