tensorflow / skflow

Simplified interface for TensorFlow (mimicking Scikit Learn) for Deep Learning
Apache License 2.0

Hands-on assistance with embedding and logistic regression over aggregated data #136

Closed borisRa closed 7 years ago

borisRa commented 8 years ago

Hi,

I need assistance with three issues:

  1. How do I apply the embedding only to the categorical features (I also have continuous ones)?
  2. How do I address the following issue with Skflow: [http://stackoverflow.com/questions/33871615/train-a-model-with-probability-response-or-number-of-successes-failures-rather]
  3. How do I add the probability estimate for a success to the logistic output?

Thanks, Boris

ilblackdragon commented 8 years ago
  1. Currently it's not very convenient; I'm working on an API to make it better. For now, you need to pass everything as one continuous matrix and then split it, e.g. (with n_cat, the number of categorical columns, defined outside the function):
import tensorflow as tf

def my_model(X, y):
    # X is [batch_size, n_features]: n_cat categorical columns, then n_cont continuous ones.
    # In tf.slice, a size of -1 means "everything remaining in that dimension".
    Xcat = tf.cast(tf.slice(X, [0, 0], [-1, n_cat]), tf.int64)
    Xcont = tf.slice(X, [0, n_cat], [-1, -1])

This way Xcat can be passed into categorical_variable and then combined with the continuous features.

Stay tuned for a better way to do it!

  2. @terrytangyuan responded on Stack Overflow.
  3. Do you mean how to get probabilities out of the estimator for the logistic output? You can call estimator.predict_proba, which returns per-class probabilities instead of the predicted class.

Let me know if this answers your questions!
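
To make the first point concrete, here is a minimal, framework-free sketch of the split, embed, and recombine idea in plain Python. It is not skflow code: the helper `split_embed_concat`, the toy rows, and the `embedding_tables` values are all hypothetical and chosen only for illustration. In TensorFlow the same three steps would presumably map onto slicing, an embedding lookup, and a concatenation along the feature axis, with exact signatures depending on the TensorFlow version.

```python
def split_embed_concat(rows, n_cat, embedding_tables):
    """Split each row into categorical ids and continuous values,
    replace every id with its embedding vector, then column-bind."""
    combined = []
    for row in rows:
        cat_ids = [int(v) for v in row[:n_cat]]   # first n_cat columns hold category ids
        cont = [float(v) for v in row[n_cat:]]    # remaining columns are continuous
        embedded = []
        for col, idx in enumerate(cat_ids):
            # Embedding lookup: pick the vector for this id in this column's table.
            embedded.extend(embedding_tables[col][idx])
        combined.append(embedded + cont)          # "column bind" for one row
    return combined


# Hypothetical toy data: 2 categorical columns followed by 1 continuous column.
rows = [[0.0, 1.0, 0.5],
        [2.0, 0.0, 1.5]]

# One table per categorical column; each maps a category id to a 2-d vector.
# In a real model these vectors would be trainable parameters.
embedding_tables = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],   # column 0 has 3 categories
    [[1.0, 1.1], [1.2, 1.3]],               # column 1 has 2 categories
]

X_combined = split_embed_concat(rows, n_cat=2, embedding_tables=embedding_tables)
```

Each output row has 2 + 2 embedding values followed by the single continuous value, so a downstream dense layer would see 5 input features instead of the original 3.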

borisRa commented 8 years ago

Thanks for the quick response!

  1. About the first one: how do I combine (column-bind, for tf objects) Xcat and Xcont back into X, so that I can apply the deep learning models to X?
  2. I meant: how do I feed aggregated data into the logistic regression? Instead of '1' for success and '0' for failure, the Y attribute would hold the total number of successes and failures per aggregation level. For now there is no such support in scikit-learn (https://github.com/scikit-learn/scikit-learn/issues/6496#issuecomment-193362401).

Is there a solution for this problem in Skflow?

Thanks again! Boris
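
For the aggregated-data question, one generic workaround (not something the thread confirms skflow supports) is to expand each aggregated row into two weighted rows: one with label 1 weighted by the number of successes, and one with label 0 weighted by the number of failures. The weighted log-loss then equals the loss over the fully expanded per-trial data. The sketch below checks this in plain Python; the function names, toy counts, and parameter values are hypothetical, and it assumes the training routine in use accepts per-sample weights.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def expand_aggregated(X, successes, failures):
    """Turn each aggregated row into two weighted rows:
    (x, label=1, weight=successes) and (x, label=0, weight=failures)."""
    rows, labels, weights = [], [], []
    for x, s, f in zip(X, successes, failures):
        rows.extend([x, x])
        labels.extend([1, 0])
        weights.extend([s, f])
    return rows, labels, weights


def weighted_log_loss(w, b, rows, labels, weights):
    """Weighted negative log-likelihood of a logistic model with weights w and bias b."""
    total = 0.0
    for x, y, wt in zip(rows, labels, weights):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        total -= wt * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total


# Hypothetical aggregated data: one feature, counts per aggregation level.
X = [[0.0], [1.0], [2.0]]
successes = [1, 3, 8]
failures = [9, 7, 2]

rows, labels, weights = expand_aggregated(X, successes, failures)
# Loss at an arbitrary, made-up parameter setting (w=[0.5], b=-1.0).
loss = weighted_log_loss([0.5], -1.0, rows, labels, weights)
```

The appeal of this design is that the data grows to only two rows per aggregation level, regardless of how many trials each level summarizes.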

ilblackdragon commented 7 years ago

FeatureColumns are the way to do this now. Please use a recent version of TensorFlow for this. Thanks!