Yes, it is possible. transform_fn would be an appropriate place to convert your features to latent vectors. Whether you jointly train such representations with the ranker depends on your problem.
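For readers following along, here is a rough sketch of where a transform_fn and a group_score_fn plug into the TF-Ranking Estimator pipeline. The feature name item_id, the dimensions, and the loss choice below are illustrative assumptions, not something prescribed in this thread:

    import tensorflow as tf
    import tensorflow_ranking as tfr

    def example_feature_columns():
      # Each document is identified by an integer id that is mapped to a learned
      # embedding (the latent factors).
      item_id = tf.feature_column.categorical_column_with_identity(
          key="item_id", num_buckets=10000, default_value=0)
      return {"item_id": tf.feature_column.embedding_column(item_id, dimension=10)}

    def _transform_fn(features, mode):
      # Convert raw per-example features into dense tensors once, before they are
      # sliced into groups for scoring. No context (query) features in this setting.
      return tfr.feature.encode_listwise_features(
          features=features,
          input_size=tf.shape(features["item_id"])[1],
          context_feature_columns={},
          example_feature_columns=example_feature_columns(),
          mode=mode)

    def _score_fn(context_features, group_features, mode, params, config):
      # Score each document from its dense (embedded) representation.
      cur = tf.layers.flatten(group_features["item_id"])
      cur = tf.layers.dense(cur, units=16, activation=tf.nn.relu)
      return tf.layers.dense(cur, units=1)

    model_fn = tfr.model.make_groupwise_ranking_fn(
        group_score_fn=_score_fn,
        group_size=1,
        transform_fn=_transform_fn,
        ranking_head=tfr.head.create_ranking_head(
            loss_fn=tfr.losses.make_loss_fn("pairwise_logistic_loss"),
            optimizer=tf.train.AdagradOptimizer(learning_rate=0.1)))
    ranker = tf.estimator.Estimator(model_fn=model_fn)

Even with group_size=1, the pairwise loss is still computed across documents in the same list, and the embeddings are trained jointly with the scoring layers.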
Yes, I was thinking about jointly training a document embedding. I have pairwise labels (A > B, etc.). For each labeled pair (A, B), I'll look up their embeddings (A_emb, B_emb) and use those as the document features. This would replace classical LTR query-document features (I don't have any queries in my context anyway). Not sure what you mean w/ the transform_fn, but I'll research the code a bit.
Here's my example (modified from the TF-Ranking example) of using an embedding to learn a latent factor model:

    import tensorflow as tf
    from tensorflow.feature_column import categorical_column_with_identity, embedding_column

    _BUCKETS = 10000  # number of distinct item ids
    _K = 10           # embedding (latent factor) dimension
    _HIDDEN_LAYER_DIMS = ["256", "128", "64"]  # hidden layer sizes (illustrative values; a flag in the original example)

    def make_score_fn(buckets):
      """Returns a scoring function to build `EstimatorSpec`."""

      def _score_fn(context_features, group_features, mode, params, config):
        """Defines the network to score a group of documents."""
        del params
        del config
        # Each document is represented only by its item_id, which is looked up in
        # a learned _K-dimensional embedding table (the latent factors).
        item_id = categorical_column_with_identity(
            key='item_id', num_buckets=_BUCKETS, default_value=0)
        item_emb = embedding_column(item_id, _K)
        input_layer = tf.feature_column.input_layer(group_features, item_emb)

        cur_layer = input_layer
        for layer_width in (int(d) for d in _HIDDEN_LAYER_DIMS):
          cur_layer = tf.layers.dense(cur_layer, units=layer_width, activation=tf.nn.relu)
        logits = tf.layers.dense(cur_layer, units=1)  # one score per document (regression)
        return logits

      return _score_fn
I modified my input function to only return an item_id, which is an integer that maps to an embedding. But it could be concatenated with any other arbitrary features into a fixed-length vector for the FC layers.
This gets good results on my ranking task for MRR:
baseline: 0.54, LF model: 0.76
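A sketch of the kind of input_fn change described above (this is not the poster's actual code; the names, list size, and random data are assumptions purely for illustration):

    import numpy as np
    import tensorflow as tf

    _LIST_SIZE = 10  # documents per list (assumed)

    def input_fn_sketch(batch_size=32, num_lists=256):
      # Each document is represented only by an integer item_id.
      item_ids = np.random.randint(0, 10000, size=(num_lists, _LIST_SIZE)).astype(np.int64)
      # Per-document relevance labels; TF-Ranking uses -1. for padded positions.
      labels = np.random.randint(0, 2, size=(num_lists, _LIST_SIZE)).astype(np.float32)
      dataset = tf.data.Dataset.from_tensor_slices(({"item_id": item_ids}, labels))
      return dataset.shuffle(num_lists).repeat().batch(batch_size)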
Thanks for sharing this example, Alex. This looks great. If you wish, you could define the feature columns outside, so that you can also use them to make the parsing_spec to read in tf.Examples or tf.SequenceExamples.
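As an illustration of that suggestion, defining the columns at module level and deriving the parsing spec from them could look roughly like this (feature names and sizes are assumptions):

    import tensorflow as tf

    def example_feature_columns():
      # Shared definition: used both to build the input layer and to derive the
      # spec for parsing serialized tf.Example / tf.SequenceExample protos.
      item_id = tf.feature_column.categorical_column_with_identity(
          key="item_id", num_buckets=10000, default_value=0)
      return {"item_id": tf.feature_column.embedding_column(item_id, dimension=10)}

    example_parsing_spec = tf.feature_column.make_parse_example_spec(
        example_feature_columns().values())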
When you say "define your feature columns outside", do you mean like in the notebook example, where there is an example_feature_columns function which is called from the _score_fn function?
Also, I don't understand what the transform_fn function is for. Can you provide an example?
Hello, I would just like to chime in that having an example of using feature columns where group_size and feature dimension are not 1 would be helpful. I can use a groupwise feature tensor directly with dimension [?, group_size (10), feature_dimension (180)], but when I put it in a numeric feature column with shape [group_size, feature_dimension] I get the following error on this line in _groupwise_dnn_v2:

    scores = _score_fn(
        large_batch_context_features, large_batch_group_features, reuse=False)

The error is:

    ValueError: Dimensions must be equal, but are 1800 and 180 for 'groupwise_dnn_v2/group_score/rescale/mul' (op: 'Mul') with input shapes: [?,1800], [180].

I think it has something to do with the shape passed to the feature column, but I'm unsure what the issue is here.
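Note that 1800 is exactly group_size (10) x feature_dimension (180), which suggests the list/group dimension is being folded into the column shape. One thing worth trying (a guess, not a confirmed fix) is to declare the column with the per-document shape only, since TF-Ranking adds the list and group dimensions itself:

    import tensorflow as tf

    # Shape describes a single document's feature vector; group_size should not
    # appear here.
    doc_features = tf.feature_column.numeric_column(
        key="doc_features", shape=(180,), dtype=tf.float32)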
FWIW I ran into the same issue as @darlliu
Please check out the demo on using sparse features and embeddings in TF-Ranking. You can click on the colab link to start executing the content of the notebook.
Tensorboard is integrated into the notebook, and you can use it to visualize the eval and loss curves.
Feel free to post your feedback by responding on this issue.
Thanks for posting some concrete examples in this new notebook. Some questions:

I see a new data format EIE whereas the previous examples seemed to use SequenceExample. What are some of the motivations for moving to the new format?

Also, what is the intuition behind this?: _NUM_TRAIN_STEPS = 15 * 1000

"The transform function takes in the raw dense or sparse features from the input reader, applies suitable transformations to return dense representations for each feature." I never understood what the transform function was for until reading this line. I always just made my features in the scoring function.

I noticed you moved away from tf.contrib.layers.optimize_loss. It was a nice abstraction, however it was probably replaced b/c contrib is being deprecated in TF 2.0?

And also, I think it would be nice to have the hyper params passed to the transform_fn just like the group_score_fn, so you can do something like this:
    import six
    import tensorflow as tf
    import tensorflow_ranking as tfr
    from tensorflow.feature_column import categorical_column_with_identity, embedding_column

    def example_feature_columns(params):
      # item_buckets (number of distinct item ids) is defined elsewhere.
      rest_id = categorical_column_with_identity(key='rid', num_buckets=item_buckets)
      rest_emb = embedding_column(rest_id, params.K)
      return {"rid": rest_emb}

    def make_transform_fn():
      def _transform_fn(features, mode, params):
        """Defines transform_fn."""
        example_name = next(six.iterkeys(example_feature_columns(params)))
        input_size = tf.shape(input=features[example_name])[1]
        # context_feature_columns() comes from the notebook example.
        context_features, example_features = tfr.feature.encode_listwise_features(
            features=features,
            input_size=input_size,
            context_feature_columns=context_feature_columns(),
            example_feature_columns=example_feature_columns(params),
            mode=mode,
            scope="transform_layer")
        return context_features, example_features
      return _transform_fn
See how I added params to the signature of _transform_fn, so now it can accept hyper params like the group_score_fn? The use case for this is the common one where the embedding dimensions are a hyper parameter.
Would you accept a PR from me for this?
Hi Alex, great set of questions. Please find my replies inline.
Thanks for posting some concrete examples in this new notebook. Some questions:
I see a new data format EIE whereas the previous examples seemed to use SequenceExample. What are some of the motivations for moving to the new format?

The primary motivation for EIE is that it keeps the per-query and per-document features (which we call context and example features) in self-contained tf.Examples. SequenceExample represents the data in a feature-major format, while EIE represents it in a document-major format. Does this make sense?

Also, what is the intuition behind this?: _NUM_TRAIN_STEPS = 15 * 1000

The number of training steps is kept low for the sake of demonstration. This should show the curves in Tensorboard in around 15 minutes. You can set this to any number you feel is appropriate.

"The transform function takes in the raw dense or sparse features from the input reader, applies suitable transformations to return dense representations for each feature." I never understood what the transform function was for until reading this line. I always just made my features in the scoring function.

The transform function applies the transformation for all the documents only once, while the logic in the group score function is applied over each group. Think of the transform function as "pre-processing" before sending features to the scoring function.

I noticed you moved away from tf.contrib.layers.optimize_loss. It was a nice abstraction, however it was probably replaced b/c contrib is being deprecated in TF 2.0?

Yes, that is exactly the reason. Our repository is TF 2.0 alpha compatible.
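To make the document-major layout concrete, one way to picture an EIE record is an outer tf.Example whose bytes features hold a serialized context tf.Example and a list of serialized per-document tf.Examples. A rough sketch (the exact key and feature names expected by the tfr.data parsers may differ; treat them as assumptions):

    import tensorflow as tf

    def _bytes_feature(values):
      return tf.train.Feature(bytes_list=tf.train.BytesList(value=values))

    # Per-query (context) features in their own tf.Example.
    context = tf.train.Example(features=tf.train.Features(feature={
        "query_tokens": _bytes_feature([b"relevance", b"ranking"]),
    }))

    # One tf.Example per document in the list.
    documents = [
        tf.train.Example(features=tf.train.Features(feature={
            "item_id": tf.train.Feature(int64_list=tf.train.Int64List(value=[42])),
        })),
        tf.train.Example(features=tf.train.Features(feature={
            "item_id": tf.train.Feature(int64_list=tf.train.Int64List(value=[7])),
        })),
    ]

    # Document-major: the whole list is self-contained in one outer tf.Example.
    eie = tf.train.Example(features=tf.train.Features(feature={
        "serialized_context": _bytes_feature([context.SerializeToString()]),
        "serialized_examples": _bytes_feature(
            [doc.SerializeToString() for doc in documents]),
    }))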
I like this PR suggestion (passing params to the transform_fn). Please go ahead with it. One thing to keep in mind is that you will need to change the model_fn builder, which currently expects only a (features, mode) signature. See this line for more details.
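In the meantime, one way to get hyper parameters into the transform without changing that contract is to bind them in a closure, reusing example_feature_columns and context_feature_columns from the snippet above (a sketch, assuming an hparams object with a K attribute):

    import six
    import tensorflow as tf
    import tensorflow_ranking as tfr

    def make_transform_fn(hparams):
      """Returns a (features, mode) transform_fn with hparams bound via closure."""
      def _transform_fn(features, mode):
        # hparams (e.g. the embedding dimension hparams.K) is available here
        # without changing the (features, mode) signature the model_fn builder expects.
        example_name = next(six.iterkeys(example_feature_columns(hparams)))
        input_size = tf.shape(input=features[example_name])[1]
        return tfr.feature.encode_listwise_features(
            features=features,
            input_size=input_size,
            context_feature_columns=context_feature_columns(),
            example_feature_columns=example_feature_columns(hparams),
            mode=mode,
            scope="transform_layer")
      return _transform_fn

    # Usage: pass transform_fn=make_transform_fn(hparams) when building the model_fn.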
Would it be possible to learn a ranker from pairwise data where the features are latent factors (w/o any hand-made features)? Like a matrix factorization model?
So the input to the pairwise loss is the respective embeddings for the two documents you are ranking...