Closed czt616 closed 3 years ago
Let me elaborate a little bit:
In MatchZoo, everything is configurable, which means a model can be used for classification (binary) or ranking (relevance degree), depending on the task type.
To be more concrete, let's look at the DRMM model, line 88:
x_out = self._make_output_layer()(flatten_score)
The output comes from the function _make_output_layer, which is defined in engine/base_model, line 508:
def _make_output_layer(self) -> keras.layers.Layer:
    """:return: a correctly shaped keras dense layer for model output."""
    task = self._params['task']
    if isinstance(task, tasks.Classification):
        # Softmax kernel produce binary output.
        return keras.layers.Dense(task.num_classes, activation='softmax')
    elif isinstance(task, tasks.Ranking):
        # Linear kernel produce relevance degree.
        return keras.layers.Dense(1, activation='linear')
    else:
        raise ValueError(f"{task} is not a valid task type."
                         f"Must be in `Ranking` and `Classification`.")
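The dispatch in the function above can be tried out without Keras installed. The following is only a minimal sketch: the `Classification` and `Ranking` classes and the `output_layer_spec` helper are hypothetical stand-ins written for this example, not MatchZoo's own classes, and the helper returns a `(units, activation)` pair instead of a real Dense layer.

```python
from dataclasses import dataclass


@dataclass
class Classification:
    """Stand-in for a classification task (hypothetical, not MatchZoo's class)."""
    num_classes: int = 2


@dataclass
class Ranking:
    """Stand-in for a ranking task (hypothetical, not MatchZoo's class)."""


def output_layer_spec(task):
    """Return (units, activation), chosen by task type as in the dispatch above."""
    if isinstance(task, Classification):
        # One unit per class, normalised with softmax.
        return task.num_classes, "softmax"
    if isinstance(task, Ranking):
        # A single unbounded score used as the relevance degree.
        return 1, "linear"
    raise ValueError(f"{task} is not a valid task type.")


print(output_layer_spec(Classification(num_classes=2)))  # (2, 'softmax')
print(output_layer_spec(Ranking()))                      # (1, 'linear')
```

The same architecture therefore ends in a different output layer purely because of the task object it was given.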
If you define a ranking task, the model produces a relevance degree; with a classification task, it produces a binary output. So I guess what you want is:
ranking_task = mz.tasks.Ranking(loss=mz.losses.RankCrossEntropyLoss())
ranking_task.metrics = [
    mz.metrics.NormalizedDiscountedCumulativeGain(k=3),
    mz.metrics.MeanAveragePrecision()
]
# Initialize a DRMM model here
# ..
model.params['task'] = ranking_task
# ...
Back to your questions:
can I change the value to other value between 0 and 1 to represent different relevance of the document?
Yes, create a ranking task and use it as a parameter of the DRMM model.
I added some new documents with a label of 0.75 to wikiqa train set. However, DRMM model still could be trained. Why is this happening?
The output of the model depends on your task type, not on the model itself. The model is more about the architecture.
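One plausible reason training does not fail with a label of 0.75 is that cross-entropy-style losses are plain arithmetic over the labels, so a fractional label gives a finite loss value instead of an error. The sketch below is a generic softmax cross entropy in pure Python, not MatchZoo's RankCrossEntropyLoss implementation; the function names and the scores are made up for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cross_entropy(labels, scores):
    """Cross entropy between label weights and softmax(scores)."""
    probs = softmax(scores)
    return -sum(y * math.log(p) for y, p in zip(labels, probs))


# Hypothetical model outputs for one query with three candidate documents.
scores = [2.0, 0.5, -1.0]

hard = soft_cross_entropy([1.0, 0.0, 0.0], scores)   # usual 0/1 labels
soft = soft_cross_entropy([0.75, 0.0, 0.0], scores)  # fractional label

# Both values are finite; the fractional label only scales the loss term.
print(hard, soft)
```

Whether a given loss treats 0.75 as a meaningful graded relevance is a separate question from whether the computation runs, so it is worth checking the loss definition before relying on fractional labels.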
I am using MatchZoo 2.1 to train some models, and I have some questions. When I train the drmm model for document ranking, the label of documents in tutorial is either 0 or 1. I am wondering if I use the cross entropy loss, can I change the value to other value between 0 and 1 to represent different relevance of the document? I added some new documents with a label of 0.75 to wikiqa train set. However, DRMM model still could be trained. Why is this happening?