NTMC-Community / MatchZoo

Facilitating the design, comparison and sharing of deep text matching models.
Apache License 2.0

Is there a tutorial on the DIIN model? #822

Open ouyaya opened 4 years ago

ouyaya commented 4 years ago

When I called the model with the hyperparameters given for it, an error occurred: "ValueError: Layer weight shape (10000, 300) not compatible with provided weight shape (33905, 300)". So I changed the hyperparameter 'embedding_input_dim' to 33905, but then another error appeared: "tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[1200,0] = 184 is not in [0, 100) [[{{node time_distributed_1_1/embedding_1/embedding_lookup}}]]".

uduse commented 4 years ago

#821 Does this help?

ouyaya commented 4 years ago

I referenced those hyperparameters, but found that the following one raises an error: model.params['embedding_input_dim'] = 10000. The error message indicates that this hyperparameter should be set to 33905, but after I changed it to 33905 another error occurred, so I wonder whether the code in some other part of the call is wrong. Do you have tutorial code that calls this model correctly?

uduse commented 4 years ago

@ouyaya Did you use the correct preprocessor?

ouyaya commented 4 years ago

Yes, I used the processing method in diin_preprocessor.py

uduse commented 4 years ago

I need more information, can you provide a Minimal, Reproducible Example?

ouyaya commented 4 years ago

# -*- coding: UTF-8 -*-
import keras
import pandas as pd
import numpy as np
import matchzoo as mz
import json
print('matchzoo version', mz.__version__)
print()

print('data loading ...')
train_pack_raw = mz.datasets.wiki_qa.load_data('train', task='ranking')
dev_pack_raw = mz.datasets.wiki_qa.load_data('dev', task='ranking', filtered=True)
test_pack_raw = mz.datasets.wiki_qa.load_data('test', task='ranking', filtered=True)
print('data loaded as `train_pack_raw` `dev_pack_raw` `test_pack_raw`')

ranking_task = mz.tasks.Ranking(loss=mz.losses.RankHingeLoss())
ranking_task.metrics = [
    mz.metrics.NormalizedDiscountedCumulativeGain(k=3),
    mz.metrics.NormalizedDiscountedCumulativeGain(k=5),
    mz.metrics.MeanAveragePrecision()
]
print("`ranking_task` initialized with metrics", ranking_task.metrics)

print("loading embedding ...")
glove_embedding = mz.datasets.embeddings.load_glove_embedding(dimension=300)
print("embedding loaded as `glove_embedding`")

diin_preprocessor = mz.preprocessors.DIINPreprocessor(fixed_length_left=32, fixed_length_right=32, fixed_length_word=16)
diin_preprocessor = diin_preprocessor.fit(train_pack_raw, verbose=0)
train_pack_processed = diin_preprocessor.transform(train_pack_raw, verbose=0)
dev_pack_processed = diin_preprocessor.transform(dev_pack_raw, verbose=0)
test_pack_processed = diin_preprocessor.transform(test_pack_raw, verbose=0)

model = mz.contrib.models.DIIN()
model.guess_and_fill_missing_params()
model.params['embedding_input_dim'] = 10000
model.params['embedding_output_dim'] = 300
model.params['embedding_trainable'] = True
model.params['optimizer'] = 'adam'
model.params['dropout_initial_keep_rate'] = 1.0
model.params['dropout_decay_interval'] = 10000
model.params['dropout_decay_rate'] = 0.977
model.params['char_embedding_input_dim'] = 100
model.params['char_embedding_output_dim'] = 8
model.params['char_conv_filters'] = 100
model.params['char_conv_kernel_size'] = 5
model.params['first_scale_down_ratio'] = 0.3
model.params['nb_dense_blocks'] = 3
model.params['layers_per_dense_block'] = 8
model.params['growth_rate'] = 20
model.params['transition_scale_down_ratio'] = 0.5
model.build()
model.compile()
model.backend.summary()

embedding_matrix = glove_embedding.build_matrix(diin_preprocessor.context['vocab_unit'].state['term_index'])
model.load_embedding_matrix(embedding_matrix)

pred_x, pred_y = test_pack_processed[:].unpack()
evaluate = mz.callbacks.EvaluateAllMetrics(model, x=pred_x, y=pred_y, batch_size=len(pred_y))

train_generator = mz.DataGenerator(
    train_pack_processed,
    mode='pair',
    num_dup=2,
    num_neg=1,
    batch_size=20
)
print('num batches:', len(train_generator))

history = model.fit_generator(train_generator, epochs=30, callbacks=[evaluate], workers=0, use_multiprocessing=True)

matchzoo_diin.txt

When I run this calling code, I get the following error:

Traceback (most recent call last):
  File "E:/Paper/keras_wiki/matchzoo_diin.py", line 60, in <module>
    model.load_embedding_matrix(embedding_matrix)
  File "F:\Python\lib\site-packages\matchzoo\engine\base_model.py", line 469, in load_embedding_matrix
    self.get_embedding_layer(name).set_weights([embedding_matrix])
  File "F:\Python\lib\site-packages\keras\engine\base_layer.py", line 1126, in set_weights
    'provided weight shape ' + str(w.shape))
ValueError: Layer weight shape (10000, 300) not compatible with provided weight shape (33905, 300)

uduse commented 4 years ago

In addition to the other change you made, also set the embedding input dimension for char embedding:

char_input_dim = len(diin_preprocessor.context['char_unit'].state['term_index'])
model.params['char_embedding_input_dim'] = char_input_dim
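
Putting the two changes together, a rough sketch (assuming, as the first error message suggests, that glove_embedding.build_matrix produces one row per entry in the word-level term_index):

# Derive both input dimensions from the fitted preprocessor instead of hard-coding them.
word_input_dim = len(diin_preprocessor.context['vocab_unit'].state['term_index'])  # 33905 in your run
char_input_dim = len(diin_preprocessor.context['char_unit'].state['term_index'])
model.params['embedding_input_dim'] = word_input_dim
model.params['char_embedding_input_dim'] = char_input_dim
model.build()
model.compile()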

ouyaya commented 4 years ago

Thank you for your answer; the problem is solved by following your suggestion. But when I train this model on the WikiQA dataset, the loss keeps getting higher and higher. I don't know whether it is a problem with the model code or with the way I set up the task. The task I'm using is 'ranking'.

uduse commented 4 years ago

The reason the model is in contrib is that it is not fully tested, so I am not sure why this is happening. @caiyinqiong is the author, maybe she has something to say?

ouyaya commented 4 years ago

Do I need to tune the hyperparameters that have no hyperspace in the given model for the new dataset?

uduse commented 4 years ago

If a parameter doesn't have a hyperspace, it won't be tuned and will just use its default value. It's always a good idea to tune your models on new datasets. You can do this either by adjusting things manually or by using the auto tuner.
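
For reference, a rough sketch of the auto-tuner route, assuming the mz.auto.Tuner interface from the MatchZoo 2.x tutorials (argument names may differ slightly in your version):

# Hedged sketch: the tuner samples only the parameters that already define a hyper_space.
# Reuses model.params, train_pack_processed, and dev_pack_processed from the script above.
tuner = mz.auto.Tuner(
    params=model.params,
    train_data=train_pack_processed,
    test_data=dev_pack_processed,
    num_runs=10  # number of trials to sample from the existing hyperspaces
)
results = tuner.tune()
print(results['best'])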

ouyaya commented 4 years ago

I mean, do I need to add hyperspaces to the hyperparameters that don't have one, and then use the auto-tuner to adjust all of the parameters? If I adjust all the parameters, will the result still correspond to the original model described in the paper? I want to use it in comparison experiments against my own paper's model.

uduse commented 4 years ago

@ouyaya If you just want a baseline, then don't add extra hyperspaces; just use the default ones. You don't even need to tune the model, you can run it as it is. Alternatively, you can tune the model and then call it a "fine-tuned" baseline.
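
If you ever do want an extra parameter to be tunable, here is a hedged sketch of attaching a hyperspace with mz.hyper_spaces, as used in the MatchZoo tutorials (the parameter and range below are purely illustrative, not values from the DIIN paper):

# Illustrative only: give 'growth_rate' a search space so the auto tuner can vary it.
model.params.get('growth_rate').hyper_space = mz.hyper_spaces.quniform(low=10, high=30, q=5)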