NTMC-Community / MatchZoo

Facilitating the design, comparison and sharing of deep text matching models.
Apache License 2.0

"Error: Index out of bounds for axis 0" while making predictions with DRMM #820

Open thiziri opened 4 years ago

thiziri commented 4 years ago

Describe the bug

I've trained and evaluated the DRMM model successfully. When I tried to make predictions on a new dataset, I ran this code:

test_generator = mz.DataGenerator(data_pack=valid_pack_pp[:10], mode='point', callbacks=[hist_callback])
test_x, test_y = test_generator[:]
prediction = drmm_model.predict(test_x)

and I got the following error message:


IndexError                                Traceback (most recent call last)

in
      1 test_generator = mz.DataGenerator(data_pack=valid_pack_pp, mode='point', callbacks=[hist_callback])
----> 2 test_x, test_y = test_generator[:]
      3 prediction = drmm_model.predict(test_x)

C:\ProgramData\Anaconda3\lib\site-packages\matchzoo\data_generator\data_generator.py in __getitem__(self, item)
    133         self._handle_callbacks_on_batch_data_pack(batch_data_pack)
    134         x, y = batch_data_pack.unpack()
--> 135         self._handle_callbacks_on_batch_unpacked(x, y)
    136         return x, y
    137

C:\ProgramData\Anaconda3\lib\site-packages\matchzoo\data_generator\data_generator.py in _handle_callbacks_on_batch_unpacked(self, x, y)
    197     def _handle_callbacks_on_batch_unpacked(self, x, y):
    198         for callback in self._callbacks:
--> 199             callback.on_batch_unpacked(x, y)
    200
    201     @property

C:\ProgramData\Anaconda3\lib\site-packages\matchzoo\data_generator\callbacks\histogram.py in on_batch_unpacked(self, x, y)
     32     def on_batch_unpacked(self, x, y):
     33         """Insert `match_histogram` to `x`."""
---> 34         x['match_histogram'] = _build_match_histogram(x, self._match_hist_unit)
     35
     36

C:\ProgramData\Anaconda3\lib\site-packages\matchzoo\data_generator\callbacks\histogram.py in _build_match_histogram(x, match_hist_unit)
     62                       x['length_right'].tolist())
     63     for pair in zip(text_left, text_right):
---> 64         match_hist.append(match_hist_unit.transform(list(pair)))
     65     return np.asarray(match_hist)

C:\ProgramData\Anaconda3\lib\site-packages\matchzoo\preprocessors\units\matching_histogram.py in transform(self, input_)
     47         matching_hist = np.ones((len(text_left), self._hist_bin_size),
     48                                 dtype=np.float32)
---> 49         embed_left = self._embedding_matrix[text_left]
     50         embed_right = self._embedding_matrix[text_right]
     51         matching_matrix = embed_left.dot(np.transpose(embed_right))

IndexError: index 740 is out of bounds for axis 0 with size 385

To Reproduce

Here is a data sample:

valid_pack_pp = preprocessor.transform(valid_pack)
valid_pack_pp.frame()

(image: screenshot of the valid_pack_pp.frame() output)

Describe your attempts

@yangliuy @pl8787 @wordreference @zenogantner @faneshion could someone help, please?

faneshion commented 4 years ago

Did you process the train/valid/test data using the same Preprocessor?
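
To illustrate why that question matters, here is a minimal NumPy-only sketch of the situation the traceback points to: a term id (740) that is larger than the number of rows in the embedding matrix (385). It does not use MatchZoo itself. The matrix shape, the sample ids, and the helper `ids_fit_matrix` are all hypothetical and only stand in for the embedding matrix given to the histogram callback and for the ids a preprocessor would produce. Whether a mismatched preprocessor is the actual cause in this issue is an assumption, not something the thread confirms.

```python
import numpy as np

# Hypothetical stand-in for the embedding matrix behind the histogram
# callback: 385 rows (one per vocabulary term), 50 dimensions (arbitrary).
embedding_matrix = np.zeros((385, 50), dtype=np.float32)

# Hypothetical term ids, as if produced with a larger vocabulary than the
# one the embedding matrix was built from.
text_left = [3, 17, 740]


def ids_fit_matrix(term_ids, matrix):
    """Return True if every term id is a valid row index of `matrix`."""
    ids = np.asarray(term_ids)
    if ids.size == 0:
        return True
    return bool(ids.min() >= 0 and ids.max() < matrix.shape[0])


# Same kind of lookup as `self._embedding_matrix[text_left]` in the traceback.
try:
    embedding_matrix[text_left]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)
print(ids_fit_matrix(text_left, embedding_matrix))       # id 740 does not fit
print(ids_fit_matrix([3, 17, 384], embedding_matrix))    # all ids fit
```

Running a check like this on the ids in `text_left` and `text_right` before building the histogram would show whether the data and the embedding matrix come from the same vocabulary.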

thiziri commented 4 years ago

Normally yes :/