nusnlp / mlconvgec2018

Code and model files for the paper: "A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction" (AAAI-18).
GNU General Public License v3.0

reranker error #25

Closed wulouzhu closed 5 years ago

wulouzhu commented 5 years ago

When I run ./run.sh conlltest2014/test2014.en conlltest2014_reranker/ 0,1 models/mlconv models/reranker_weights/mlconv_embed_4ens_eo_lm.weights.txt 'eolm', the following error occurs:

++ python2 /home/nlp/WJF/mlconvgec2018/software/nbest-reranker/rerank.py -i conlltest2014_reranker//output.tok.nbest.reformat.augmented.txt -w models/reranker_weights/mlconv_embed_4ens_eo_lm.weights.txt -o conlltest2014_reranker/ --clean-up
[INFO] [02-08-2019 10:43:35]  Arguments:
[INFO] [02-08-2019 10:43:35]    clean_up: True
[INFO] [02-08-2019 10:43:35]    command: /home/nlp/WJF/mlconvgec2018/software/nbest-reranker/rerank.py -i conlltest2014_reranker//output.tok.nbest.reformat.augmented.txt -w models/reranker_weights/mlconv_embed_4ens_eo_lm.weights.txt -o conlltest2014_reranker/ --clean-up
[INFO] [02-08-2019 10:43:35]    input_nbest: conlltest2014_reranker//output.tok.nbest.reformat.augmented.txt
[INFO] [02-08-2019 10:43:35]    out_dir: conlltest2014_reranker/
[INFO] [02-08-2019 10:43:35]    quiet: False
[INFO] [02-08-2019 10:43:35]    weights: models/reranker_weights/mlconv_embed_4ens_eo_lm.weights.txt
Traceback (most recent call last):
  File "/home/nlp/WJF/mlconvgec2018/software/nbest-reranker/rerank.py", line 72, in <module>
    output_1best.write(group[sorted_indices[0]].hyp + "\n")
IndexError: list index out of range

I am using fairseq 0.5 and torch 0.4. What is the problem? Thank you.

wulouzhu commented 5 years ago

A sample in output.tok.nbest.reformat.augmented.txt:

-1 ||| Keeping the Secret of Genetic Testing ||| F0= -0.14867456257343292 EditOps0= 0 1 5 ||| -0.14867456257343292
-1 ||| Keeping the Secret of the Genetic Testing ||| F0= -0.30441999435424805 EditOps0= 0 2 5 ||| -0.30441999435424805
-1 ||| Maintain the Secret of Genetic Testing ||| F0= -0.5522223711013794 EditOps0= 0 1 5 ||| -0.5522223711013794
-1 ||| To keep the Secret of the Genetic Testing ||| F0= -0.5662410855293274 EditOps0= 0 3 5 ||| -0.5662410855293274
-1 ||| To keep the Secret of Genetic Testing ||| F0= -0.5695184469223022 EditOps0= 0 2 5 ||| -0.5695184469223022
-1 ||| Keeping the Secret of Genetic Testing . ||| F0= -0.6078863143920898 EditOps0= 0 2 5 ||| -0.6078863143920898
-1 ||| Maintaining the Secret of Genetic Testing ||| F0= -0.6108198165893555 EditOps0= 0 1 5 ||| -0.6108198165893555
-1 ||| Keeping the Secret for the Genetic Testing ||| F0= -0.61934494972229 EditOps0= 0 2 5 ||| -0.61934494972229
-1 ||| Keeping the secret of the Genetic Testing ||| F0= -0.6822504997253418 EditOps0= 0 2 5 ||| -0.6822504997253418
-1 ||| To maintain the Secret of Genetic Testing ||| F0= -0.7254775762557983 EditOps0= 0 2 5 ||| -0.7254775762557983
-1 ||| Keeping the Secret of the Genetic Testing . ||| F0= -0.7767990827560425 EditOps0= 0 3 5 ||| -0.7767990827560425
-1 ||| The Secret of Genetic Testing ||| F0= -0.8103262782096863 EditOps0= 0 0 5 ||| -0.8103262782096863

shamilcm commented 5 years ago

Seems like the LM scores have not been added. Check if you have downloaded the required language model file and have nbest-reranker downloaded.
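
One quick way to confirm whether the LM scores were added is to look for the LM0= and WordPenalty0= labels in the feature field of each n-best line. The following is a minimal sketch, not part of the repository: the helper name missing_features is hypothetical, and it assumes the four-field "index ||| hypothesis ||| features ||| score" layout seen in the samples in this thread.

```python
def missing_features(line, required=("LM0=", "WordPenalty0=")):
    """Return the required feature labels that are absent from one n-best line.

    Assumes the layout: index ||| hypothesis ||| features ||| score
    """
    fields = [f.strip() for f in line.split("|||")]
    if len(fields) < 3:
        # Malformed line: no feature field at all, so everything is missing.
        return list(required)
    tokens = fields[2].split()
    return [name for name in required if name not in tokens]


# Two sample lines copied from this thread: one without and one with LM scores.
sample_without_lm = (
    "-1 ||| Keeping the Secret of Genetic Testing ||| "
    "F0= -0.14867456257343292 EditOps0= 0 1 5 ||| -0.14867456257343292"
)
sample_with_lm = (
    "-1 ||| Keeping the Secret of Genetic Testing ||| "
    "F0= -0.14867456257343292 EditOps0= 0 1 5   LM0= -40.0306  "
    "WordPenalty0= -6  ||| -0.14867456257343292"
)

print(missing_features(sample_without_lm))  # ['LM0=', 'WordPenalty0=']
print(missing_features(sample_with_lm))     # []
```

Running the helper over every line of output.tok.nbest.reformat.augmented.txt and printing the non-empty results would show which hypotheses, if any, lack the LM features.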

wulouzhu commented 5 years ago

@shamilcm I had already downloaded the language model "94Bcclm.trie" and nbest-reranker. When I run ./run.sh conlltest2014/test2014.en conlltest2014_reranker/ 0,1 models/mlconv models/reranker_weights/mlconv_embed_4ens_eo_lm.weights.txt 'eolm', the same error occurs. A sample from output.tok.nbest.reformat.augmented.txt:

-1 ||| Keeping the Secret of Genetic Testing ||| F0= -0.14867456257343292 EditOps0= 0 1 5   LM0= -40.0306  WordPenalty0= -6  ||| -0.14867456257343292
-1 ||| Keeping the Secret of the Genetic Testing ||| F0= -0.30441999435424805 EditOps0= 0 2 5   LM0= -46.3525  WordPenalty0= -7  ||| -0.30441999435424805
-1 ||| Maintain the Secret of Genetic Testing ||| F0= -0.5522223711013794 EditOps0= 0 1 5   LM0= -45.0082  WordPenalty0= -6  ||| -0.5522223711013794
-1 ||| To keep the Secret of the Genetic Testing ||| F0= -0.5662410855293274 EditOps0= 0 3 5   LM0= -50.3051  WordPenalty0= -8  ||| -0.5662410855293274
-1 ||| To keep the Secret of Genetic Testing ||| F0= -0.5695184469223022 EditOps0= 0 2 5   LM0= -43.9831  WordPenalty0= -7  ||| -0.5695184469223022
-1 ||| Keeping the Secret of Genetic Testing . ||| F0= -0.6078863143920898 EditOps0= 0 2 5   LM0= -41.7383  WordPenalty0= -7  ||| -0.6078863143920898
-1 ||| Maintaining the Secret of Genetic Testing ||| F0= -0.6108198165893555 EditOps0= 0 1 5   LM0= -44.7836  WordPenalty0= -6  ||| -0.6108198165893555
-1 ||| Keeping the Secret for the Genetic Testing ||| F0= -0.61934494972229 EditOps0= 0 2 5   LM0= -47.7624  WordPenalty0= -7  ||| -0.61934494972229
-1 ||| Keeping the secret of the Genetic Testing ||| F0= -0.6822504997253418 EditOps0= 0 2 5   LM0= -41.3848  WordPenalty0= -7  ||| -0.6822504997253418
-1 ||| To maintain the Secret of Genetic Testing ||| F0= -0.7254775762557983 EditOps0= 0 2 5   LM0= -45.4764  WordPenalty0= -7  ||| -0.7254775762557983

wulouzhu commented 5 years ago

In rerank.py, I printed the size of each group, and it is 0.

counter = 0
for group in input_aug_nbest:
    print('length of group:',group.size())
    index = 0
    scores = dict()
    for item in group:
        features = np.asarray([x for x in item.features.split() if is_number(x)], dtype=float)

shamilcm commented 5 years ago

If you are using Fairseq 0.5, are you on the 'fairseq0.5' branch? The nbest-reformat script needs to be modified as well. See this commit: https://github.com/nusnlp/mlconvgec2018/commit/486864f3b7907db1898c34eab2b462becd9a98ef
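
A related check on the reformatted n-best file is the leading sentence index. Every sample line posted in this thread starts with -1, whereas Moses-style n-best lists normally use a zero-based sentence number in that field. Whether this is what leaves the groups empty in rerank.py is an assumption, not something confirmed in the thread. The sketch below uses hypothetical helper names (index_counts, negative_indices) and simply tallies hypotheses per index and flags negative ones.

```python
from collections import Counter


def index_counts(lines):
    """Count hypotheses per sentence index (the first '|||'-separated field)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        counts[int(line.split("|||")[0])] += 1
    return counts


def negative_indices(lines):
    """Return the sorted sentence indices that are below zero."""
    return sorted(i for i in index_counts(lines) if i < 0)


# Two lines copied from this thread; both carry the index -1.
sample_lines = [
    "-1 ||| Keeping the Secret of Genetic Testing ||| "
    "F0= -0.14867456257343292 EditOps0= 0 1 5 ||| -0.14867456257343292",
    "-1 ||| Keeping the Secret of the Genetic Testing ||| "
    "F0= -0.30441999435424805 EditOps0= 0 2 5 ||| -0.30441999435424805",
]

print(dict(index_counts(sample_lines)))  # {-1: 2}
print(negative_indices(sample_lines))    # [-1]
```

If negative_indices returns anything for the real file, the reformat step is the place to look, which is consistent with the advice to use the modified nbest-reformat script.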

wulouzhu commented 5 years ago

Thank you very much!