grammarly / gector

Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagging" (BEA-21)
Apache License 2.0

Can't make the pretrained model work, even after looking at previous issues #192

Open LifeIsStrange opened 1 year ago

LifeIsStrange commented 1 year ago

Hi @komelianchuk, friendly ping. I have read the README and many related issues, but I can't make the pretrained model work.

I am running:

python predict.py --model_path C:\Users\steph\Downloads\roberta_1_gectorv2.th --vocab_path data\output_vocabulary --input_file input_file.txt --output_file output_file.txt --special_tokens_fix 1 --transformer_model roberta --max_len 150

I get the following error:

Some weights of the model checkpoint at roberta-base were not used when initializing RobertaModel: ['lm_head.bias', 'lm_head.layer_norm.bias', 'lm_head.layer_norm.weight', 'lm_head.decoder.weight', 'lm_head.dense.bias', 'lm_head.dense.weight']
- This IS expected if you are initializing RobertaModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Namespace: d_tags
Token: INCORRECT
Traceback (most recent call last):
  File "predict.py", line 128, in <module>
    main(args)
  File "predict.py", line 47, in main
    weigths=args.weights)
  File "C:\Users\steph\PycharmProjects\gectorr\gector\gec_model.py", line 66, in __init__
    del_confidence=self.del_conf,
  File "C:\Users\steph\PycharmProjects\gectorr\gector\seq2labels_model.py", line 76, in __init__
    namespace=detect_namespace)
  File "C:\Users\steph\AppData\Local\Programs\Python\Python37\lib\site-packages\allennlp\data\vocabulary.py", line 630, in get_token_index
    return self._token_to_index[namespace][self._oov_token]
KeyError: '@@UNKNOWN@@'

Setup: Windows, Python 3.7. I installed the packages from requirements.txt.
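
The traceback ends with the lookup of '@@UNKNOWN@@' failing in the d_tags namespace, so one thing that may be worth checking is whether the files under the vocabulary directory actually contain that token. Below is a minimal diagnostic sketch, not a confirmed fix. It assumes the directory holds one plain-text file per namespace with one token per line (for example a d_tags.txt file, which is an assumption about the layout), and the helper name namespaces_missing_token is hypothetical and not part of gector or AllenNLP.

```python
from pathlib import Path

OOV_TOKEN = "@@UNKNOWN@@"  # the key reported in the KeyError


def namespaces_missing_token(vocab_dir, token=OOV_TOKEN,
                             skip=("non_padded_namespaces",)):
    """Return the names of namespace files in vocab_dir that lack `token`.

    Assumption: each *.txt file is one namespace, one token per line.
    Files listed in `skip` are not token lists and are ignored.
    """
    missing = []
    for path in sorted(Path(vocab_dir).glob("*.txt")):
        if path.stem in skip:
            continue
        # Text mode normalizes \r\n to \n, so Windows line endings are fine.
        tokens = set(path.read_text(encoding="utf-8").splitlines())
        if token not in tokens:
            missing.append(path.stem)
    return missing


# Example call (path taken from the command line in this report):
# print(namespaces_missing_token(r"data\output_vocabulary"))
```

If d_tags shows up in the returned list, the vocabulary files on disk would be the first place to look; if the list is empty, the problem probably lies elsewhere (for instance in how the installed allennlp version loads the vocabulary).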

BTW, is there a special format for the input file? Can I just put a sentence in it?

xyiiinexg3 commented 1 year ago

Hi, could you tell me how it is going now?