wanyu2018umac / UniTE


Missing resources such as BPE codes and vocabulary #1

Open chenweihua91 opened 2 years ago

chenweihua91 commented 2 years ago

Didn't find file /home/chenwh/QE/UniTE-models/UniTE-MUP/checkpoints/sentencepiece.bpe.model. We won't load it.
Didn't find file /home/chenwh/QE/UniTE-models/UniTE-MUP/checkpoints/added_tokens.json. We won't load it.
Didn't find file /home/chenwh/QE/UniTE-models/UniTE-MUP/checkpoints/special_tokens_map.json. We won't load it.
Didn't find file /home/chenwh/QE/UniTE-models/UniTE-MUP/checkpoints/tokenizer_config.json. We won't load it.
Traceback (most recent call last):
  File "score.py", line 121, in <module>
    main()
  File "score.py", line 75, in main
    model = load_from_checkpoint(model_path, cfg.hparams_file_path)
  File "/home/chenwh/QE/UniTE/comet/models/__init__.py", line 64, in load_from_checkpoint
    model = model_class.load_from_checkpoint(checkpoint_path, hparams)
  File "/usr/chenwh/tools/lib/python3.7/site-packages/pytorch_lightning/core/saving.py", line 161, in load_from_checkpoint
    model = cls._load_model_state(checkpoint, strict=strict, **kwargs)
  File "/usr/chenwh/tools/lib/python3.7/site-packages/pytorch_lightning/core/saving.py", line 203, in _load_model_state
    model = cls(**_cls_kwargs)
  File "/home/chenwh/QE/UniTE/comet/models/regression/regression_metric.py", line 115, in __init__
    "regression_metric",
  File "/home/chenwh/QE/UniTE/comet/models/base.py", line 87, in __init__
    self.hparams.pretrained_model
  File "/home/chenwh/QE/UniTE/comet/encoders/xlmr.py", line 49, in from_pretrained
    return XLMREncoder(pretrained_model)
  File "/home/chenwh/QE/UniTE/comet/encoders/xlmr.py", line 36, in __init__
    self.tokenizer = XLMRobertaTokenizer.from_pretrained(pretrained_model)
  File "/usr/chenwh/anaconda3/lib/python3.7/site-packages/transformers/tokenization_utils.py", line 898, in from_pretrained
    return cls._from_pretrained(*inputs, **kwargs)
  File "/usr/chenwh/anaconda3/lib/python3.7/site-packages/transformers/tokenization_utils.py", line 1003, in _from_pretrained
    list(cls.vocab_files_names.values()),
OSError: Model name '/home/chenwh/QE/UniTE-models/UniTE-MUP/checkpoints' was not found in tokenizers model name list (xlm-roberta-base, xlm-roberta-large, xlm-roberta-large-finetuned-conll02-dutch, xlm-roberta-large-finetuned-conll02-spanish, xlm-roberta-large-finetuned-conll03-english, xlm-roberta-large-finetuned-conll03-german). We assumed '/home/chenwh/QE/UniTE-models/UniTE-MUP/checkpoints' was a path, a model identifier, or url to a directory containing vocabulary files named ['sentencepiece.bpe.model'] but couldn't find such vocabulary files at this path or url.

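Hello, while trying to run the model you provided, I found that resources such as the BPE codes and vocabulary are missing. Could you provide them? I have recently been learning about QE and would like to try it out. Looking forward to your reply, thank you!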

wanyu2018umac commented 2 years ago

Hi,

Sorry for the late response!

This repo is for evaluation only. I'm currently working out how to merge the training repo and the evaluation repo into one.

In the meantime, you can use the training repo in the training branch: https://github.com/wanyu2018umac/UniTE/tree/training

For both training and evaluation, you need to provide the path or URL for XLM-RoBERTa-large. You can download it to your own hard drive, or just use the URL to identify it. For more information, refer to the COMET repository, which is the prototype of my repository.
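As a rough illustration of the advice above, here is a minimal sketch (not code from this repo) of choosing what to pass as the pretrained model. The helper name `resolve_pretrained_model` is hypothetical. The required file name `sentencepiece.bpe.model` and the identifier `xlm-roberta-large` are taken from the error message in the first comment. The sketch only checks the local directory and assumes that falling back to the model identifier is acceptable in your setup.

```python
from pathlib import Path

# File name the tokenizer looked for, as reported in the error message above.
REQUIRED_TOKENIZER_FILES = ("sentencepiece.bpe.model",)
# Model identifier listed in the same error message.
DEFAULT_MODEL_ID = "xlm-roberta-large"


def resolve_pretrained_model(path_or_id, fallback=DEFAULT_MODEL_ID):
    """Hypothetical helper: return a usable pretrained-model reference.

    - If `path_or_id` is a local directory containing the tokenizer files,
      return that directory.
    - If it is a local directory without them (the situation in this issue),
      return `fallback` instead.
    - Otherwise treat it as a model identifier or URL and return it unchanged.
    """
    candidate = Path(path_or_id)
    if candidate.is_dir():
        missing = [
            name for name in REQUIRED_TOKENIZER_FILES
            if not (candidate / name).is_file()
        ]
        return str(candidate) if not missing else fallback
    return str(path_or_id)
```

The returned value is what you would hand to `XLMRobertaTokenizer.from_pretrained(...)`, or set as the pretrained model in the hparams file. A checkpoint directory that holds only model weights resolves to `xlm-roberta-large`, while a directory where you have saved XLM-RoBERTa-large is used as-is.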

Hope this helps. Thanks!

Yu WAN