Closed echan00 closed 3 years ago
Hello,
I tried to run alignments using the provided model (w/out train_co) and the example data (zhen.src-tgt), but am receiving an error as shown below:
DATA_FILE=./examples/zhen.src-tgt
MODEL_NAME_OR_PATH=./model_without_co/pytorch_model.bin
OUTPUT_FILE=./output/zhen.awesome-align.out

CUDA_VISIBLE_DEVICES=0 python3 run_align.py \
    --output_file=$OUTPUT_FILE \
    --model_name_or_path=$MODEL_NAME_OR_PATH \
    --data_file=$DATA_FILE \
    --extraction 'softmax' \
    --batch_size 32 \

Traceback (most recent call last):
  File "run_align.py", line 194, in <module>
    main()
  File "run_align.py", line 167, in main
    config = config_class.from_pretrained(args.model_name_or_path, cache_dir=args.cache_dir)
  File "/Users/xxx/Downloads/awesome-align-master/configuration_utils.py", line 175, in from_pretrained
    config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/Users/xxx/Downloads/awesome-align-master/configuration_utils.py", line 227, in get_config_dict
    config_dict = cls._dict_from_json_file(resolved_config_file)
  File "/Users/xxx/Downloads/awesome-align-master/configuration_utils.py", line 313, in _dict_from_json_file
    text = reader.read()
  File "/Users/xxx/.pyenv/versions/3.7.9/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
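Sorry, resolved now.
MODEL_NAME_OR_PATH should point to the model directory, not the .bin file inside it. As the traceback shows, the config loader was trying to read pytorch_model.bin as a JSON file, hence the UnicodeDecodeError. This works: MODEL_NAME_OR_PATH=./model_without_co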
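
For anyone who hits the same error, here is a minimal sketch of a guard that turns a weights-file path into its containing directory before it is passed as --model_name_or_path. The helper name resolve_model_dir is hypothetical and is not part of awesome-align; it only illustrates the fix described above.

```python
from pathlib import Path


def resolve_model_dir(model_name_or_path: str) -> str:
    """Hypothetical helper: return a value suitable for --model_name_or_path."""
    path = Path(model_name_or_path)
    # If a weights file such as pytorch_model.bin was given,
    # fall back to the directory that contains it.
    if path.is_file():
        return str(path.parent)
    # Otherwise leave the value untouched (a directory or a model name).
    return model_name_or_path
```

With the paths from this issue, resolve_model_dir("./model_without_co/pytorch_model.bin") would give the model_without_co directory, provided the file exists on disk.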