google-research-datasets / clang8

cLang-8 is a dataset for grammatical error correction.

Errors when bash run.sh #2

Open ghost opened 3 years ago

ghost commented 3 years ago

Hi, thanks for your great work. When I run the following test command, it works.

echo "Running a test..."
python -m prepare_clang8_dataset_test

However, when I run the following command, I get several errors.

python -m prepare_clang8_dataset \
  --lang8_dir="${LANG8_DIR}" \
  --tokenize_text='True' \
  --languages='ru,de,en'

Any suggestions would be appreciated. Thank you!

ekQ commented 3 years ago

I would first check that the target files have been downloaded successfully by making sure that they contain the correct number of lines, see: https://github.com/google-research-datasets/clang8#data-format
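
A minimal sketch of that line-count check, in case it helps. The helper names `count_lines` and `report_line_counts` are hypothetical, and the directory you pass in is whatever folder holds the downloaded target files in your checkout; the expected numbers themselves are not reproduced here and should be read from the README section linked above.

```python
from pathlib import Path


def count_lines(path):
    """Return the number of lines in a file, read in binary mode."""
    count = 0
    with open(path, "rb") as f:
        for _ in f:
            count += 1
    return count


def report_line_counts(directory):
    """Map each regular file name directly inside `directory` to its line count."""
    return {
        p.name: count_lines(p)
        for p in sorted(Path(directory).iterdir())
        if p.is_file()
    }
```

Calling `report_line_counts("path/to/target/files")` and comparing the result by hand against the README should show quickly whether a file is truncated or was never fetched.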

DincyDavis commented 3 years ago

Yeah, I got the same error. The reason was that the target files had not been downloaded properly: they were just 1 KB files instead of their original size. Make sure Git Large File Storage is installed as mentioned in the steps, then try again.
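
Those tiny files are most likely Git LFS pointer files, which are small text stubs whose first line is `version https://git-lfs.github.com/spec/v1`. Here is a rough sketch for spotting them, assuming that is what happened in your clone; `is_lfs_pointer` and `find_lfs_pointers` are hypothetical helper names, not part of this repository.

```python
from pathlib import Path

# First line of every Git LFS pointer file, per the Git LFS pointer spec.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_lfs_pointers(directory):
    """Return sorted paths of files under `directory` that are still LFS pointers."""
    return sorted(
        str(p)
        for p in Path(directory).rglob("*")
        if p.is_file() and is_lfs_pointer(p)
    )
```

If `find_lfs_pointers(".")` returns any of the target files, the real data has not been fetched yet.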

mrqorib commented 2 years ago

Run git lfs pull before running ./run.sh.