HillZhang1999 / SynGEC

Code & data for our EMNLP2022 paper "SynGEC: Syntax-Enhanced Grammatical Error Correction with a Tailored GEC-Oriented Parser"
https://arxiv.org/abs/2210.12484
MIT License

How can I get the data? #27

Open YeJinPaark opened 1 year ago

YeJinPaark commented 1 year ago

First, I downloaded 'Transformer-en' and renamed it to './model/syngec/english_transformer_baseline.pt'. Then I downloaded the preprocessed data.

Next I ran './pipeline_gopar.sh', but it failed with the following errors:

Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../src/src_gopar/parse.py", line 17, in <module>
    input_sentences = load(sys.argv[1])
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../src/src_gopar/parse.py", line 9, in load
    with open(filename, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/tgt.txt'
Loading resources...
Processing parallel files...
Traceback (most recent call last):
  File "/opt/conda/bin/errant_parallel", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.10/site-packages/errant/commands/parallel_to_m2.py", line 16, in main
    in_files = [stack.enter_context(open(i)) for i in [args.orig]+args.cor]
  File "/opt/conda/lib/python3.10/site-packages/errant/commands/parallel_to_m2.py", line 16, in <listcomp>
    in_files = [stack.enter_context(open(i)) for i in [args.orig]+args.cor]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/tgt.txt'
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../src/src_gopar/convert_gec_data_to_parsing_data_english.py", line 153, in <module>
    with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/tgt.txt.conll_predict'
/opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py:181: FutureWarning: The module torch.distributed.launch is deprecated and will be removed in future. Use torchrun. Note that --use-env is set by default in torchrun. If your script expects --local-rank argument to be set, please change it to read from os.environ['LOCAL_RANK'] instead. See https://pytorch.org/docs/stable/distributed.html#launch-utility for further instructions
  warnings.warn(
WARNING:torch.distributed.run:
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 30676) of binary: /opt/conda/bin/python
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py", line 196, in <module>
    main()
  File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py", line 192, in main
    launch(args)
  File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py", line 177, in launch
    run(args)
  File "/opt/conda/lib/python3.10/site-packages/torch/distributed/run.py", line 785, in run
    elastic_launch(
  File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

supar.cmds.biaffine_dep FAILED

Failures:
[1]:
  time      : 2023-08-24_08:10:36
  host      : 309e7fc0781e
  rank      : 1 (local_rank: 1)
  exitcode  : 1 (pid: 30677)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[2]:
  time      : 2023-08-24_08:10:36
  host      : 309e7fc0781e
  rank      : 2 (local_rank: 2)
  exitcode  : 1 (pid: 30678)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[3]:
  time      : 2023-08-24_08:10:36
  host      : 309e7fc0781e
  rank      : 3 (local_rank: 3)
  exitcode  : 1 (pid: 30679)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[4]:
  time      : 2023-08-24_08:10:36
  host      : 309e7fc0781e
  rank      : 4 (local_rank: 4)
  exitcode  : 1 (pid: 30680)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[5]:
  time      : 2023-08-24_08:10:36
  host      : 309e7fc0781e
  rank      : 5 (local_rank: 5)
  exitcode  : 1 (pid: 30681)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[6]:
  time      : 2023-08-24_08:10:36
  host      : 309e7fc0781e
  rank      : 6 (local_rank: 6)
  exitcode  : 1 (pid: 30682)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[7]:
  time      : 2023-08-24_08:10:36
  host      : 309e7fc0781e
  rank      : 7 (local_rank: 7)
  exitcode  : 1 (pid: 30683)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Root Cause (first observed failure):
[0]:
  time      : 2023-08-24_08:10:36
  host      : 309e7fc0781e
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 30676)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

nohup: appending output to 'nohup.out'
nohup: appending output to 'nohup.out'
nohup: appending output to 'nohup.out'
nohup: appending output to 'nohup.out'
nohup: appending output to 'nohup.out'
nohup: appending output to 'nohup.out'

How can I fix it? Please help me.
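The message 'No module named supar.cmds.biaffine_dep' in the log above is what 'python -m' prints when it cannot find that module on its import path. A minimal sketch for checking this from the same environment and working directory the script uses; the helper name module_available is hypothetical and not part of SynGEC or SuPar, and note that find_spec imports the parent package as a side effect:

```python
import importlib.util


def module_available(name):
    """Return True if `name` can be located on the current import path."""
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # find_spec imports parent packages first; a missing or broken
        # parent (for example 'supar' itself) raises instead of returning None.
        return False


if __name__ == "__main__":
    # Module names taken from the error message in this thread.
    for name in ("supar", "supar.cmds.biaffine_dep"):
        status = "found" if module_available(name) else "NOT found"
        print(f"{name}: {status}")
```

If 'supar' itself is reported as not found, the package is probably neither installed nor reachable from the current directory, which would be consistent with the error above.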

HillZhang1999 commented 1 year ago

The file paths in your error message look strange, for example '/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../src/src_gopar/parse.py'. Please cd into the directory that contains the bash script and then re-run it from there.
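To illustrate why the working directory matters: the scripts refer to files with relative paths such as '../../data/wi_locness_train/tgt.txt', and a relative path is resolved against the directory the process runs in. A minimal sketch, assuming the directory layout shown in the traceback; the helper name resolve_from is hypothetical:

```python
import posixpath


def resolve_from(cwd, relative_path):
    """Return the absolute path `relative_path` points to when run from `cwd`."""
    return posixpath.normpath(posixpath.join(cwd, relative_path))


rel = "../../data/wi_locness_train/tgt.txt"

# Launched from the script's own directory, the path lands inside the repo.
print(resolve_from("/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp", rel))
# -> /mnt/ssd_mnt/pyj/SynGEC/data/wi_locness_train/tgt.txt

# Launched from the repo root, the same path points outside the repo.
print(resolve_from("/mnt/ssd_mnt/pyj/SynGEC", rel))
# -> /mnt/ssd_mnt/data/wi_locness_train/tgt.txt
```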

YeJinPaark commented 1 year ago

OK, I'll try that soon.

Also, how can I get the data files that errors like this one refer to:

FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/tgt.txt'

HillZhang1999 commented 1 year ago

You should download the preprocessed data, unzip it, and put the files into https://github.com/HillZhang1999/SynGEC/tree/main/data
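Before re-running the scripts, it can help to confirm that the unzipped files ended up where the scripts look for them. A minimal sketch: the relative paths below are copied from the error messages in this thread, whether the archive contains exactly these files is an assumption, and the names EXPECTED_FILES and find_missing are hypothetical:

```python
from pathlib import Path

# Paths (relative to the data directory) that the scripts in this thread
# tried to open. This list is an assumption based on the error messages.
EXPECTED_FILES = [
    "clang8_train/src.txt",
    "clang8_train/tgt.txt",
    "error_coded_train/src.txt",
    "error_coded_train/tgt.txt",
    "wi_locness_train/src.txt",
    "wi_locness_train/tgt.txt",
    "bea19_dev/src.txt",
    "bea19_dev/tgt.txt",
    "conll14_test/src.txt",
    "bea19_test/src.txt",
]


def find_missing(data_dir, expected=EXPECTED_FILES):
    """Return the expected relative paths that are not regular files under data_dir."""
    root = Path(data_dir)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    # '../../data' is how the scripts refer to the data directory
    # when they are run from bash/english_exp.
    for rel in find_missing("../../data"):
        print("missing:", rel)
```

An empty result only means that the files exist; it says nothing about whether their contents are what the scripts expect.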

YeJinPaark commented 1 year ago

Is the preprocessed data the same as the data at this link? https://drive.google.com/file/d/1dIDfYhELrh3BEKgGpsPYAy5ehcobmMov/view

I downloaded that data and unzipped it into ./data/, but I got errors like these:

Apply BPE...
./preprocess_syngec_transformer.sh: line 22: ../../data/clang8_train/src.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 23: ../../data/clang8_train/tgt.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 24: ../../data/bea19_dev/src.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 25: ../../data/bea19_dev/tgt.txt: No such file or directory
Align subwords and words...
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in <module>
    with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/clang8_train/src.txt'
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in <module>
    with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt'
cp: cannot stat '../../data/clang8_train/src.txt': No such file or directory
cp: cannot stat '../../data/clang8_train/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/clang8_train/tgt.txt': No such file or directory
cp: cannot stat '../../data/clang8_train/tgt.txt.bpe': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/bea19_dev/tgt.txt': No such file or directory
cp: cannot stat '../../data/bea19_dev/tgt.txt.bpe': No such file or directory
cp: cannot stat '../../data/clang8_train/src.txt.swm': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.swm': No such file or directory
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in <module>
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/clang8_train/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in <module>
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/clang8_train/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in <module>
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in <module>
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.swm'
cp: cannot stat '../../data/clang8_train/src.txt.conll_predict_gopar_np': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np': No such file or directory
Calculate dependency distance...
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in <module>
    with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/clang8_train/src.txt.conll_predict_gopar'
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in <module>
    with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.conll_predict_gopar'
cp: cannot stat '../../data/clang8_train/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/clang8_train/src.txt.conll_predict_gopar_np.probs': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np.probs': No such file or directory
Preprocess...
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
usage: preprocess.py [-h] [--no-progress-bar] [--log-interval LOG_INTERVAL] [--log-format LOG_FORMAT] [--tensorboard-logdir TENSORBOARD_LOGDIR] [--seed SEED] [--cpu] [--tpu] [--bf16] [--memory-efficient-bf16] [--fp16] [--memory-efficient-fp16] [--fp16-no-flatten-grads] [--fp16-init-scale FP16_INIT_SCALE] [--fp16-scale-window FP16_SCALE_WINDOW] [--fp16-scale-tolerance FP16_SCALE_TOLERANCE] [--min-loss-scale MIN_LOSS_SCALE] [--threshold-loss-scale THRESHOLD_LOSS_SCALE] [--user-dir USER_DIR] [--empty-cache-freq EMPTY_CACHE_FREQ] [--all-gather-list-size ALL_GATHER_LIST_SIZE] [--model-parallel-size MODEL_PARALLEL_SIZE] [--checkpoint-suffix CHECKPOINT_SUFFIX] [--checkpoint-shard-count CHECKPOINT_SHARD_COUNT] [--quantization-config-path QUANTIZATION_CONFIG_PATH] [--profile] [--criterion {label_smoothed_cross_entropy,composite_loss,ctc,masked_lm,legacy_masked_lm_loss,wav2vec,sentence_prediction,label_smoothed_cross_entropy_with_alignment,cross_entropy,nat_loss,adaptive_loss,sentence_ranking,vocab_parallel_cross_entropy}] [--tokenizer {space,moses,nltk}] [--bpe {characters,bert,subword_nmt,fastbpe,byte_bpe,bytes,sentencepiece,gpt2,hf_byte_bpe}] [--optimizer {adafactor,nag,adam,adagrad,adamax,lamb,sgd,adadelta}] [--lr-scheduler {cosine,triangular,inverse_sqrt,tri_stage,fixed,reduce_lr_on_plateau,polynomial_decay}] [--scoring {wer,sacrebleu,bleu,chrf}] [--task TASK] [-s SRC] [-t TARGET] [--source-lang-with-nt SRC] [--trainpref FP] [--validpref FP] [--testpref FP] [--align-suffix FP] [--conll-suffix FP [FP ...]] [--dpd-suffix FP [FP ...]] [--probs-suffix FP [FP ...]] [--swm-suffix FP] [--destdir DIR] [--thresholdtgt N] [--thresholdsrc N] [--tgtdict FP] [--srcdict FP] [--labeldict FP [FP ...]] [--nwordstgt N] [--nwordssrc N] [--alignfile ALIGN] [--dataset-impl FORMAT] [--joined-dictionary] [--only-source] [--padding-factor N] [--workers N]
preprocess.py: error: argument --user-dir: invalid Optional value: '../../src/src_syngec/syngec_model'
Finished!
Apply BPE...
./preprocess_syngec_transformer.sh: line 112: ../../data/error_coded_train/src.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 113: ../../data/error_coded_train/tgt.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 114: ../../data/bea19_dev/src.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 115: ../../data/bea19_dev/tgt.txt: No such file or directory
Align subwords and words...
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in <module>
    with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/error_coded_train/src.txt'
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in <module>
    with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt'
cp: cannot stat '../../data/error_coded_train/src.txt': No such file or directory
cp: cannot stat '../../data/error_coded_train/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/error_coded_train/tgt.txt': No such file or directory
cp: cannot stat '../../data/error_coded_train/tgt.txt.bpe': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/bea19_dev/tgt.txt': No such file or directory
cp: cannot stat '../../data/bea19_dev/tgt.txt.bpe': No such file or directory
cp: cannot stat '../../data/error_coded_train/src.txt.swm': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.swm': No such file or directory
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in <module>
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/error_coded_train/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in <module>
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/error_coded_train/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in <module>
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in <module>
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.swm'
cp: cannot stat '../../data/error_coded_train/src.txt.conll_predict_gopar_np': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np': No such file or directory
Calculate dependency distance...
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in <module>
    with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/error_coded_train/src.txt.conll_predict_gopar'
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in <module>
    with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.conll_predict_gopar'
cp: cannot stat '../../data/error_coded_train/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/error_coded_train/src.txt.conll_predict_gopar_np.probs': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np.probs': No such file or directory
Preprocess...
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
usage: preprocess.py [-h] [--no-progress-bar] [--log-interval LOG_INTERVAL] [--log-format LOG_FORMAT] [--tensorboard-logdir TENSORBOARD_LOGDIR] [--seed SEED] [--cpu] [--tpu] [--bf16] [--memory-efficient-bf16] [--fp16] [--memory-efficient-fp16] [--fp16-no-flatten-grads] [--fp16-init-scale FP16_INIT_SCALE] [--fp16-scale-window FP16_SCALE_WINDOW] [--fp16-scale-tolerance FP16_SCALE_TOLERANCE] [--min-loss-scale MIN_LOSS_SCALE] [--threshold-loss-scale THRESHOLD_LOSS_SCALE] [--user-dir USER_DIR] [--empty-cache-freq EMPTY_CACHE_FREQ] [--all-gather-list-size ALL_GATHER_LIST_SIZE] [--model-parallel-size MODEL_PARALLEL_SIZE] [--checkpoint-suffix CHECKPOINT_SUFFIX] [--checkpoint-shard-count CHECKPOINT_SHARD_COUNT] [--quantization-config-path QUANTIZATION_CONFIG_PATH] [--profile] [--criterion {label_smoothed_cross_entropy,composite_loss,ctc,masked_lm,legacy_masked_lm_loss,wav2vec,sentence_prediction,label_smoothed_cross_entropy_with_alignment,cross_entropy,nat_loss,adaptive_loss,sentence_ranking,vocab_parallel_cross_entropy}] [--tokenizer {space,moses,nltk}] [--bpe {characters,bert,subword_nmt,fastbpe,byte_bpe,bytes,sentencepiece,gpt2,hf_byte_bpe}] [--optimizer {adafactor,nag,adam,adagrad,adamax,lamb,sgd,adadelta}] [--lr-scheduler {cosine,triangular,inverse_sqrt,tri_stage,fixed,reduce_lr_on_plateau,polynomial_decay}] [--scoring {wer,sacrebleu,bleu,chrf}] [--task TASK] [-s SRC] [-t TARGET] [--source-lang-with-nt SRC] [--trainpref FP] [--validpref FP] [--testpref FP] [--align-suffix FP] [--conll-suffix FP [FP ...]] [--dpd-suffix FP [FP ...]] [--probs-suffix FP [FP ...]] [--swm-suffix FP] [--destdir DIR] [--thresholdtgt N] [--thresholdsrc N] [--tgtdict FP] [--srcdict FP] [--labeldict FP [FP ...]] [--nwordstgt N] [--nwordssrc N] [--alignfile ALIGN] [--dataset-impl FORMAT] [--joined-dictionary] [--only-source] [--padding-factor N] [--workers N]
preprocess.py: error: argument --user-dir: invalid Optional value: '../../src/src_syngec/syngec_model'
Finished!
Apply BPE...
./preprocess_syngec_transformer.sh: line 202: ../../data/wi_locness_train/src.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 203: ../../data/wi_locness_train/tgt.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 204: ../../data/bea19_dev/src.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 205: ../../data/bea19_dev/tgt.txt: No such file or directory
Align subwords and words...
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in <module>
    with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/src.txt'
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in <module>
    with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt'
cp: cannot stat '../../data/wi_locness_train/src.txt': No such file or directory
cp: cannot stat '../../data/wi_locness_train/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/wi_locness_train/tgt.txt': No such file or directory
cp: cannot stat '../../data/wi_locness_train/tgt.txt.bpe': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/bea19_dev/tgt.txt': No such file or directory
cp: cannot stat '../../data/bea19_dev/tgt.txt.bpe': No such file or directory
cp: cannot stat '../../data/wi_locness_train/src.txt.swm': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.swm': No such file or directory
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in <module>
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in <module>
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in <module>
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in <module>
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.swm'
cp: cannot stat '../../data/wi_locness_train/src.txt.conll_predict_gopar_np': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np': No such file or directory
Calculate dependency distance...
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in <module>
    with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/src.txt.conll_predict_gopar'
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in <module>
    with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.conll_predict_gopar'
cp: cannot stat '../../data/wi_locness_train/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/wi_locness_train/src.txt.conll_predict_gopar_np.probs': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np.probs': No such file or directory
Preprocess...
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
usage: preprocess.py [-h] [--no-progress-bar] [--log-interval LOG_INTERVAL] [--log-format LOG_FORMAT] [--tensorboard-logdir TENSORBOARD_LOGDIR] [--seed SEED] [--cpu] [--tpu] [--bf16] [--memory-efficient-bf16] [--fp16] [--memory-efficient-fp16] [--fp16-no-flatten-grads] [--fp16-init-scale FP16_INIT_SCALE] [--fp16-scale-window FP16_SCALE_WINDOW] [--fp16-scale-tolerance FP16_SCALE_TOLERANCE] [--min-loss-scale MIN_LOSS_SCALE] [--threshold-loss-scale THRESHOLD_LOSS_SCALE] [--user-dir USER_DIR] [--empty-cache-freq EMPTY_CACHE_FREQ] [--all-gather-list-size ALL_GATHER_LIST_SIZE] [--model-parallel-size MODEL_PARALLEL_SIZE] [--checkpoint-suffix CHECKPOINT_SUFFIX] [--checkpoint-shard-count CHECKPOINT_SHARD_COUNT] [--quantization-config-path QUANTIZATION_CONFIG_PATH] [--profile] [--criterion {label_smoothed_cross_entropy,composite_loss,ctc,masked_lm,legacy_masked_lm_loss,wav2vec,sentence_prediction,label_smoothed_cross_entropy_with_alignment,cross_entropy,nat_loss,adaptive_loss,sentence_ranking,vocab_parallel_cross_entropy}] [--tokenizer {space,moses,nltk}] [--bpe {characters,bert,subword_nmt,fastbpe,byte_bpe,bytes,sentencepiece,gpt2,hf_byte_bpe}] [--optimizer {adafactor,nag,adam,adagrad,adamax,lamb,sgd,adadelta}] [--lr-scheduler {cosine,triangular,inverse_sqrt,tri_stage,fixed,reduce_lr_on_plateau,polynomial_decay}] [--scoring {wer,sacrebleu,bleu,chrf}] [--task TASK] [-s SRC] [-t TARGET] [--source-lang-with-nt SRC] [--trainpref FP] [--validpref FP] [--testpref FP] [--align-suffix FP] [--conll-suffix FP [FP ...]] [--dpd-suffix FP [FP ...]] [--probs-suffix FP [FP ...]] [--swm-suffix FP] [--destdir DIR] [--thresholdtgt N] [--thresholdsrc N] [--tgtdict FP] [--srcdict FP] [--labeldict FP [FP ...]] [--nwordstgt N] [--nwordssrc N] [--alignfile ALIGN] [--dataset-impl FORMAT] [--joined-dictionary] [--only-source] [--padding-factor N] [--workers N]
preprocess.py: error: argument --user-dir: invalid Optional value: '../../src/src_syngec/syngec_model'
Finished!
Apply BPE...
./preprocess_syngec_transformer.sh: line 290: ../../data/conll14_test/src.txt: No such file or directory
Align subwords and words...
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in <module>
    with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/conll14_test/src.txt'
cp: cannot stat '../../data/conll14_test/src.txt': No such file or directory
cp: cannot stat '../../data/conll14_test/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/conll14_test/src.txt.swm': No such file or directory
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in <module>
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/conll14_test/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in <module>
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/conll14_test/src.txt.swm'
cp: cannot stat '../../data/conll14_test/src.txt.conll_predict_gopar_np': No such file or directory
Calculate dependency distance...
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in <module>
    with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/conll14_test/src.txt.conll_predict_gopar'
cp: cannot stat '../../data/conll14_test/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/conll14_test/src.txt.conll_predict_gopar_np.probs': No such file or directory
Preprocess...
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
usage: preprocess.py [-h] [--no-progress-bar] [--log-interval LOG_INTERVAL] [--log-format LOG_FORMAT] [--tensorboard-logdir TENSORBOARD_LOGDIR] [--seed SEED] [--cpu] [--tpu] [--bf16] [--memory-efficient-bf16] [--fp16] [--memory-efficient-fp16] [--fp16-no-flatten-grads] [--fp16-init-scale FP16_INIT_SCALE] [--fp16-scale-window FP16_SCALE_WINDOW] [--fp16-scale-tolerance FP16_SCALE_TOLERANCE] [--min-loss-scale MIN_LOSS_SCALE] [--threshold-loss-scale THRESHOLD_LOSS_SCALE] [--user-dir USER_DIR] [--empty-cache-freq EMPTY_CACHE_FREQ] [--all-gather-list-size ALL_GATHER_LIST_SIZE] [--model-parallel-size MODEL_PARALLEL_SIZE] [--checkpoint-suffix CHECKPOINT_SUFFIX] [--checkpoint-shard-count CHECKPOINT_SHARD_COUNT] [--quantization-config-path QUANTIZATION_CONFIG_PATH] [--profile] [--criterion {label_smoothed_cross_entropy,composite_loss,ctc,masked_lm,legacy_masked_lm_loss,wav2vec,sentence_prediction,label_smoothed_cross_entropy_with_alignment,cross_entropy,nat_loss,adaptive_loss,sentence_ranking,vocab_parallel_cross_entropy}] [--tokenizer {space,moses,nltk}] [--bpe {characters,bert,subword_nmt,fastbpe,byte_bpe,bytes,sentencepiece,gpt2,hf_byte_bpe}] [--optimizer {adafactor,nag,adam,adagrad,adamax,lamb,sgd,adadelta}] [--lr-scheduler {cosine,triangular,inverse_sqrt,tri_stage,fixed,reduce_lr_on_plateau,polynomial_decay}] [--scoring {wer,sacrebleu,bleu,chrf}] [--task TASK] [-s SRC] [-t TARGET] [--source-lang-with-nt SRC] [--trainpref FP] [--validpref FP] [--testpref FP] [--align-suffix FP] [--conll-suffix FP [FP ...]] [--dpd-suffix FP [FP ...]] [--probs-suffix FP [FP ...]] [--swm-suffix FP] [--destdir DIR] [--thresholdtgt N] [--thresholdsrc N] [--tgtdict FP] [--srcdict FP] [--labeldict FP [FP ...]] [--nwordstgt N] [--nwordssrc N] [--alignfile ALIGN] [--dataset-impl FORMAT] [--joined-dictionary] [--only-source] [--padding-factor N] [--workers N]
preprocess.py: error: argument --user-dir: invalid Optional value: '../../src/src_syngec/syngec_model'
Finished!
Apply BPE...
./preprocess_syngec_transformer.sh: line 358: ../../data/bea19_test/src.txt: No such file or directory
Align subwords and words...
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in <module>
    with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_test/src.txt'
cp: cannot stat '../../data/bea19_test/src.txt': No such file or directory
cp: cannot stat '../../data/bea19_test/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/bea19_test/src.txt.swm': No such file or directory
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in <module>
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_test/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in <module>
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_test/src.txt.swm'
cp: cannot stat '../../data/bea19_test/src.txt.conll_predict_gopar_np': No such file or directory
Calculate dependency distance...
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in <module>
    with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_test/src.txt.conll_predict_gopar'
cp: cannot stat '../../data/bea19_test/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/bea19_test/src.txt.conll_predict_gopar_np.probs': No such file or directory
Preprocess...
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
usage: preprocess.py [-h] [--no-progress-bar] [--log-interval LOG_INTERVAL] [--log-format LOG_FORMAT] [--tensorboard-logdir TENSORBOARD_LOGDIR] [--seed SEED] [--cpu] [--tpu] [--bf16] [--memory-efficient-bf16] [--fp16] [--memory-efficient-fp16] [--fp16-no-flatten-grads] [--fp16-init-scale FP16_INIT_SCALE] [--fp16-scale-window FP16_SCALE_WINDOW] [--fp16-scale-tolerance FP16_SCALE_TOLERANCE] [--min-loss-scale MIN_LOSS_SCALE] [--threshold-loss-scale THRESHOLD_LOSS_SCALE] [--user-dir USER_DIR] [--empty-cache-freq EMPTY_CACHE_FREQ] [--all-gather-list-size ALL_GATHER_LIST_SIZE] [--model-parallel-size MODEL_PARALLEL_SIZE] [--checkpoint-suffix CHECKPOINT_SUFFIX] [--checkpoint-shard-count CHECKPOINT_SHARD_COUNT] [--quantization-config-path QUANTIZATION_CONFIG_PATH] [--profile] [--criterion {label_smoothed_cross_entropy,composite_loss,ctc,masked_lm,legacy_masked_lm_loss,wav2vec,sentence_prediction,label_smoothed_cross_entropy_with_alignment,cross_entropy,nat_loss,adaptive_loss,sentence_ranking,vocab_parallel_cross_entropy}] [--tokenizer {space,moses,nltk}] [--bpe {characters,bert,subword_nmt,fastbpe,byte_bpe,bytes,sentencepiece,gpt2,hf_byte_bpe}] [--optimizer {adafactor,nag,adam,adagrad,adamax,lamb,sgd,adadelta}] [--lr-scheduler {cosine,triangular,inverse_sqrt,tri_stage,fixed,reduce_lr_on_plateau,polynomial_decay}] [--scoring {wer,sacrebleu,bleu,chrf}] [--task TASK] [-s SRC] [-t TARGET] [--source-lang-with-nt SRC] [--trainpref FP] [--validpref FP] [--testpref FP] [--align-suffix FP] [--conll-suffix FP [FP ...]] [--dpd-suffix FP [FP ...]] [--probs-suffix FP [FP ...]] [--swm-suffix FP] [--destdir DIR] [--thresholdtgt N] [--thresholdsrc N] [--tgtdict FP] [--srcdict FP] [--labeldict FP [FP ...]] [--nwordstgt N] [--nwordssrc N] [--alignfile ALIGN] [--dataset-impl FORMAT] [--joined-dictionary] [--only-source] [--padding-factor N] [--workers N]
preprocess.py: error: argument --user-dir: invalid Optional value: '../../src/src_syngec/syngec_model'
Finished!

HillZhang1999 commented 1 year ago

Please cd into the directory that contains this bash script, then run cat ../../data/clang8_train/src.txt to check whether the file actually exists. If it does not, please check how you unzipped the data.
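The same check can be scripted for several datasets at once. This is only a minimal sketch: it assumes the scripts are run from bash/english_exp and read ../../data/<dataset>/src.txt and tgt.txt, as the paths in the error log suggest. The helper name missing_text_files is hypothetical and not part of the repository.

```python
from pathlib import Path


def missing_text_files(data_root, datasets, names=("src.txt", "tgt.txt")):
    """Return the expected text files that do not exist under data_root."""
    root = Path(data_root)
    return [
        root / dataset / name
        for dataset in datasets
        for name in names
        if not (root / dataset / name).is_file()
    ]


if __name__ == "__main__":
    # Dataset directory names are taken from the paths in the error log;
    # adjust them to whatever your copy of the scripts actually reads.
    train_sets = ["clang8_train", "wi_locness_train"]
    for path in missing_text_files("../../data", train_sets):
        print(f"missing: {path}")
    # The log only mentions src.txt for the test set, so check just that file.
    for path in missing_text_files("../../data", ["bea19_test"], names=("src.txt",)):
        print(f"missing: {path}")
```

If this prints nothing, every file the sketch looks for is present; any "missing:" line corresponds to a FileNotFoundError like the ones in the log.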

YeJinPaark commented 1 year ago

First, I extracted the archive with "tar -zxvf syngec_preprocess.tar.gz"

and then the log is preprocess/ preprocess/chinese_hsk+lang8_with_syntax_transformer/ preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/ preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/dict.label.txt preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/dict.src.txt preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/dict.tgt.txt preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/preprocess.log preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.conll.src-tgt.src.bin preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.conll.src-tgt.src.idx preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.dpd.src-tgt.src.bin preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.dpd.src-tgt.src.idx preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.probs.src-tgt.src.bin preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.probs.src-tgt.src.idx preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.src-tgt.src.bin preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.src-tgt.src.idx preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.src-tgt.tgt.bin preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.src-tgt.tgt.idx preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.conll.src-tgt.src.bin preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.conll.src-tgt.src.idx preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.dpd.src-tgt.src.bin preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.dpd.src-tgt.src.idx preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.probs.src-tgt.src.bin preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.probs.src-tgt.src.idx preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.src-tgt.src.bin preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.src-tgt.src.idx preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.src-tgt.tgt.bin 
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.src-tgt.tgt.idx preprocess/chinese_mucgec_with_syntax_transformer/ preprocess/chinese_mucgec_with_syntax_transformer/bin/ preprocess/chinese_mucgec_with_syntax_transformer/bin/dict.label.txt preprocess/chinese_mucgec_with_syntax_transformer/bin/dict.src.txt preprocess/chinese_mucgec_with_syntax_transformer/bin/preprocess.log preprocess/chinese_mucgec_with_syntax_transformer/bin/test.conll.src-tgt.src.bin preprocess/chinese_mucgec_with_syntax_transformer/bin/test.conll.src-tgt.src.idx preprocess/chinese_mucgec_with_syntax_transformer/bin/test.dpd.src-tgt.src.bin preprocess/chinese_mucgec_with_syntax_transformer/bin/test.dpd.src-tgt.src.idx preprocess/chinese_mucgec_with_syntax_transformer/bin/test.probs.src-tgt.src.bin preprocess/chinese_mucgec_with_syntax_transformer/bin/test.probs.src-tgt.src.idx preprocess/chinese_mucgec_with_syntax_transformer/bin/test.src-tgt.src.bin preprocess/chinese_mucgec_with_syntax_transformer/bin/test.src-tgt.src.idx preprocess/english_bea19_with_syntax_bart/ preprocess/english_bea19_with_syntax_bart/bin/ preprocess/english_bea19_with_syntax_bart/bin/dict.label.txt preprocess/english_bea19_with_syntax_bart/bin/dict.src.txt preprocess/english_bea19_with_syntax_bart/bin/preprocess.log preprocess/english_bea19_with_syntax_bart/bin/test.conll.src-tgt.src.bin preprocess/english_bea19_with_syntax_bart/bin/test.conll.src-tgt.src.idx preprocess/english_bea19_with_syntax_bart/bin/test.dpd.src-tgt.src.bin preprocess/english_bea19_with_syntax_bart/bin/test.dpd.src-tgt.src.idx preprocess/english_bea19_with_syntax_bart/bin/test.probs.src-tgt.src.bin preprocess/english_bea19_with_syntax_bart/bin/test.probs.src-tgt.src.idx preprocess/english_bea19_with_syntax_bart/bin/test.src-tgt.src.bin preprocess/english_bea19_with_syntax_bart/bin/test.src-tgt.src.idx preprocess/english_bea19_with_syntax_transformer/ preprocess/english_bea19_with_syntax_transformer/dict.label.txt 
preprocess/english_bea19_with_syntax_transformer/dict.src.txt preprocess/english_bea19_with_syntax_transformer/preprocess.log preprocess/english_bea19_with_syntax_transformer/test.conll.src-tgt.src.bin preprocess/english_bea19_with_syntax_transformer/test.conll.src-tgt.src.idx preprocess/english_bea19_with_syntax_transformer/test.dpd.src-tgt.src.bin preprocess/english_bea19_with_syntax_transformer/test.dpd.src-tgt.src.idx preprocess/english_bea19_with_syntax_transformer/test.probs.src-tgt.src.bin preprocess/english_bea19_with_syntax_transformer/test.probs.src-tgt.src.idx preprocess/english_bea19_with_syntax_transformer/test.src-tgt.src.bin preprocess/english_bea19_with_syntax_transformer/test.src-tgt.src.idx preprocess/english_clang8_with_syntax_bart/ preprocess/english_clang8_with_syntax_bart/bin/ preprocess/english_clang8_with_syntax_bart/bin/dict.label.txt preprocess/english_clang8_with_syntax_bart/bin/dict.src.txt preprocess/english_clang8_with_syntax_bart/bin/dict.tgt.txt preprocess/english_clang8_with_syntax_bart/bin/preprocess.log preprocess/english_clang8_with_syntax_bart/bin/train.conll.src-tgt.src.bin preprocess/english_clang8_with_syntax_bart/bin/train.conll.src-tgt.src.idx preprocess/english_clang8_with_syntax_bart/bin/train.dpd.src-tgt.src.bin preprocess/english_clang8_with_syntax_bart/bin/train.dpd.src-tgt.src.idx preprocess/english_clang8_with_syntax_bart/bin/train.probs.src-tgt.src.bin preprocess/english_clang8_with_syntax_bart/bin/train.probs.src-tgt.src.idx preprocess/english_clang8_with_syntax_bart/bin/train.src-tgt.src.bin preprocess/english_clang8_with_syntax_bart/bin/train.src-tgt.src.idx preprocess/english_clang8_with_syntax_bart/bin/train.src-tgt.tgt.bin preprocess/english_clang8_with_syntax_bart/bin/train.src-tgt.tgt.idx preprocess/english_clang8_with_syntax_bart/bin/valid.conll.src-tgt.src.bin preprocess/english_clang8_with_syntax_bart/bin/valid.conll.src-tgt.src.idx preprocess/english_clang8_with_syntax_bart/bin/valid.dpd.src-tgt.src.bin 
preprocess/english_clang8_with_syntax_bart/bin/valid.dpd.src-tgt.src.idx preprocess/english_clang8_with_syntax_bart/bin/valid.probs.src-tgt.src.bin preprocess/english_clang8_with_syntax_bart/bin/valid.probs.src-tgt.src.idx preprocess/english_clang8_with_syntax_bart/bin/valid.src-tgt.src.bin preprocess/english_clang8_with_syntax_bart/bin/valid.src-tgt.src.idx preprocess/english_clang8_with_syntax_bart/bin/valid.src-tgt.tgt.bin preprocess/english_clang8_with_syntax_bart/bin/valid.src-tgt.tgt.idx preprocess/english_clang8_with_syntax_transformer/ preprocess/english_clang8_with_syntax_transformer/bin/ preprocess/english_clang8_with_syntax_transformer/bin/dict.label.txt preprocess/english_clang8_with_syntax_transformer/bin/dict.src.txt preprocess/english_clang8_with_syntax_transformer/bin/dict.tgt.txt preprocess/english_clang8_with_syntax_transformer/bin/preprocess.log preprocess/english_clang8_with_syntax_transformer/bin/train.conll.src-tgt.src.bin preprocess/english_clang8_with_syntax_transformer/bin/train.conll.src-tgt.src.idx preprocess/english_clang8_with_syntax_transformer/bin/train.dpd.src-tgt.src.bin preprocess/english_clang8_with_syntax_transformer/bin/train.dpd.src-tgt.src.idx preprocess/english_clang8_with_syntax_transformer/bin/train.probs.src-tgt.src.bin preprocess/english_clang8_with_syntax_transformer/bin/train.probs.src-tgt.src.idx preprocess/english_clang8_with_syntax_transformer/bin/train.src-tgt.src.bin preprocess/english_clang8_with_syntax_transformer/bin/train.src-tgt.src.idx preprocess/english_clang8_with_syntax_transformer/bin/train.src-tgt.tgt.bin preprocess/english_clang8_with_syntax_transformer/bin/train.src-tgt.tgt.idx preprocess/english_clang8_with_syntax_transformer/bin/valid.conll.src-tgt.src.bin preprocess/english_clang8_with_syntax_transformer/bin/valid.conll.src-tgt.src.idx preprocess/english_clang8_with_syntax_transformer/bin/valid.dpd.src-tgt.src.bin preprocess/english_clang8_with_syntax_transformer/bin/valid.dpd.src-tgt.src.idx 
preprocess/english_clang8_with_syntax_transformer/bin/valid.probs.src-tgt.src.bin preprocess/english_clang8_with_syntax_transformer/bin/valid.probs.src-tgt.src.idx preprocess/english_clang8_with_syntax_transformer/bin/valid.src-tgt.src.bin preprocess/english_clang8_with_syntax_transformer/bin/valid.src-tgt.src.idx preprocess/english_clang8_with_syntax_transformer/bin/valid.src-tgt.tgt.bin preprocess/english_clang8_with_syntax_transformer/bin/valid.src-tgt.tgt.idx preprocess/english_conll14_with_syntax_bart/ preprocess/english_conll14_with_syntax_bart/bin/ preprocess/english_conll14_with_syntax_bart/bin/dict.label.txt preprocess/english_conll14_with_syntax_bart/bin/dict.src.txt preprocess/english_conll14_with_syntax_bart/bin/preprocess.log preprocess/english_conll14_with_syntax_bart/bin/test.conll.src-tgt.src.bin preprocess/english_conll14_with_syntax_bart/bin/test.conll.src-tgt.src.idx preprocess/english_conll14_with_syntax_bart/bin/test.dpd.src-tgt.src.bin preprocess/english_conll14_with_syntax_bart/bin/test.dpd.src-tgt.src.idx preprocess/english_conll14_with_syntax_bart/bin/test.probs.src-tgt.src.bin preprocess/english_conll14_with_syntax_bart/bin/test.probs.src-tgt.src.idx preprocess/english_conll14_with_syntax_bart/bin/test.src-tgt.src.bin preprocess/english_conll14_with_syntax_bart/bin/test.src-tgt.src.idx preprocess/english_conll14_with_syntax_transformer/ preprocess/english_conll14_with_syntax_transformer/dict.label.txt preprocess/english_conll14_with_syntax_transformer/dict.src.txt preprocess/english_conll14_with_syntax_transformer/preprocess.log preprocess/english_conll14_with_syntax_transformer/test.conll.src-tgt.src.bin preprocess/english_conll14_with_syntax_transformer/test.conll.src-tgt.src.idx preprocess/english_conll14_with_syntax_transformer/test.dpd.src-tgt.src.bin preprocess/english_conll14_with_syntax_transformer/test.dpd.src-tgt.src.idx preprocess/english_conll14_with_syntax_transformer/test.probs.src-tgt.src.bin 
preprocess/english_conll14_with_syntax_transformer/test.probs.src-tgt.src.idx preprocess/english_conll14_with_syntax_transformer/test.src-tgt.src.bin preprocess/english_conll14_with_syntax_transformer/test.src-tgt.src.idx preprocess/english_error_coded_with_syntax_bart/ preprocess/english_error_coded_with_syntax_bart/bin/ preprocess/english_error_coded_with_syntax_bart/bin/dict.label.txt preprocess/english_error_coded_with_syntax_bart/bin/dict.src.txt preprocess/english_error_coded_with_syntax_bart/bin/dict.tgt.txt preprocess/english_error_coded_with_syntax_bart/bin/preprocess.log preprocess/english_error_coded_with_syntax_bart/bin/train.conll.src-tgt.src.bin preprocess/english_error_coded_with_syntax_bart/bin/train.conll.src-tgt.src.idx preprocess/english_error_coded_with_syntax_bart/bin/train.dpd.src-tgt.src.bin preprocess/english_error_coded_with_syntax_bart/bin/train.dpd.src-tgt.src.idx preprocess/english_error_coded_with_syntax_bart/bin/train.probs.src-tgt.src.bin preprocess/english_error_coded_with_syntax_bart/bin/train.probs.src-tgt.src.idx preprocess/english_error_coded_with_syntax_bart/bin/train.src-tgt.src.bin preprocess/english_error_coded_with_syntax_bart/bin/train.src-tgt.src.idx preprocess/english_error_coded_with_syntax_bart/bin/train.src-tgt.tgt.bin preprocess/english_error_coded_with_syntax_bart/bin/train.src-tgt.tgt.idx preprocess/english_error_coded_with_syntax_bart/bin/valid.conll.src-tgt.src.bin preprocess/english_error_coded_with_syntax_bart/bin/valid.conll.src-tgt.src.idx preprocess/english_error_coded_with_syntax_bart/bin/valid.dpd.src-tgt.src.bin preprocess/english_error_coded_with_syntax_bart/bin/valid.dpd.src-tgt.src.idx preprocess/english_error_coded_with_syntax_bart/bin/valid.probs.src-tgt.src.bin preprocess/english_error_coded_with_syntax_bart/bin/valid.probs.src-tgt.src.idx preprocess/english_error_coded_with_syntax_bart/bin/valid.src-tgt.src.bin preprocess/english_error_coded_with_syntax_bart/bin/valid.src-tgt.src.idx 
preprocess/english_error_coded_with_syntax_bart/bin/valid.src-tgt.tgt.bin preprocess/english_error_coded_with_syntax_bart/bin/valid.src-tgt.tgt.idx preprocess/english_error_coded_with_syntax_transformer/ preprocess/english_error_coded_with_syntax_transformer/bin/ preprocess/english_error_coded_with_syntax_transformer/bin/dict.label.txt preprocess/english_error_coded_with_syntax_transformer/bin/dict.src.txt preprocess/english_error_coded_with_syntax_transformer/bin/dict.tgt.txt preprocess/english_error_coded_with_syntax_transformer/bin/preprocess.log preprocess/english_error_coded_with_syntax_transformer/bin/train.conll.src-tgt.src.bin preprocess/english_error_coded_with_syntax_transformer/bin/train.conll.src-tgt.src.idx preprocess/english_error_coded_with_syntax_transformer/bin/train.dpd.src-tgt.src.bin preprocess/english_error_coded_with_syntax_transformer/bin/train.dpd.src-tgt.src.idx preprocess/english_error_coded_with_syntax_transformer/bin/train.probs.src-tgt.src.bin preprocess/english_error_coded_with_syntax_transformer/bin/train.probs.src-tgt.src.idx preprocess/english_error_coded_with_syntax_transformer/bin/train.src-tgt.src.bin preprocess/english_error_coded_with_syntax_transformer/bin/train.src-tgt.src.idx preprocess/english_error_coded_with_syntax_transformer/bin/train.src-tgt.tgt.bin preprocess/english_error_coded_with_syntax_transformer/bin/train.src-tgt.tgt.idx preprocess/english_error_coded_with_syntax_transformer/bin/valid.conll.src-tgt.src.bin preprocess/english_error_coded_with_syntax_transformer/bin/valid.conll.src-tgt.src.idx preprocess/english_error_coded_with_syntax_transformer/bin/valid.dpd.src-tgt.src.bin preprocess/english_error_coded_with_syntax_transformer/bin/valid.dpd.src-tgt.src.idx preprocess/english_error_coded_with_syntax_transformer/bin/valid.probs.src-tgt.src.bin preprocess/english_error_coded_with_syntax_transformer/bin/valid.probs.src-tgt.src.idx preprocess/english_error_coded_with_syntax_transformer/bin/valid.src-tgt.src.bin 
preprocess/english_error_coded_with_syntax_transformer/bin/valid.src-tgt.src.idx preprocess/english_error_coded_with_syntax_transformer/bin/valid.src-tgt.tgt.bin preprocess/english_error_coded_with_syntax_transformer/bin/valid.src-tgt.tgt.idx preprocess/english_wi_locness_with_syntax_bart/ preprocess/english_wi_locness_with_syntax_bart/bin/ preprocess/english_wi_locness_with_syntax_bart/bin/dict.label.txt preprocess/english_wi_locness_with_syntax_bart/bin/dict.src.txt preprocess/english_wi_locness_with_syntax_bart/bin/dict.tgt.txt preprocess/english_wi_locness_with_syntax_bart/bin/preprocess.log preprocess/english_wi_locness_with_syntax_bart/bin/train.conll.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_bart/bin/train.conll.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_bart/bin/train.dpd.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_bart/bin/train.dpd.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_bart/bin/train.probs.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_bart/bin/train.probs.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_bart/bin/train.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_bart/bin/train.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_bart/bin/train.src-tgt.tgt.bin preprocess/english_wi_locness_with_syntax_bart/bin/train.src-tgt.tgt.idx preprocess/english_wi_locness_with_syntax_bart/bin/valid.conll.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_bart/bin/valid.conll.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_bart/bin/valid.dpd.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_bart/bin/valid.dpd.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_bart/bin/valid.probs.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_bart/bin/valid.probs.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_bart/bin/valid.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_bart/bin/valid.src-tgt.src.idx 
preprocess/english_wi_locness_with_syntax_bart/bin/valid.src-tgt.tgt.bin preprocess/english_wi_locness_with_syntax_bart/bin/valid.src-tgt.tgt.idx preprocess/english_wi_locness_with_syntax_transformer/ preprocess/english_wi_locness_with_syntax_transformer/bin/ preprocess/english_wi_locness_with_syntax_transformer/bin/dict.label.txt preprocess/english_wi_locness_with_syntax_transformer/bin/dict.src.txt preprocess/english_wi_locness_with_syntax_transformer/bin/dict.tgt.txt preprocess/english_wi_locness_with_syntax_transformer/bin/preprocess.log preprocess/english_wi_locness_with_syntax_transformer/bin/train.conll.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_transformer/bin/train.conll.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_transformer/bin/train.dpd.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_transformer/bin/train.dpd.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_transformer/bin/train.probs.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_transformer/bin/train.probs.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_transformer/bin/train.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_transformer/bin/train.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_transformer/bin/train.src-tgt.tgt.bin preprocess/english_wi_locness_with_syntax_transformer/bin/train.src-tgt.tgt.idx preprocess/english_wi_locness_with_syntax_transformer/bin/valid.conll.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_transformer/bin/valid.conll.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_transformer/bin/valid.dpd.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_transformer/bin/valid.dpd.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_transformer/bin/valid.probs.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_transformer/bin/valid.probs.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_transformer/bin/valid.src-tgt.src.bin 
preprocess/english_wi_locness_with_syntax_transformer/bin/valid.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_transformer/bin/valid.src-tgt.tgt.bin preprocess/english_wi_locness_with_syntax_transformer/bin/valid.src-tgt.tgt.idx

Then I ran the bash script:

root@309e7fc0781e:/mnt/ssd_mnt/pyj/SynGEC/data# cd /mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/ root@309e7fc0781e:/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp# ls generate_syngec_bart.sh preprocess_syngec_bart.sh generate_syngec_transformer.sh preprocess_syngec_transformer.sh nohup.out train_syngec_bart.sh pipeline_gopar.sh train_syngec_transformer.sh root@309e7fc0781e:/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp# ./pipeline_gopar.sh Traceback (most recent call last): File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../src/src_gopar/parse.py", line 17, in input_sentences = load(sys.argv[1]) File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../src/src_gopar/parse.py", line 9, in load with open(filename, 'r') as f: FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/tgt.txt' Loading resources... Processing parallel files... Traceback (most recent call last): File "/opt/conda/bin/errant_parallel", line 8, in sys.exit(main()) File "/opt/conda/lib/python3.10/site-packages/errant/commands/parallel_to_m2.py", line 16, in main in_files = [stack.enter_context(open(i)) for i in [args.orig]+args.cor] File "/opt/conda/lib/python3.10/site-packages/errant/commands/parallel_to_m2.py", line 16, in in_files = [stack.enter_context(open(i)) for i in [args.orig]+args.cor] FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/tgt.txt' Traceback (most recent call last): File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../src/src_gopar/convert_gec_data_to_parsing_data_english.py", line 153, in with open(conll_file, "r") as f1: FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/tgt.txt.conll_predict' /opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py:181: FutureWarning: The module torch.distributed.launch is deprecated and will be removed in future. Use torchrun. Note that --use-env is set by default in torchrun. 
If your script expects --local-rank argument to be set, please change it to read from os.environ['LOCAL_RANK'] instead. See https://pytorch.org/docs/stable/distributed.html#launch-utility for further instructions

warnings.warn( WARNING:torch.distributed.run:


Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.


/opt/conda/bin/python: No module named supar.cmds.biaffine_dep /opt/conda/bin/python: No module named supar.cmds.biaffine_dep /opt/conda/bin/python: No module named supar.cmds.biaffine_dep /opt/conda/bin/python: No module named supar.cmds.biaffine_dep /opt/conda/bin/python: No module named supar.cmds.biaffine_dep /opt/conda/bin/python: No module named supar.cmds.biaffine_dep /opt/conda/bin/python: No module named supar.cmds.biaffine_dep /opt/conda/bin/python: No module named supar.cmds.biaffine_dep ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 34107) of binary: /opt/conda/bin/python Traceback (most recent call last): File "/opt/conda/lib/python3.10/runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "/opt/conda/lib/python3.10/runpy.py", line 86, in _run_code exec(code, run_globals) File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py", line 196, in main() File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py", line 192, in main launch(args) File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py", line 177, in launch run(args) File "/opt/conda/lib/python3.10/site-packages/torch/distributed/run.py", line 785, in run elastic_launch( File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in call return launch_agent(self._config, self._entrypoint, list(args)) File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent raise ChildFailedError( torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

supar.cmds.biaffine_dep FAILED

Failures: [1]: time : 2023-08-25_12:30:09 host : 309e7fc0781e rank : 1 (local_rank: 1) exitcode : 1 (pid: 34108) error_file: <N/A> traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html [2]: time : 2023-08-25_12:30:09 host : 309e7fc0781e rank : 2 (local_rank: 2) exitcode : 1 (pid: 34109) error_file: <N/A> traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html [3]: time : 2023-08-25_12:30:09 host : 309e7fc0781e rank : 3 (local_rank: 3) exitcode : 1 (pid: 34110) error_file: <N/A> traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html [4]: time : 2023-08-25_12:30:09 host : 309e7fc0781e rank : 4 (local_rank: 4) exitcode : 1 (pid: 34111) error_file: <N/A> traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html [5]: time : 2023-08-25_12:30:09 host : 309e7fc0781e rank : 5 (local_rank: 5) exitcode : 1 (pid: 34112) error_file: <N/A> traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html [6]: time : 2023-08-25_12:30:09 host : 309e7fc0781e rank : 6 (local_rank: 6) exitcode : 1 (pid: 34113) error_file: <N/A> traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html [7]: time : 2023-08-25_12:30:09 host : 309e7fc0781e rank : 7 (local_rank: 7) exitcode : 1 (pid: 34114) error_file: <N/A> traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Root Cause (first observed failure): [0]: time : 2023-08-25_12:30:09 host : 309e7fc0781e rank : 0 (local_rank: 0) exitcode : 1 (pid: 34107) error_file: <N/A> traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

nohup: appending output to 'nohup.out' nohup: appending output to 'nohup.out' nohup: appending output to 'nohup.out' nohup: appending output to 'nohup.out' nohup: appending output to 'nohup.out' nohup: appending output to 'nohup.out'

What's the problem, and how can I fix it?

HillZhang1999 commented 1 year ago

If you don't want to re-train the parser, you can skip the data preprocessing step entirely: the preprocessed files can be downloaded directly from our Google Drive. If you do want to re-train the parser, you must download the required datasets from their official websites and put them into the corresponding directories (src.txt and tgt.txt, one sentence per line).
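For the re-training route, it may help to sanity-check the text files before running the pipeline. The sketch below assumes that src.txt and tgt.txt are parallel, so that line i of one corresponds to line i of the other. This is an assumption based on the "one sentence per line" layout, and the function names are hypothetical.

```python
from pathlib import Path


def count_lines(path):
    """Count the lines in a UTF-8 text file."""
    with open(path, "r", encoding="utf-8") as f:
        return sum(1 for _ in f)


def check_parallel(dataset_dir):
    """Return (n_src, n_tgt); raise ValueError if the two files differ in length."""
    dataset_dir = Path(dataset_dir)
    n_src = count_lines(dataset_dir / "src.txt")
    n_tgt = count_lines(dataset_dir / "tgt.txt")
    if n_src != n_tgt:
        raise ValueError(
            f"{dataset_dir}: src.txt has {n_src} lines, tgt.txt has {n_tgt}"
        )
    return n_src, n_tgt
```

For example, check_parallel("../../data/wi_locness_train") would either return the two line counts or raise an error pointing at the mismatched directory.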

hwlys commented 8 months ago

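Hello, did you manage to solve this problem? After extracting the archive I also don't see any src.txt or tgt.txt. The extracted files look like this. What should I do?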

Screenshot 2024-01-17 203001
HillZhang1999 commented 8 months ago

Due to copyright issues, we do not provide the text files, only the preprocessed binary files, which can be used directly for training.
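Since only the binary files are shipped, one quick way to confirm that an extracted directory is complete is to check that every .bin file has its .idx counterpart, as in the archive listing earlier in this thread. This is a rough sketch under that assumption, and unpaired_bin_files is a hypothetical helper, not something the repository provides.

```python
from pathlib import Path


def unpaired_bin_files(bin_dir):
    """Return the names of .bin files in bin_dir that lack a matching .idx file."""
    bin_dir = Path(bin_dir)
    return sorted(
        p.name
        for p in bin_dir.glob("*.bin")
        # with_suffix swaps only the final ".bin", so
        # "train.src-tgt.src.bin" maps to "train.src-tgt.src.idx".
        if not p.with_suffix(".idx").exists()
    )
```

Calling it on a directory such as preprocess/english_clang8_with_syntax_transformer/bin should return an empty list when the extraction finished cleanly.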