microsoft / BioGPT

MIT License

[QA-PubMedQA] preprocess.sh : Can't open perl script "/scripts/tokenizer/tokenizer.perl": No such file or directory #92

Closed. ZON-ZONG-MIN closed this issue 1 year ago

ZON-ZONG-MIN commented 1 year ago

I am following BioGPT/examples/QA-PubMedQA/README.md on Google Colab.

When I run "bash preprocess.sh" (the command the README marks "# for BioGPT"), I get these messages:

450 samples in ../../data/PubMedQA/raw/train_set.json has been processed
50 samples in ../../data/PubMedQA/raw/dev_set.json has been processed
500 samples in ../../data/PubMedQA/raw/test_set.json has been processed
Preprocessing train
Can't open perl script "/scripts/tokenizer/tokenizer.perl": No such file or directory
Can't open perl script "/scripts/tokenizer/tokenizer.perl": No such file or directory
preprocess.sh: line 27: /fast: No such file or directory
preprocess.sh: line 28: /fast: No such file or directory
Preprocessing valid
Can't open perl script "/scripts/tokenizer/tokenizer.perl": No such file or directory
Can't open perl script "/scripts/tokenizer/tokenizer.perl": No such file or directory
preprocess.sh: line 27: /fast: No such file or directory
preprocess.sh: line 28: /fast: No such file or directory
Preprocessing test
Can't open perl script "/scripts/tokenizer/tokenizer.perl": No such file or directory
Can't open perl script "/scripts/tokenizer/tokenizer.perl": No such file or directory
preprocess.sh: line 27: /fast: No such file or directory
preprocess.sh: line 28: /fast: No such file or directory
2023-04-08 20:16:21.389129: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-04-08 20:16:22.985061: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2023-04-08 20:16:27 | INFO | fairseq_cli.preprocess | Namespace(no_progress_bar=False, log_interval=100, log_format=None, 
.................

The error is: Can't open perl script "/scripts/tokenizer/tokenizer.perl": No such file or directory. I found /scripts/tokenizer/tokenizer.perl in BioGPT/mosesdecoder, but what should I do? :hear_no_evil:

rajkumar-surana commented 1 year ago

Have you set the path to "mosesdecoder" by running the export command shown on the GitHub home page of this repo?

git clone https://github.com/moses-smt/mosesdecoder.git
export MOSES=${PWD}/mosesdecoder
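
Why the export matters can be shown with a small sketch. It assumes (this is not confirmed in the thread) that preprocess.sh builds the tokenizer path as ${MOSES}/scripts/tokenizer/tokenizer.perl; the helper name tokenizer_path is hypothetical and only for illustration. With MOSES unset, the expansion collapses to exactly the path in the error message.

```shell
#!/usr/bin/env bash
# Sketch only: tokenizer_path is a hypothetical helper, not part of BioGPT.
# Assumption: preprocess.sh expands ${MOSES}/scripts/tokenizer/tokenizer.perl.

tokenizer_path() {
    # An empty or unset MOSES leaves only the part after the variable,
    # so the path starts at the filesystem root.
    echo "${MOSES:-}/scripts/tokenizer/tokenizer.perl"
}

unset MOSES
tokenizer_path   # prints /scripts/tokenizer/tokenizer.perl

export MOSES="${PWD}/mosesdecoder"
tokenizer_path   # prints a path inside the mosesdecoder clone
```

Note that export only affects the current shell session. In Colab, each "!" command typically runs in its own shell, so the variable may need to be set in the same command (or cell) that runs preprocess.sh.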

lir0ni commented 1 year ago

But how should I solve the "/fast: No such file or directory" error? I still get it after fixing the tokenizer.perl error. Thanks!
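
The thread does not record an answer to this, so the following is only a plausible reading of the log. "/fast" looks like the same kind of empty-variable expansion as the tokenizer path: if preprocess.sh calls something like ${FASTBPE}/fast, an unset FASTBPE leaves just "/fast". The variable name FASTBPE and the function check_preprocess_env below are assumptions for illustration. The sketch checks both tools before preprocessing is started.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of the BioGPT repository.
# Assumptions: preprocess.sh uses ${MOSES}/scripts/tokenizer/tokenizer.perl
# and ${FASTBPE}/fast, which would match the two errors in the log above.

check_preprocess_env() {
    local status=0

    # The Moses tokenizer script must exist under $MOSES.
    if [ ! -f "${MOSES:-}/scripts/tokenizer/tokenizer.perl" ]; then
        echo "MOSES is unset or does not point at a mosesdecoder clone: '${MOSES:-}'" >&2
        status=1
    fi

    # The fastBPE binary must exist under $FASTBPE and be executable.
    if [ ! -x "${FASTBPE:-}/fast" ]; then
        echo "FASTBPE is unset or the 'fast' binary is missing: '${FASTBPE:-}'" >&2
        status=1
    fi

    return "$status"
}

# Usage (run from examples/QA-PubMedQA):
#   check_preprocess_env && bash preprocess.sh
```

If the second check fails even with FASTBPE set, the fast binary has probably not been compiled yet. It is built from the fastBPE sources (https://github.com/glample/fastBPE) rather than shipped with the clone.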