DeepLearnXMU / RevisedKey-knn-mt

Code for "Bridging the Domain Gaps in Context Representations for $k$-Nearest Neighbor Neural Machine Translation" (ACL 2023)
MIT License

Failed to run code #1

Closed sxysxy closed 1 year ago

sxysxy commented 1 year ago

Description

I first cloned your repo, downloaded the pretrained model and dataset, and placed them in the following folder structure:

(screenshot of the folder structure)

Then I followed the instructions in readme.md to build the base datastore, but it reported an error:

(RevisedKeyKNNMT) ➜  revisedkey-scripts git:(main) ✗ bash build_datastore.sh base koran
Traceback (most recent call last):
  File "/home/hfcloud/Program/RevisedKey-knn-mt/revisedkey-scripts/save_datastore.py", line 279, in <module>
    cli_main()
  File "/home/hfcloud/Program/RevisedKey-knn-mt/revisedkey-scripts/save_datastore.py", line 268, in cli_main
    parser = options.get_save_datastore_parser()
AttributeError: module 'fairseq.options' has no attribute 'get_save_datastore_parser'
2023-07-05 22:49:27 | INFO | fairseq_cli.validate | Namespace(dstore_mmap='..//datastores/koran_base', dstore_size=524400, dimension=1024, dstore_fp16=True, seed=1, ncentroids=4096, code_size=64, probe=32, faiss_index='..//datastores/koran_base/knn_index', num_keys_to_add_at_a_time=500000, starting_point=0, load_multiple_files=False, multiple_key_files=None, multiple_val_files=None, multiple_files_size=None, concat_file_path=None)
2023-07-05 22:49:27 | INFO | fairseq_cli.validate | load dstore fp16 524400 1024
Traceback (most recent call last):
  File "/home/hfcloud/Program/RevisedKey-knn-mt/revisedkey-scripts/train_datastore_gpu.py", line 89, in <module>
    keys = np.memmap(args.dstore_mmap + '/keys.npy', dtype=np.float16, mode='r',
  File "/home/HfCloud/.miniconda3/envs/RevisedKeyKNNMT/lib/python3.9/site-packages/numpy/core/memmap.py", line 225, in __new__
    f_ctx = open(os_fspath(filename), ('r' if mode == 'c' else mode)+'b')
FileNotFoundError: [Errno 2] No such file or directory: '..//datastores/koran_base/keys.npy'

Then I modified revisedkey-scripts/save_datastore.py to add sys.path.insert(2, ".."):

(screenshot of the modified code)
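For readers who cannot see the screenshot, the modification amounts to a few lines near the top of save_datastore.py (a minimal sketch; the index 2 follows the issue text, and the intent is only that the repo root comes before the installed site-packages entries):

```python
import sys

# Put the repository root on the module search path ahead of the
# pip-installed fairseq, so the repo's bundled fairseq (which defines
# options.get_save_datastore_parser) is imported instead.
sys.path.insert(2, "..")
```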

Then I re-ran bash build_datastore.sh base koran, but it still reported an error:

(RevisedKeyKNNMT) ➜  revisedkey-scripts git:(main) ✗ bash build_datastore.sh base koran
usage: save_datastore.py [-h] [--no-progress-bar] [--log-interval LOG_INTERVAL] [--log-format LOG_FORMAT] [--tensorboard-logdir TENSORBOARD_LOGDIR] [--seed SEED] [--cpu] [--tpu] [--bf16]
                         [--memory-efficient-bf16] [--fp16] [--memory-efficient-fp16] [--fp16-no-flatten-grads] [--fp16-init-scale FP16_INIT_SCALE] [--fp16-scale-window FP16_SCALE_WINDOW]
                         [--fp16-scale-tolerance FP16_SCALE_TOLERANCE] [--min-loss-scale MIN_LOSS_SCALE] [--threshold-loss-scale THRESHOLD_LOSS_SCALE] [--user-dir USER_DIR]
                         [--empty-cache-freq EMPTY_CACHE_FREQ] [--all-gather-list-size ALL_GATHER_LIST_SIZE] [--model-parallel-size MODEL_PARALLEL_SIZE]
                         [--checkpoint-suffix CHECKPOINT_SUFFIX] [--checkpoint-shard-count CHECKPOINT_SHARD_COUNT] [--quantization-config-path QUANTIZATION_CONFIG_PATH] [--profile]
                         [--criterion {masked_lm,label_smoothed_cross_entropy,wav2vec,sentence_ranking,legacy_masked_lm_loss,adaptive_loss,label_smoothed_cross_entropy_with_alignment,composite_loss,nat_loss,sentence_prediction,ctc,cross_entropy,vocab_parallel_cross_entropy}]
                         [--tokenizer {space,nltk,moses}] [--bpe {byte_bpe,gpt2,characters,bytes,subword_nmt,hf_byte_bpe,fastbpe,sentencepiece,bert}]
                         [--optimizer {adagrad,adamax,lamb,sgd,adam,adadelta,adafactor,nag}]
                         [--lr-scheduler {fixed,triangular,polynomial_decay,cosine,inverse_sqrt,reduce_lr_on_plateau,tri_stage}] [--scoring {wer,chrf,sacrebleu,bleu}] [--task TASK]
                         [--num-workers NUM_WORKERS] [--skip-invalid-size-inputs-valid-test] [--max-tokens MAX_TOKENS] [--batch-size BATCH_SIZE]
                         [--required-batch-size-multiple REQUIRED_BATCH_SIZE_MULTIPLE] [--required-seq-len-multiple REQUIRED_SEQ_LEN_MULTIPLE] [--dataset-impl DATASET_IMPL]
                         [--data-buffer-size DATA_BUFFER_SIZE] [--train-subset TRAIN_SUBSET] [--valid-subset VALID_SUBSET] [--validate-interval VALIDATE_INTERVAL]
                         [--validate-interval-updates VALIDATE_INTERVAL_UPDATES] [--validate-after-updates VALIDATE_AFTER_UPDATES] [--fixed-validation-seed FIXED_VALIDATION_SEED]
                         [--disable-validation] [--max-tokens-valid MAX_TOKENS_VALID] [--batch-size-valid BATCH_SIZE_VALID] [--curriculum CURRICULUM] [--gen-subset GEN_SUBSET]
                         [--num-shards NUM_SHARDS] [--shard-id SHARD_ID] [--distributed-world-size DISTRIBUTED_WORLD_SIZE] [--distributed-rank DISTRIBUTED_RANK]
                         [--distributed-backend DISTRIBUTED_BACKEND] [--distributed-init-method DISTRIBUTED_INIT_METHOD] [--distributed-port DISTRIBUTED_PORT] [--device-id DEVICE_ID]
                         [--distributed-no-spawn] [--ddp-backend {c10d,no_c10d}] [--bucket-cap-mb BUCKET_CAP_MB] [--fix-batches-to-gpus] [--find-unused-parameters] [--fast-stat-sync]
                         [--broadcast-buffers] [--distributed-wrapper {DDP,SlowMo}] [--slowmo-momentum SLOWMO_MOMENTUM] [--slowmo-algorithm SLOWMO_ALGORITHM]
                         [--localsgd-frequency LOCALSGD_FREQUENCY] [--nprocs-per-node NPROCS_PER_NODE] [--pipeline-model-parallel] [--pipeline-balance PIPELINE_BALANCE]
                         [--pipeline-devices PIPELINE_DEVICES] [--pipeline-chunks PIPELINE_CHUNKS] [--pipeline-encoder-balance PIPELINE_ENCODER_BALANCE]
                         [--pipeline-encoder-devices PIPELINE_ENCODER_DEVICES] [--pipeline-decoder-balance PIPELINE_DECODER_BALANCE] [--pipeline-decoder-devices PIPELINE_DECODER_DEVICES]
                         [--pipeline-checkpoint {always,never,except_last}] [--zero-sharding {none,os}] [--dstore-fp16] [--dstore-size N] [--dstore-mmap DSTORE_MMAP]
                         [--decoder-embed-dim N] [--multidomain-shuffle] [--use-knn-store] [--k K] [--knn-coefficient KNN_COEFFICIENT] [--faiss-metric-type FAISS_METRIC_TYPE]
                         [--knn-sim-func KNN_SIM_FUNC] [--knn-temperature KNN_TEMPERATURE] [--use-gpu-to-search] [--dstore-filename DSTORE_FILENAME] [--move-dstore-to-mem]
                         [--indexfile INDEXFILE] [--probe PROBE] [--no-load-keys] [--only-use-max-idx] [--save-plain-text] [--plain-text-file PLAIN_TEXT_FILE] [--retrieve-adapter]
                         [--use-retrieve-adapter USE_RETRIEVE_ADAPTER] [--adapter-ffn-scale ADAPTER_FFN_SCALE] [--lambda-type LAMBDA_TYPE] [--lambda-value LAMBDA_VALUE]
                         [--min-lambda-value MIN_LAMBDA_VALUE] [--max-lambda-value MAX_LAMBDA_VALUE] [--knn-step-bound KNN_STEP_BOUND] [--lambda-tend LAMBDA_TEND]
                         [--lambda-curve LAMBDA_CURVE] [--check-knn-result] [--path PATH] [--remove-bpe [REMOVE_BPE]] [--quiet] [--model-overrides MODEL_OVERRIDES]
                         [--results-path RESULTS_PATH]
save_datastore.py: error: argument --dataset-impl: invalid typing.Optional[fairseq.dataclass.utils.Choices] value: 'mmap'
2023-07-05 22:50:31 | INFO | fairseq_cli.validate | Namespace(dstore_mmap='..//datastores/koran_base', dstore_size=524400, dimension=1024, dstore_fp16=True, seed=1, ncentroids=4096, code_size=64, probe=32, faiss_index='..//datastores/koran_base/knn_index', num_keys_to_add_at_a_time=500000, starting_point=0, load_multiple_files=False, multiple_key_files=None, multiple_val_files=None, multiple_files_size=None, concat_file_path=None)
2023-07-05 22:50:31 | INFO | fairseq_cli.validate | load dstore fp16 524400 1024
Traceback (most recent call last):
  File "/home/hfcloud/Program/RevisedKey-knn-mt/revisedkey-scripts/train_datastore_gpu.py", line 89, in <module>
    keys = np.memmap(args.dstore_mmap + '/keys.npy', dtype=np.float16, mode='r',
  File "/home/HfCloud/.miniconda3/envs/RevisedKeyKNNMT/lib/python3.9/site-packages/numpy/core/memmap.py", line 225, in __new__
    f_ctx = open(os_fspath(filename), ('r' if mode == 'c' else mode)+'b')
FileNotFoundError: [Errno 2] No such file or directory: '..//datastores/koran_base/keys.npy'

It seems the argument parser does not accept mmap for --dataset-impl?

I'm unable to continue. What should I do to run your code? Thanks a lot for your help!
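For context, argparse emits this kind of "invalid ... value" error whenever the type callable attached to an option raises on the given string. A minimal sketch reproducing the shape of the failure (choices_type and the allowed set are hypothetical, standing in for the converter fairseq generates from its dataclass Choices type):

```python
import argparse

def choices_type(value):
    # Hypothetical stand-in for fairseq's dataclass-generated converter:
    # it rejects anything not registered when the parser was constructed.
    allowed = {"raw", "lazy"}  # 'mmap' missing, as in the mismatched install
    if value not in allowed:
        raise ValueError(value)
    return value

parser = argparse.ArgumentParser(prog="save_datastore.py")
parser.add_argument("--dataset-impl", type=choices_type)

# parser.parse_args(["--dataset-impl", "mmap"]) now exits with:
# save_datastore.py: error: argument --dataset-impl: invalid choices_type value: 'mmap'
```

So the parser itself is fine; the set of accepted values simply does not contain 'mmap' under the broken environment, which points at a fairseq version/build mismatch rather than the script.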

czwlines commented 1 year ago

After testing, this error appears when the environment uses pytorch==1.12.0+cu116 — not only does the "--dataset-impl" argument fail to be recognized, other arguments hit the same problem. I also found that re-running pip install --editable . produces compilation errors, so I suspect the issue is related to the environment configuration.

I then tested under pytorch==1.7.1+cu110 and the datastore could be built successfully. Also, for faiss-gpu, I recommend installing it with conda install faiss-gpu cudatoolkit=11.0 -c pytorch.
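Putting that advice together, a fresh environment might be created roughly as follows (a sketch, not an exact recipe; the env name knnmt and the PyTorch wheel index URL are assumptions, and package versions mirror the listing below):

```shell
# Fresh environment matching the working configuration
conda create -n knnmt python=3.8 -y
conda activate knnmt

# PyTorch 1.7.1 built against CUDA 11.0
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 \
    -f https://download.pytorch.org/whl/torch_stable.html

# faiss-gpu from the pytorch channel, as suggested above
conda install faiss-gpu cudatoolkit=11.0 -c pytorch -y

# Re-install the repo's bundled fairseq in editable mode (run from repo root)
pip install --editable .
```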

Here is my environment configuration:

# packages in environment at /home/lines/anaconda3/envs/knnmt:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
antlr4-python3-runtime    4.9.3                     <pip>
blas                      1.0                         mkl  
ca-certificates           2023.05.30           h06a4308_0  
cffi                      1.15.1                    <pip>
colorama                  0.4.6                     <pip>
cudatoolkit               11.0.221             h6bb024c_0  
Cython                    0.29.36                   <pip>
dataclasses               0.6                       <pip>
fairseq                   0.10.1                    <pip>
faiss-gpu                 1.7.1           py3.8_h293177f_1_cuda11.0    pytorch
hydra-core                1.3.2                     <pip>
importlib-resources       5.12.0                    <pip>
intel-openmp              2021.4.0          h06a4308_3561  
libedit                   3.1.20210910         h7f8727e_0  
libfaiss                  1.7.1           h7f34bec_1_cuda11.0    pytorch
libffi                    3.2.1             hf484d3e_1007  
libgcc-ng                 9.1.0                hdf63c60_0  
libstdcxx-ng              9.1.0                hdf63c60_0  
lxml                      4.9.3                     <pip>
mkl                       2021.4.0           h06a4308_640  
mkl-service               2.4.0            py38h7f8727e_0  
mkl_fft                   1.3.1            py38hd3c417c_0  
mkl_random                1.2.2            py38h51133e4_0  
ncurses                   6.3                  h7f8727e_2  
numpy                     1.22.3           py38he7a7128_0  
numpy                     1.24.4                    <pip>
numpy-base                1.22.3           py38hf524024_0  
omegaconf                 2.3.0                     <pip>
openssl                   1.1.1u               h7f8727e_0  
packaging                 23.1                      <pip>
Pillow                    10.0.0                    <pip>
pip                       23.1.2           py38h06a4308_0  
portalocker               2.7.0                     <pip>
pycparser                 2.21                      <pip>
python                    3.8.0                h0371630_2  
PyYAML                    6.0                       <pip>
readline                  7.0                  h7b6447c_5  
regex                     2023.6.3                  <pip>
sacrebleu                 2.3.1                     <pip>
setuptools                67.8.0           py38h06a4308_0  
six                       1.16.0             pyhd3eb1b0_1  
sqlite                    3.33.0               h62c20be_0  
tabulate                  0.9.0                     <pip>
tk                        8.6.12               h1ccaba5_0  
torch                     1.7.1+cu110               <pip>
torchvision               0.8.2+cu110               <pip>
tqdm                      4.65.0                    <pip>
typing_extensions         4.7.1                     <pip>
wheel                     0.38.4           py38h06a4308_0  
xz                        5.2.5                h7f8727e_1  
zipp                      3.15.0                    <pip>
zlib                      1.2.12               h7f8727e_2
czwlines commented 1 year ago

Also, per https://github.com/facebookresearch/fairseq/issues/4032, downgrade Python from 3.9 to 3.7. After testing, I could also build the datastore successfully under python3.7 + pytorch==1.11.0 + cu113.