neulab / knn-transformers

PyTorch + HuggingFace code for RetoMaton: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022), including an implementation of kNN-LM and kNN-MT
MIT License

saving-a-datastore-for-knn-mt section in README is missing a proper dstore_size parameter #8

Closed Maxwell-Lyu closed 2 years ago

Maxwell-Lyu commented 2 years ago

The command in the saving-a-datastore-for-knn-mt section of the README is missing the --dstore_size parameter.

Line to add:

--dstore_size 26565876 \

Current version:

MODEL=t5-small

python -u run_translation.py  \
  --model_name_or_path ${MODEL} \
  --dataset_name wmt16 --dataset_config_name ro-en \
  --per_device_train_batch_size 4 --per_device_eval_batch_size=4 \
  --output_dir checkpoints-translation/${MODEL} \
  --source_lang en --target_lang ro \
  --dstore_dir checkpoints-translation/${MODEL} \
  --save_knnlm_dstore --do_eval --eval_subset train \
  --source_prefix "translate English to Romanian: "

Corrected version (maybe):

MODEL=t5-small

python -u run_translation.py  \
  --model_name_or_path ${MODEL} \
  --dataset_name wmt16 --dataset_config_name ro-en \
  --per_device_train_batch_size 4 --per_device_eval_batch_size=4 \
  --output_dir checkpoints-translation/${MODEL} \
  --source_lang en --target_lang ro \
  --dstore_dir checkpoints-translation/${MODEL} \
  --save_knnlm_dstore --do_eval --eval_subset train \
  --dstore_size 26565876 \
  --source_prefix "translate English to Romanian: "

urialon commented 2 years ago

Thanks @Maxwell-Lyu ! Fixed! Let me know if you have any problems or questions.

Maxwell-Lyu commented 2 years ago

Thanks! Closing this issue now. Glad I can help~

vhientran commented 1 year ago

Sorry to disturb you. I ran kNN-MT with the default hyperparameters, but got the error: OverflowError: out of range integral type conversion attempted. Perhaps the default max_length hyperparameter for decoding is too small, or there is another cause. How can I fix this error? Thank you!

urialon commented 1 year ago

Hi @vhientran , Thank you for your interest in our work!

This is a closed issue. Could you please open a new one and provide the details of exactly what you ran, along with the full error and stack trace?

Thanks, Uri

vhientran commented 1 year ago

Hi @Maxwell-Lyu and @urialon, sorry for disturbing you. I am just wondering how to choose a suitable value for the dstore_size hyperparameter.

urialon commented 1 year ago

Hi @vhientran ,

dstore_size is the total number of tokens in your training set. You can find this number by starting an evaluation of any model on the train split, using the command line here:

https://github.com/neulab/knn-transformers#step-1-evaluating-the-base-language-model

but with --eval_subset train instead of validation.

The number of tokens will be printed by this line:

https://github.com/neulab/knn-transformers/blob/master/run_clm.py#L538

Once it is printed, you can stop the run and pass the number as --dstore_size to a new run.

Best, Uri
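
For intuition only, here is a minimal pure-Python sketch of what "total number of tokens" means for a tokenized training set. It is not the repository's code: the function name count_datastore_tokens and the toy label ids are hypothetical, and it assumes masked label positions use -100 (the usual Hugging Face / PyTorch ignore index). The repository's own counting rule may differ in detail, so the number printed by the script remains the one to use.

```python
from typing import Iterable, Sequence

# Assumption: label positions that should not count are marked with -100,
# the default ignore index used by Hugging Face data collators.
IGNORE_INDEX = -100


def count_datastore_tokens(label_chunks: Iterable[Sequence[int]],
                           ignore_index: int = IGNORE_INDEX) -> int:
    """Count label tokens that are not masked out, summed over all chunks."""
    return sum(
        1
        for chunk in label_chunks
        for token_id in chunk
        if token_id != ignore_index
    )


# Toy stand-in for tokenized training labels (hypothetical token ids).
toy_labels = [
    [37, 512, 9, 1, -100, -100],
    [8, 1],
    [101, 7, 7, 3, 1, -100],
]

toy_dstore_size = count_datastore_tokens(toy_labels)
# Print the count in the form of the command-line flag discussed above.
print(f"--dstore_size {toy_dstore_size}")
```

On a real dataset the input would be the label sequences of the tokenized train split, and the result should be checked against the value the script logs.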

vhientran commented 1 year ago

Hi @urialon, thank you so much for your detailed explanation. It helps me a lot. I will follow your guidance to find dstore_size for my model. Many thanks!