Closed · Leonard907 closed this issue 1 year ago
Hi @Leonard907 , Thank you for your interest in our work!
Yes, to reproduce these experiments, follow this section of the README.
Specifically, you should take the main command line:
python src/run.py \
src/configs/model/bart_base_sled.json \
src/configs/training/base_training_args.json \
src/configs/data/gov_report.json \
--output_dir output_train_bart_base_local/ \
--learning_rate 1e-5 \
--model_name_or_path facebook/bart-base \
--max_source_length 1024 \
--eval_max_source_length 1024 --do_eval=True \
--eval_steps 1000 --save_steps 1000 \
--per_device_eval_batch_size 1 --per_device_train_batch_size 2 \
--extra_metrics bertscore
and add --test_unlimiformer --eval_max_source_length 999999 --model_name_or_path abertsch/bart-base-govreport.
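For convenience, the base command with the extra flags appended would look roughly like this (a sketch assembled from the pieces above; note that --model_name_or_path and --eval_max_source_length each appear twice, and with argparse-style parsing the later value should take effect):

```shell
python src/run.py \
    src/configs/model/bart_base_sled.json \
    src/configs/training/base_training_args.json \
    src/configs/data/gov_report.json \
    --output_dir output_train_bart_base_local/ \
    --learning_rate 1e-5 \
    --model_name_or_path facebook/bart-base \
    --max_source_length 1024 \
    --eval_max_source_length 1024 --do_eval=True \
    --eval_steps 1000 --save_steps 1000 \
    --per_device_eval_batch_size 1 --per_device_train_batch_size 2 \
    --extra_metrics bertscore \
    --test_unlimiformer --eval_max_source_length 999999 \
    --model_name_or_path abertsch/bart-base-govreport
```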
Let us know if you have any issues or questions!
Best, Uri
Thank you very much!
Hello, I took the main command line you listed, but got a "srcIndex < srcSelectDimSize" error. When I delete "--eval_max_source_length 999999" the issue goes away. What should I do to run this command as given?
Is it necessary to set --use_datastore=True?
Hi @Leonard907 ,
It works for me; the only thing that was missing was adding --tokenizer_name facebook/bart-base, but we will add the tokenizer to the model so it won't be needed in the future.
Setting --use_datastore is useful with extremely long inputs, but it should work either way.
Can you try to (1) git pull the latest version, and (2) run the exact following command line (test only, no training):
python src/run.py \
src/configs/model/bart_base_sled.json \
src/configs/training/base_training_args.json \
src/configs/data/gov_report.json \
--output_dir output_train_bart_base_local/ \
--learning_rate 1e-5 \
--model_name_or_path facebook/bart-base \
--max_source_length 1024 \
--eval_max_source_length 999999 --do_eval=True --do_train=False \
--eval_steps 1000 --save_steps 1000 \
--per_device_eval_batch_size 1 --per_device_train_batch_size 2 \
--extra_metrics bertscore --test_unlimiformer \
--model_name_or_path abertsch/bart-base-govreport \
--tokenizer_name facebook/bart-base
Following up on the above: the tokenizer has now been added to the model! It should now run without explicitly setting --tokenizer_name.
Closing due to inactivity, feel free to re-open or create a new issue if you have any questions or problems.
Hi, I want to reproduce the results of the "+test Unlimiformer" setting from the paper. Based on my understanding, this setup does not require training, so is it possible to load an available checkpoint (like this) and convert it to Unlimiformer like the example demonstrated in inference-example.py? Are there any settings that I omitted here? Thanks!