stanford-crfm / BioMedLM


Running generation batch misses file #21

Open PelzKo opened 1 year ago

PelzKo commented 1 year ago

The run_generation_batch.py in finetune/textgen/gpt2 imports from `train_control`. That is not a published package but another file from https://github.com/XiangLi1999/PrefixTuning/tree/cleaned/gpt2, which you need to copy into the same folder.
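As a minimal sketch of an alternative workaround, you can point Python at a local checkout of the cleaned branch before the import instead of copying the file (the relative path below is just an example from my setup, adjust it to where your clone lives):

```python
# Minimal sketch: make `import train_control` resolve against a local
# checkout of https://github.com/XiangLi1999/PrefixTuning (cleaned branch).
# The relative path is an assumption; adjust it to your own clone location.
import sys

sys.path.insert(0, "../PrefixTuning/gpt2")  # folder containing train_control.py

import train_control  # noqa: E402  (import after the sys.path tweak)
```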

PelzKo commented 1 year ago

Also in the same file, I have not been able to figure out what `from utils import calculate_rouge, chunks, parse_numeric_n_bool_cl_kwargs, use_task_specific_params` refers to, so I had to comment it out.
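Those names look like the helpers from the utils.py shipped with the older Hugging Face transformers seq2seq examples, so instead of commenting the line out, stand-ins along these lines might be enough (a sketch only; the signatures are guessed from how the script uses them, and calculate_rouge needs the rouge-score package installed):

```python
# Sketch of stand-ins for the missing `utils` helpers; signatures guessed,
# modeled on the seq2seq example utilities from older transformers releases.

def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i : i + n]

def use_task_specific_params(model, task):
    """Apply generation params stored under model.config.task_specific_params."""
    task_params = getattr(model.config, "task_specific_params", None)
    if task_params is not None and task in task_params:
        model.config.update(task_params[task])

def parse_numeric_n_bool_cl_kwargs(args):
    """Parse leftover `--key value` CLI pairs into ints, floats, or bools."""
    parsed = {}
    assert len(args) % 2 == 0, "expected --key value pairs"
    for key, value in zip(args[::2], args[1::2]):
        key = key.lstrip("-")
        if value.lower() in ("true", "false"):
            parsed[key] = value.lower() == "true"
        else:
            try:
                parsed[key] = int(value)
            except ValueError:
                parsed[key] = float(value)
    return parsed

def calculate_rouge(output_lns, reference_lns, rouge_keys=("rouge1", "rouge2", "rougeL")):
    """Average ROUGE F1 between predictions and references (needs rouge-score)."""
    from rouge_score import rouge_scorer, scoring

    scorer = rouge_scorer.RougeScorer(list(rouge_keys), use_stemmer=True)
    aggregator = scoring.BootstrapAggregator()
    for pred, ref in zip(output_lns, reference_lns):
        aggregator.add_scores(scorer.score(ref, pred))
    return {k: round(v.mid.fmeasure * 100, 4) for k, v in aggregator.aggregate().items()}
```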

J38 commented 1 year ago

Could you give me some more details about what you're trying to do? I am planning to push an updated version of this code with clear examples of training and generating responses for the MeQSum task, which is a good demo task for prompt --> response and can quickly be adapted to other tasks.

PelzKo commented 1 year ago

I am trying to extract a diagnosis of an image from the title, referencing paragraph, and image caption of a scientific paper (so in a broader sense this is also a summarization problem). For that I have been using your finetuning and evaluation scripts.

Training:

```
torchrun --nproc_per_node=1 --nnodes=1 --node_rank=0 finetune_for_summarization.py \
  --output_dir out \
  --model_name_or_path stanford-crfm/BioMedLM \
  --tokenizer_name stanford-crfm/pubmed_gpt_tokenizer \
  --per_device_train_batch_size 1 \
  --per_device_eval_batch_size 1 \
  --save_strategy no \
  --do_eval \
  --train_data_file ~/data/t2i/data/llm/without_nan/train.source \
  --eval_data_file ~/data/t2i/data/llm/without_nan/val.source \
  --save_total_limit 2 \
  --overwrite_output_dir \
  --gradient_accumulation_steps 1 \
  --learning_rate 1.6e-4 \
  --warmup_ratio 0.5 \
  --weight_decay 0.0 \
  --seed 11 \
  --evaluation_strategy steps \
  --eval_steps 200 \
  --bf16 \
  --num_train_epochs 10 \
  --logging_steps 100 \
  --logging_first_step
```

Evaluation:

```
CUDA_VISIBLE_DEVICES=0 python -u run_generation_batch.py \
  --fp16 \
  --max_source_length -1 \
  --length 400 \
  --model_name_or_path=out \
  --num_return_sequences 5 \
  --stop_token [SEP] \
  --tokenizer_name=stanford-crfm/pubmed_gpt_tokenizer \
  --task_mode=meqsum \
  --control_mode=no \
  --tuning_mode finetune \
  --gen_dir generated_results \
  --batch_size 9 \
  --temperature 1.0 \
  --no_repeat_ngram_size 6 \
  --length_penalty -0.5 \
  --wandb_entity=None \
  --wandb_project=None \
  --wandb_run_name=None
```
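One detail worth noting about `--stop_token [SEP]`: generation scripts in this style typically just truncate the decoded text at the first occurrence of the stop token, roughly like this (a minimal sketch, not the repo's exact code):

```python
# Minimal sketch of the usual stop-token handling; not the repo's exact code.
def trim_at_stop_token(text: str, stop_token: str = "[SEP]") -> str:
    """Keep everything before the first stop token, if one is present."""
    idx = text.find(stop_token)
    return text[:idx] if idx != -1 else text

print(trim_at_stop_token("diagnosis text [SEP] trailing tokens"))  # "diagnosis text "
```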

PelzKo commented 1 year ago

For it to run, I had to copy the script I referred to earlier and remove the import line I mentioned above.