Closed: Rishita32 closed this issue 7 months ago.
Hi, thanks for raising an issue!
This is a question best placed in our forums. We try to reserve the GitHub issues for feature requests and bug reports.
General comments:
`add_eos_token` instructs the tokenizer to append an EOS token at the end of a sequence of tokens, but it does not control the length of generated text; for that, pass `max_new_tokens` to `generate`. You can read the generate docs here and here.
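For illustration, a minimal sketch of capping output length via `max_new_tokens` (the model name matches this issue, but the prompt and the 50-token cap are arbitrary assumptions):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer("Summarize the plot of Hamlet:", return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=50,                    # hard cap on the number of generated tokens
    eos_token_id=tokenizer.eos_token_id,  # also stop early if the model emits EOS
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```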
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed, please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
System Info
Hi, I am fine-tuning the Mistral-7B model. I am getting long runs of automated text from the fine-tuned model, even though I have set `add_eos_token = True` on the tokenizer. Can someone please tell me how to add a word limit to the responses?
This is the code for initializing the model and tokenizer:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base_model = "mistralai/Mistral-7B-v0.1"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=False,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    load_in_4bit=True,
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model.config.use_cache = False  # silence the warnings. Please re-enable for inference!
model.config.pretraining_tp = 1
model.gradient_checkpointing_enable()

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
tokenizer.padding_side = "right"
tokenizer.pad_token = tokenizer.unk_token
tokenizer.add_eos_token = True
tokenizer.max_length = 200
tokenizer.truncation = True
```
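(Editor's side note: `max_length` and `truncation` generally take effect when passed to the tokenizer call, not when assigned as bare attributes as in the last two lines above. A minimal sketch, with made-up input text:)

```python
# Truncation is applied per call; assigning tokenizer.max_length /
# tokenizer.truncation as attributes does not change the call defaults.
# The input text here is made up for illustration.
encoded = tokenizer(
    "An example training sample ...",
    max_length=200,
    truncation=True,
    return_tensors="pt",
)
```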
Who can help?
No response
Information

Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
Run the model and tokenizer initialization code from the System Info section above.
Expected behavior
I am looking for a way to prevent overly long text generation.
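Putting the maintainer's suggestion together with the setup above, a hedged sketch of inference with the fine-tuned model (the prompt is hypothetical, and the 200-token cap mirrors the tokenizer setting above but is otherwise an assumption):

```python
# Continues from the model/tokenizer setup in the System Info section.
model.config.use_cache = True  # re-enable the cache for inference, per the training-code comment

prompt = "Describe the Eiffel Tower in two sentences."  # hypothetical prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=200,                   # hard cap on the response length
    eos_token_id=tokenizer.eos_token_id,  # stop as soon as the model emits EOS
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```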