Closed. kunling-cxk closed this issue 2 weeks ago.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Hey! Sorry for the delay; should this be addressed to the TRL or Axolotl library instead? There is no SFTDataCollator in transformers.
System Info
transformers version: 4.43.3

Who can help?
```python
model = prepare_peft_model(model, model_args.peft_mode)
loss_func = TargetLMLoss(ignore_index=tokenizer.pad_token_id)
```
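TargetLMLoss itself is not shown in the issue. A plausible sketch of what such a loss might look like, assuming it uses the 'target_mask' column to restrict the causal-LM loss to the response tokens (class body and call signature are my guesses, not the issue author's actual code):

```python
import torch
from torch import nn

class TargetLMLoss:
    """Hypothetical sketch: compute the LM loss only where target_mask == 1."""

    def __init__(self, ignore_index):
        self.ignore_index = ignore_index
        self.loss_fn = nn.CrossEntropyLoss(ignore_index=ignore_index)

    def __call__(self, model, inputs, return_outputs=False):
        input_ids = inputs["input_ids"]
        attention_mask = inputs["attention_mask"]
        target_mask = inputs["target_mask"]  # the key reported as missing in this issue
        outputs = model(input_ids=input_ids, attention_mask=attention_mask)
        logits = outputs.logits
        # Positions outside the target (prompt tokens, padding) are set to ignore_index
        # so they do not contribute to the loss.
        labels = torch.where(target_mask.bool(), input_ids,
                             torch.full_like(input_ids, self.ignore_index))
        # Standard next-token shift for causal LM training.
        shift_logits = logits[..., :-1, :].contiguous()
        shift_labels = labels[..., 1:].contiguous()
        loss = self.loss_fn(shift_logits.view(-1, shift_logits.size(-1)),
                            shift_labels.view(-1))
        return (loss, outputs) if return_outputs else loss
```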
Information

Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
```python
class UnifiedSFTDataset(Dataset):
    """Unified SFT data-processing dataset."""

    def __init__(self, file, tokenizer, max_seq_length, template):
        self.tokenizer = tokenizer
        self.template_name = template.template_name        # template name
        self.system_format = template.system_format        # system format of the template
        self.user_format = template.user_format            # user format of the template
        self.assistant_format = template.assistant_format  # assistant format of the template
        self.system = template.system                      # the system prompt may define some behavior
```
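The snippet above is cut off before `__getitem__`. For context, a `__getitem__` that would yield the three keys mentioned under "Expected behavior" ('input_ids', 'attention_mask', 'target_mask') might look roughly like this; the `data_list`, `prompt`, and `response` names are assumptions for illustration, not the issue author's actual schema:

```python
def __getitem__(self, index):
    # Hypothetical sketch: build one training example with the three keys the issue mentions.
    example = self.data_list[index]  # assumes the raw records were loaded in __init__
    prompt_ids = self.tokenizer.encode(
        self.user_format.format(content=example["prompt"]), add_special_tokens=False)
    response_ids = self.tokenizer.encode(
        self.assistant_format.format(content=example["response"]), add_special_tokens=False)
    input_ids = (prompt_ids + response_ids)[: self.max_seq_length]
    # 0 = prompt token (no loss), 1 = response token (loss is computed here).
    target_mask = ([0] * len(prompt_ids) + [1] * len(response_ids))[: self.max_seq_length]
    attention_mask = [1] * len(input_ids)
    return {"input_ids": input_ids, "attention_mask": attention_mask, "target_mask": target_mask}
```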
```python
class Trainer(transformers.Trainer):
    """Main modification: support a custom loss computation by passing in compute_loss."""

    def __init__(
        self,
        model: Union[PreTrainedModel, nn.Module] = None,
        args: TrainingArguments = None,
        data_collator: Optional[DataCollator] = None,
        train_dataset: Optional[Dataset] = None,
        eval_dataset: Optional[Dataset] = None,
        tokenizer: Optional[PreTrainedTokenizerBase] = None,
        model_init: Callable[[], PreTrainedModel] = None,
        compute_metrics: Optional[Callable[[EvalPrediction], Dict]] = None,
        callbacks: Optional[List[TrainerCallback]] = None,
        optimizers: Tuple[torch.optim.Optimizer, torch.optim.lr_scheduler.LambdaLR] = (None, None),
        preprocess_logits_for_metrics: Callable[[torch.Tensor, torch.Tensor], torch.Tensor] = None,
        compute_loss=None,
        remove_unused_columns: Optional[bool] = False,
    ):
        super(Trainer, self).__init__(
            model=model,
            args=args,
            data_collator=data_collator,
            train_dataset=train_dataset,
            eval_dataset=eval_dataset,
            tokenizer=tokenizer,
            model_init=model_init,
            compute_metrics=compute_metrics,
            callbacks=callbacks,
            optimizers=optimizers,
            preprocess_logits_for_metrics=preprocess_logits_for_metrics,
        )
        self.loss_func = compute_loss
        print("trainer:", train_dataset[0].keys())
```
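A likely source of the missing key: the custom `__init__` above accepts `remove_unused_columns` but never uses or forwards it. In transformers, column pruning is controlled by `TrainingArguments.remove_unused_columns` (default `True`), which drops any dataset column that is not an argument of the model's `forward` (such as `target_mask`) before the data collator ever sees it. A minimal sketch of the workaround, using standard TrainingArguments (the `output_dir` value is a placeholder):

```python
from transformers import TrainingArguments

# Keep custom columns like 'target_mask' by disabling column pruning in the
# TrainingArguments themselves; the Trainer reads this flag from `args`,
# not from its own __init__ signature.
training_args = TrainingArguments(
    output_dir="output",          # placeholder path
    remove_unused_columns=False,  # prevents 'target_mask' from being dropped
)
```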
```python
class LoRATrainer(Trainer):
    """Modify the checkpoint-saving logic so that only the LoRA weights are saved."""

    def _save(self, output_dir: Optional[str] = None, state_dict=None):
        # If we are executing this function, we are the process zero, so we don't check for that.
```
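The body of the `_save` override is truncated in the issue. For illustration only, a hypothetical helper that mirrors what a LoRA-only save presumably does, assuming the wrapped model is a peft `PeftModel`:

```python
import os

def save_lora_only(trainer, output_dir=None):
    # Hypothetical helper, not the issue author's code: write only the LoRA
    # adapter instead of the full base-model checkpoint.
    output_dir = output_dir if output_dir is not None else trainer.args.output_dir
    os.makedirs(output_dir, exist_ok=True)
    # For a peft-wrapped model this writes adapter_config.json plus the adapter weights.
    trainer.model.save_pretrained(output_dir)
    if trainer.tokenizer is not None:
        trainer.tokenizer.save_pretrained(output_dir)
```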
Expected behavior
I'm sure each item of the dataset has three keys ('target_mask', 'input_ids', 'attention_mask'), but when SFTDataCollator is called, the features it receives only have two keys; 'target_mask' is missing.
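A quick way to confirm where the key disappears (assuming `trainer` is the instance built from the classes above):

```python
# The raw dataset item should still show all three keys.
print(trainer.train_dataset[0].keys())

# The first collated batch shows what actually reaches/leaves SFTDataCollator;
# with remove_unused_columns left at its default (True), 'target_mask' is pruned
# before the collator runs, so it is absent here (or the collator errors on it).
batch = next(iter(trainer.get_train_dataloader()))
print(batch.keys())
```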