huggingface / trl

Train transformer language models with reinforcement learning.
http://hf.co/docs/trl
Apache License 2.0

SFTTrainer device error even though it doesn't take device as an argument #1767

Open zyzhang1130 opened 1 week ago

zyzhang1130 commented 1 week ago

As the title suggests, I was trying to fine-tune a gpt2 model with what should be a pretty standard pipeline (https://github.com/zyzhang1130/agentscope/blob/conversation_with_agent_with_finetuned_model/examples/conversation_with_agent_with_finetuned_model/huggingface_model.py), and got the following error:

Exception has occurred: TypeError
device() received an invalid combination of arguments - got (NoneType), but expected one of:
 * (torch.device device)
      didn't match because some of the arguments have invalid types: (NoneType)
 * (str type, int index)
  File "/home/zy1130/agentscope/examples/conversation_with_agent_with_finetuned_model/huggingface_model.py", line 553, in _fine_tune_training
    trainer.train()
  File "/home/zy1130/agentscope/examples/conversation_with_agent_with_finetuned_model/huggingface_model.py", line 110, in __init__
    self.model = self._fine_tune_training(
  File "/home/zy1130/agentscope/src/agentscope/models/__init__.py", line 117, in load_model_by_config_name
    return _get_model_wrapper(model_type=model_type)(**kwargs)
  File "/home/zy1130/agentscope/src/agentscope/agents/agent.py", line 195, in __init__
    self.model = load_model_by_config_name(model_config_name)
  File "/home/zy1130/agentscope/src/agentscope/agents/dialog_agent.py", line 45, in __init__
    super().__init__(
  File "/home/zy1130/agentscope/examples/conversation_with_agent_with_finetuned_model/FinetuneDialogAgent.py", line 46, in __init__
    super().__init__(
  File "/home/zy1130/agentscope/src/agentscope/agents/agent.py", line 82, in __call__
    instance = super().__call__(*args, **kwargs)
  File "/home/zy1130/agentscope/examples/conversation_with_agent_with_finetuned_model/conversation_with_agent_with_finetuned_model.py", line 78, in main
    dialog_agent = FinetuneDialogAgent(
  File "/home/zy1130/agentscope/examples/conversation_with_agent_with_finetuned_model/conversation_with_agent_with_finetuned_model.py", line 123, in <module>
    main()
TypeError: device() received an invalid combination of arguments - got (NoneType), but expected one of:
 * (torch.device device)
      didn't match because some of the arguments have invalid types: (NoneType)
 * (str type, int index)

This is strange, because SFTTrainer does not take device as an argument, so there seems to be nothing I can do about it on my side.
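
For reference, the message itself is what `torch.device` raises when it is called with `None`, so it may be that some device value resolves to `None` somewhere inside the `trainer.train()` call rather than in anything passed to SFTTrainer directly. The snippet below is only a minimal sketch that reproduces the message in isolation. The helper name `device_or_error` is made up for illustration and is not part of trl or my script, and the snippet does not show where the `None` actually comes from.

```python
import torch


def device_or_error(value):
    """Try to build a torch.device from value.

    Returns the device on success, or the TypeError text on failure.
    Hypothetical helper, used here only to show the error in isolation.
    """
    try:
        return torch.device(value)
    except TypeError as exc:
        return f"TypeError: {exc}"


# A valid device string is accepted.
print(device_or_error("cpu"))

# None is rejected; the text mentions NoneType, like the traceback above.
print(device_or_error(None))
```

If this is the cause, printing the device-related attributes of the model and the training arguments just before `trainer.train()` should show which one is `None`.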