openai / weak-to-strong

Unexpected keyword argument 'bf16' #11

Closed agokrani closed 7 months ago

agokrani commented 8 months ago

Hi,

I am trying to reproduce the setup on a T4 Google Colab and am getting the following error:

```
Traceback (most recent call last):
  File "/content/drive/MyDrive/git/weak-to-strong-fixed/train_weak_to_strong.py", line 356, in <module>
    fire.Fire(main)
  File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/content/drive/MyDrive/git/weak-to-strong-fixed/train_weak_to_strong.py", line 272, in main
    weak_test_results, weak_ds = train_model(
  File "/content/drive/MyDrive/git/weak-to-strong-fixed/train_weak_to_strong.py", line 250, in train_model
    return train_and_save_model(
  File "/content/drive/MyDrive/git/weak-to-strong-fixed/weak_to_strong/train.py", line 229, in train_and_save_model
    model = TransformerWithHead.from_pretrained(
  File "/content/drive/MyDrive/git/weak-to-strong-fixed/weak_to_strong/model.py", line 34, in from_pretrained
    return cls(name, **kwargs)
  File "/content/drive/MyDrive/git/weak-to-strong-fixed/weak_to_strong/model.py", line 22, in __init__
    lm = AutoModelForCausalLM.from_pretrained(name, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 566, in from_pretrained
    return model_class.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 3450, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
TypeError: GPT2LMHeadModel.__init__() got an unexpected keyword argument 'bf16'
```

Any idea why this might be the case?

BoilerToad commented 8 months ago

I am getting the same error. I am not certain if this is because I am using an Apple M3 (with Python 3.11). I did have to work around torch.cuda... but then ran into the 'bf16' issue.
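
For reference, a minimal sketch of the kind of device workaround meant here, assuming the hardcoded torch.cuda calls are the obstacle (get_device is a hypothetical helper, not part of the repo):

```python
import torch

# Hypothetical helper: fall back gracefully when CUDA is unavailable,
# e.g. on Apple Silicon (MPS) or plain CPU.
def get_device() -> str:
    if torch.cuda.is_available():
        return "cuda"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"

device = get_device()
```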

srivhash commented 8 months ago

bf16 is a data type that is only supported on high-end GPUs like the A100; similarly, fp32 may not be supported. Simple fix: empty out the kwargs parameter inside the model definition by commenting out the bf16 and fp32 entries, and it should work. bf16 would be required for the Qwen models, but since the given code does not run a Qwen model, the code will still work.
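
For example, a minimal sketch of that edit, assuming the flags live in a custom_kwargs dict in the model config (the exact keys in your checkout may differ):

```python
# In the model config (e.g. in train_weak_to_strong.py), comment out the
# dtype flags that GPT2LMHeadModel.__init__ does not accept:
custom_kwargs = {
    # "bf16": True,   # only supported on GPUs with bfloat16 (e.g. A100)
    # "fp32": False,
}
```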

srivhash commented 8 months ago

A T4 GPU won't be able to run the bf16 data type.

agokrani commented 8 months ago

Hi @srivhash,

The solution worked for me.

Thanks!

srivhash commented 8 months ago

Nice !!

WuTheFWasThat commented 8 months ago

A PR is welcome; it would be good to have custom_kwargs avoid passing invalid flags :)

agokrani commented 8 months ago

Just created a PR adding a bf16 flag that defaults to False.

zky001 commented 8 months ago

(screenshot attached)

zky001 commented 8 months ago

(screenshot attached) Seems it's not supported.

srivhash commented 8 months ago

Wouldn't that be a suboptimal fix, removing parameters from kwargs?

srivhash commented 8 months ago

Instead, I created a custom function that checks whether the parameters within kwargs are valid, catching the TypeError and recursively removing the offending parameter from model initialization. I think this is better practice, and hence created a PR.

However, I do understand the above solution is easier; up to you @WuTheFWasThat.
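
A minimal sketch of the idea (not the exact PR; it assumes the TypeError message always names the offending keyword):

```python
import re
from transformers import AutoModelForCausalLM

def from_pretrained_dropping_bad_kwargs(name, **kwargs):
    """Load a model, recursively stripping kwargs its __init__ rejects."""
    try:
        return AutoModelForCausalLM.from_pretrained(name, **kwargs)
    except TypeError as e:
        match = re.search(r"unexpected keyword argument '(\w+)'", str(e))
        # Re-raise if the error is unrelated to an invalid keyword.
        if match is None or match.group(1) not in kwargs:
            raise
        kwargs.pop(match.group(1))
        return from_pretrained_dropping_bad_kwargs(name, **kwargs)
```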

zhxieml commented 8 months ago

It seems like the bf16 and fp32 arguments are for TrainingArguments, not from_pretrained. Replacing the original custom_kwargs items with `"torch_dtype": torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float32` works in my case.
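
For anyone following along, that change looks roughly like this (a sketch; the surrounding config structure is assumed):

```python
import torch

# Replace the bf16/fp32 flags with a dtype that from_pretrained understands:
custom_kwargs = {
    "torch_dtype": torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float32,
}
```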

srivhash commented 8 months ago

@fffffarmer, can you create a pull request with your changes? It seems more promising than the other solutions here.

WuTheFWasThat commented 7 months ago

thanks @fffffarmer, merged!