Open Aritra02091998 opened 1 day ago
I'm facing the error `TypeError: _batch_encode_plus() got an unexpected keyword argument 'tokenize_newline_separately'` while fine-tuning PaliGemma using the given notebook file.
The error occurs while running training with:
```python
from transformers import TrainingArguments, Trainer

args = TrainingArguments(
    num_train_epochs=2,
    remove_unused_columns=False,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    warmup_steps=2,
    learning_rate=2e-5,
    weight_decay=1e-6,
    adam_beta2=0.999,
    logging_steps=100,
    optim="adamw_hf",
    save_strategy="steps",
    save_steps=1000,
    save_total_limit=1,
    output_dir="paligemma_vqav2",
    bf16=True,
    report_to=["tensorboard"],
    dataloader_pin_memory=False,
)

trainer = Trainer(
    model=model,
    train_dataset=train_ds,
    data_collator=collate_fn,
    args=args,
)
trainer.train()
```
Using:

```
tokenizers    0.20.1  py39ha92566c_1
transformers  4.46.3  pyhd8ed1ab_0
```
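For context, the keyword seems to come from the notebook's `collate_fn`, which passes `tokenize_newline_separately` through the processor to the tokenizer; newer `transformers` releases appear to no longer accept that argument. A minimal stand-alone reproduction of the failure mode (the stub below only mimics a processor whose call signature dropped the keyword; it is not the real `transformers` API):

```python
# Stub mimicking a processor whose __call__ no longer accepts
# tokenize_newline_separately (assumed removed in newer releases).
class StubProcessor:
    def __call__(self, text, images=None, return_tensors="pt", padding="longest"):
        return {"input_ids": [[0] * len(text)]}

processor = StubProcessor()

# Old notebook-style call: fails once the keyword is gone.
try:
    processor(text=["answer en what is this?"], tokenize_newline_separately=False)
except TypeError as e:
    print(type(e).__name__)  # TypeError

# Dropping the removed keyword makes the call succeed.
batch = processor(text=["answer en what is this?"])
print(len(batch["input_ids"]))  # 1
```

If this diagnosis is right, removing `tokenize_newline_separately` from the `collate_fn`'s processor call should be the fix, but I'd like confirmation on the intended usage for these versions.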