Open kontramind opened 9 months ago
Hi,
Could you please provide your training hyperparameters or whole python code?
Hi @unnir ,
Sure, here is the code. We run training on the California dataset. Keep in mind that we also introduce a workaround for BelenGarciaPascual's question; Belen and I are collaborating on the same task, and we are planning to work on a proper PR.
In the code below the total number of epochs is 8*9: eight outer passes, each cycling through all of the dataset's columns as the conditional column.
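The epoch bookkeeping in the loop can be sketched on its own; `n_cols` stands in for `len(data.columns)` and `steps` for `len(data)//batch_size` (the helper names are mine, just for illustration):

```python
def cumulative_epochs(epoch, idx, n_cols):
    # Value passed as `epochs=` on outer pass `epoch`, column index `idx`:
    # every earlier (pass, column) pair contributed one full epoch, plus the
    # epoch about to be trained now.
    return epoch * n_cols + idx + 1

def previous_checkpoint_step(epoch, idx, n_cols, steps):
    # Global optimizer step of the checkpoint left over from the previous
    # iteration, i.e. the one the loop removes after resuming from it.
    return (epoch * n_cols + idx) * steps

# With 9 columns and 8 outer passes, the last call trains up to epoch 8*9 = 72:
print(cumulative_epochs(7, 8, 9))              # 72
print(previous_checkpoint_step(0, 1, 9, 100))  # 100
```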
```python
from pathlib import Path
from shutil import rmtree

from be_great import GReaT

# `base` is the HuggingFace name of the base LLM, `llm` is a short tag used
# in the output paths, and `data` is the California dataset as a DataFrame.
batch_size = 32
steps = len(data) // batch_size  # optimizer steps per epoch
epochs = range(8)
columns = data.columns

for epoch in epochs:
    for idx, column in enumerate(columns):
        print(f'{epoch=} -> {column=}')
        great = GReaT(base,              # name of the large language model used (see HuggingFace for more options)
                      batch_size=batch_size,
                      epochs=epoch * len(columns) + idx + 1,  # cumulative number of epochs trained so far
                      save_steps=steps,      # save model weights every `steps` steps
                      logging_steps=steps,   # log the loss and learning rate every `steps` steps
                      experiment_dir=f"aleks_{llm}_trainer",  # directory where all intermediate checkpoints are saved
                      )
        if epoch == 0 and idx == 0:
            trainer = great.fit(data, conditional_col=column)
        else:
            trainer = great.fit(data, conditional_col=column, resume_from_checkpoint=True)
        # drop the checkpoint left over from the previous iteration to save
        # disk space (guarded, since there is no checkpoint on the first pass)
        previous = Path(f"aleks_{llm}_trainer") / f"checkpoint-{(epoch * len(columns) + idx) * steps}"
        if previous.exists():
            rmtree(previous)

great.save(f"aleks_california_{llm}")

for path in Path(f"aleks_{llm}_trainer").iterdir():
    if path.is_dir():
        print(f'{path=}')
```
My suggestion, again, is to train the model longer, but I will try to reproduce the error and debug it.
Hi,
I'm trying to use, for example, 'sshleifer/distilbart-cnn-6-6', and it fails with the following message:
```
An error has occurred: Breaking the generation loop! To address this issue, consider fine-tuning the GReaT model for an longer period. This can be achieved by increasing the number of epochs. Alternatively, you might consider increasing the max_length parameter within the sample function. For example: model.sample(n_samples=10, max_length=2000) If the problem persists despite these adjustments, feel free to raise an issue on our GitHub page at: https://github.com/kathrinse/be_great/issues
```
Aleksandar