Closed: oonisim closed this issue 1 year ago.
That's because the dataset you are using does not have a length, so the Trainer sets the number of epochs to a very high number to make sure it does the number of steps you are asking for.
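For reference, that "very high number" is Python's `sys.maxsize`. A minimal sketch of the behavior being described (the function `resolve_num_epochs` is hypothetical and not the Trainer's actual internals):

```python
import sys

def resolve_num_epochs(dataset, num_train_epochs, max_steps):
    # Hypothetical illustration of the behavior described above: a
    # streaming (iterable) dataset has no __len__, so the epoch count
    # cannot be derived from the dataset size. A sentinel value is used
    # instead, and max_steps alone decides when training stops.
    if not hasattr(dataset, "__len__"):
        if max_steps <= 0:
            raise ValueError("max_steps must be set for a streaming dataset")
        return sys.maxsize  # 9,223,372,036,854,775,807 on 64-bit platforms
    return num_train_epochs

streaming = iter(range(100))   # no __len__, like an IterableDataset
map_style = list(range(100))   # has __len__, like a map-style dataset

print(resolve_num_epochs(streaming, 3, max_steps=1000))  # sys.maxsize
print(resolve_num_epochs(map_style, 3, max_steps=-1))    # 3
```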
@sgugger , thanks for the explanation.
May I suggest updating the documentation to describe the Trainer's behavior and requirements for streaming datasets, e.g. that max_steps must be set and what value to use? Otherwise users may keep raising questions about max_steps and epochs (there have been at least three in the forum), and you may have to spend your time answering each one.
As far as I can tell, neither the Datasets streaming docs nor the Trainer docs currently have this information (please correct me if I missed it).
We welcome any PR making the documentation better :-)
System Info
Running on SageMaker Studio, g4dn.2xlarge instance.
Background
Fine-tuning the BLOOM model for summarization.
`input_ids` is set to the tokenized text and `labels` to the tokenized summary.

Problem

When using a streaming Hugging Face dataset, the Trainer API shows a huge Num Epochs = 9,223,372,036,854,775,807. The `TrainingArguments` used:

When not using streaming (`DATASET_STREAMING=False` as in the code), Num Epochs is displayed as expected.

Who can help?
trainer: @sgugger
Information

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
Run the code below
Expected behavior
Get the intended 3 epochs, or an explanation of the Num Epochs value (9223372036854775807).
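If the total number of training examples is known up front, the intended 3 epochs can still be approximated with a streaming dataset by computing `max_steps` yourself. A sketch with illustrative numbers (all values below are assumptions, not taken from the original code):

```python
# Illustrative arithmetic only: derive max_steps so a streaming run
# covers roughly the intended number of epochs.
num_examples = 10_000             # assumed total size of the source data
per_device_batch_size = 8         # assumed
gradient_accumulation_steps = 1   # assumed
num_epochs = 3                    # the intended epoch count

steps_per_epoch = num_examples // (per_device_batch_size * gradient_accumulation_steps)
max_steps = num_epochs * steps_per_epoch
print(max_steps)  # 3750 -> pass as TrainingArguments(max_steps=...)
```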
When not using streaming (`DATASET_STREAMING=False` as in the code), Num Epochs is displayed as expected.

Related
TrainingArguments class - max_steps formula when using streaming dataset
Streaming Dataset of Sequence Length 2048