ValueError: Must specify max_steps > 0, given: 0

etetteh commented 3 years ago

$ python3 electra_small/run_finetuning.py \
--data-dir $DATA_DIR \
--model-name "ELECTRA-small" \
--hparams '{"model_size": "small", "task_names": ["<task_name>"], "num_trials": 5, "learning_rate": 3e-4, "train_batch_size": 16, "use_tpu": "True", "num_tpu_cores": 8, "tpu_name": "<tpu_name>", "tpu_zone": "europe-west4-a", "gcp_project": "<gcp_name>", "vocab_size": 50000, "num_train_epochs": 10}'

I am getting the following error. Is there something I am missing?

Training for 0 steps
ERROR:tensorflow:Error recorded from training_loop: Must specify max_steps > 0, given: 0
Traceback (most recent call last):
  File "electra_small/run_finetuning.py", line 323, in <module>
    main()
  File "electra_small/run_finetuning.py", line 319, in main
    args.model_name, args.data_dir, **hparams))
  File "electra_small/run_finetuning.py", line 270, in run_finetuning
    model_runner.train()
  File "electra_small/run_finetuning.py", line 183, in train
    input_fn=self._train_input_fn, max_steps=self.train_steps)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3035, in train
    rendezvous.raise_errors()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/tpu/error_handling.py", line 136, in raise_errors
    six.reraise(typ, value, traceback)
  File "/usr/local/lib/python3.5/dist-packages/six.py", line 703, in reraise
    raise value
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3030, in train
    saving_listeners=saving_listeners)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 358, in train
    'Must specify max_steps > 0, given: {}'.format(max_steps))
ValueError: Must specify max_steps > 0, given: 0

etetteh commented 3 years ago

Can anyone help? I am still getting the same error.

:~$ python3 electra_small/run_finetuning.py --data-dir "gs://my_bucket/pretraining_data/finetuning_data/BC5CDR-chem/" --model-name "covidECTRA-small" --hparams '{"model_size": "small", "task_names": ["bc5c"], "num_trials": 5,"learning_rate": 3e-4, "train_batch_size": 16, "use_tpu": "True", "num_tpu_cores": 8, "tpu_name": "my_tpu", "tpu_zone": "europe-west4-a", "gcp_project": "my_p", "vocab_size": 50000, "num_train_epochs": 60, "model_dir":"gs://my_bucket/pretraining_data/models/covidECTRA-Small/", "vocab_file":"gs://my_bucket/pretraining_data/vocab.txt"}'

================================Config: model=ELECTRA-small, trial 1/5==============================
answerable_classifier True
answerable_uses_start_logits True
answerable_weight 0.5
beam_size 20
data_dir gs://my_bucket/pretraining_data/finetuning_data/BC5CDR-chem/
debug False
do_eval True
do_lower_case True
do_train True
doc_stride 128
double_unordered True
embedding_size None
eval_batch_size 32
gcp_project covidectra
init_checkpoint gs://my_bucket/pretraining_data/finetuning_data/BC5CDR-chem/models/ELECTRA-small
iterations_per_loop 1000
joint_prediction True
keep_all_models True
layerwise_lr_decay 0.8
learning_rate 0.0003
log_examples False
max_answer_length 30
max_query_length 64
max_seq_length 128
model_dir gs://my_bucket/pretraining_data/models/ELECTRA-Small/
model_hparam_overrides {}
model_name ELECTRA-small
model_size small
n_best_size 20
n_writes_test 5
num_tpu_cores 8
num_train_epochs 60
num_trials 5
predict_batch_size 32
preprocessed_data_dir gs://my_bucket/pretraining_data/finetuning_data/BC5CDR-chem/models/covidECTRA-small/finetuning_tfrecords/bc5c_tfrecords
qa_eval_file <built-in method format of str object at 0x7fa57ed85030>
qa_na_file <built-in method format of str object at 0x7fa57edf03a0>
qa_na_threshold -2.75
qa_preds_file <built-in method format of str object at 0x7fa57ed850d8>
raw_data_dir <built-in method format of str object at 0x7fa57dba8e00>
results_pkl gs://my_bucket/pretraining_data/finetuning_data/BC5CDR-chem/models/ELECTRA-small/results/bc5c_results.pkl
results_txt gs://my_bucket/pretraining_data/finetuning_data/BC5CDR-chem/models/ELECTRA-small/results/bc5c_results.txt
save_checkpoints_steps 1000000
task_names ['bc5c']
test_predictions <built-in method format of str object at 0x7fa57dba41a0>
tpu_job_name None
tpu_name my_tpu
tpu_zone europe-west4-a
train_batch_size 16
use_tfrecords_if_existing True
use_tpu True
vocab_file gs://my_bucket/pretraining_data/vocab.txt
vocab_size 50000
warmup_proportion 0.1
weight_decay_rate 0.01
write_test_outputs True

Loading dataset bc5c_train
================================================================================
Start training: model=ELECTRA-small, trial 1/5
================================================================================
Training for 0 steps
ERROR:tensorflow:Error recorded from training_loop: Must specify max_steps > 0, given: 0
Traceback (most recent call last):
  File "electra_small/run_finetuning.py", line 323, in <module>
    main()
  File "electra_small/run_finetuning.py", line 319, in main
    args.model_name, args.data_dir, **hparams))
  File "electra_small/run_finetuning.py", line 270, in run_finetuning
    model_runner.train()
  File "electra_small/run_finetuning.py", line 183, in train
    input_fn=self._train_input_fn, max_steps=self.train_steps)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3035, in train
    rendezvous.raise_errors()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/tpu/error_handling.py", line 136, in raise_errors
    six.reraise(typ, value, traceback)
  File "/usr/local/lib/python3.5/dist-packages/six.py", line 703, in reraise
    raise value
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3030, in train
    saving_listeners=saving_listeners)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 358, in train
    'Must specify max_steps > 0, given: {}'.format(max_steps))
ValueError: Must specify max_steps > 0, given: 0

When I check the GCS: gs://my_bucket/pretraining_data/finetuning_data/BC5CDR-chem/models/ELECTRA-small/finetuning_tfrecords/bc5c_tfrecords

I found that the tfrecord file there is empty: its size is 0 B.
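An empty tfrecord file is consistent with the "Training for 0 steps" line in the log: if the step count is derived from the number of preprocessed examples, a dataset with no examples gives zero steps. A minimal sketch of that arithmetic follows. It assumes (this is not confirmed from the ELECTRA source) that steps are computed as examples × epochs ÷ batch size; `estimate_train_steps` and `check_train_steps` are hypothetical helpers, not functions from `run_finetuning.py`.

```python
def estimate_train_steps(num_examples, num_train_epochs, train_batch_size):
    """Hypothetical helper: assumed step formula, not taken from run_finetuning.py."""
    if train_batch_size <= 0:
        raise ValueError("train_batch_size must be positive")
    # Integer number of optimizer steps over all epochs.
    return int(num_examples * num_train_epochs // train_batch_size)


def check_train_steps(num_examples, num_train_epochs, train_batch_size):
    """Fail early with a clearer message than the Estimator's max_steps error."""
    steps = estimate_train_steps(num_examples, num_train_epochs, train_batch_size)
    if steps <= 0:
        raise ValueError(
            "0 training steps: the preprocessed dataset has {} examples; "
            "check the raw data files and paths".format(num_examples))
    return steps
```

Under this assumption, the settings from the command above (`num_train_epochs` 60, `train_batch_size` 16) still produce zero steps whenever preprocessing yields no examples, so the input data is the first thing to check.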

bean9610 commented 3 years ago

I had the same problem. In my case, the cause was that the dataset had no sentence separator: each sentence of text must be followed by a blank line. For example:

海 O 域 O 。 O

这 O 座 O
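To illustrate why a missing blank line can leave the dataset empty, here is a minimal sketch of a tagging-data reader. It assumes one `token tag` pair per line and a reader that only emits a sentence when it reaches a blank line; `read_tagged_sentences` is a hypothetical function written for this example, not ELECTRA's actual loader.

```python
def read_tagged_sentences(lines):
    """Assumed reader behaviour: a sentence is emitted only when a blank line is seen."""
    sentences, current = [], []
    for line in lines:
        parts = line.strip().split()
        if not parts:
            # Blank line: close the sentence collected so far.
            if current:
                sentences.append(current)
                current = []
            continue
        # First field is the token, last field is the tag.
        current.append((parts[0], parts[-1]))
    # Tokens after the last blank line are never emitted by this reader.
    return sentences


with_separators = ["海 O", "域 O", "。 O", "", "这 O", "座 O", ""]
without_separators = ["海 O", "域 O", "。 O", "这 O", "座 O"]

print(len(read_tagged_sentences(with_separators)))
print(len(read_tagged_sentences(without_separators)))
```

With a reader like this, a file that contains no blank lines produces no sentences at all, which would leave nothing to write to the tfrecord file.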

etetteh commented 3 years ago

Okay, thank you. My mistake was that I was passing a different sequence length. With that fixed, the error is gone. However, fine-tuning is still not working: for some reason, I get zero for all metrics.

xiangrukui commented 1 year ago

Excuse me, have you solved this problem? Thank you.

xiangrukui commented 1 year ago

How was this problem solved? Thank you. ValueError: Must specify max_steps > 0, given: 0

google-research / electra

ValueError: Must specify max_steps > 0, given: 0 #104