google-research / electra

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
Apache License 2.0

ValueError: Must specify max_steps > 0, given: 0 #104

Closed etetteh closed 3 years ago

etetteh commented 3 years ago
$python3 electra_small/ \
--data-dir $DATA_DIR \
--model-name "ELECTRA-small" \
--hparams '{"model_size": "small", "task_names": ["<task_name>"], "num_trials": 5, "learning_rate": 3e-4, "train_batch_size": 16, "use_tpu": "True", "num_tpu_cores": 8, "tpu_name": "<tpu_name>", "tpu_zone": "europe-west4-a", "gcp_project": "<gcp_name>", "vocab_size": 50000, "num_train_epochs": 10}'

I am getting the following error. Is there something I am missing?

Training for 0 steps
ERROR:tensorflow:Error recorded from training_loop: Must specify max_steps > 0, given: 0
Traceback (most recent call last):
  File "electra_small/", line 323, in <module>
  File "electra_small/", line 319, in main
    args.model_name, args.data_dir, **hparams))
  File "electra_small/", line 270, in run_finetuning
  File "electra_small/", line 183, in train
    input_fn=self._train_input_fn, max_steps=self.train_steps)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/tpu/", line 3035, in train
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/tpu/", line 136, in raise_errors
    six.reraise(typ, value, traceback)
  File "/usr/local/lib/python3.5/dist-packages/", line 703, in reraise
    raise value
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/tpu/", line 3030, in train
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/", line 358, in train
    'Must specify max_steps > 0, given: {}'.format(max_steps))
ValueError: Must specify max_steps > 0, given: 0

etetteh commented 3 years ago

Can anyone help? I am still getting the same error.

:~$ python3 electra_small/ --data-dir "gs://my_bucket/pretraining_data/finetuning_data/BC5CDR-chem/" --model-name "covidECTRA-small" --hparams '{"model_size": "small", "task_names": ["bc5c"], "num_trials": 5,"learning_rate": 3e-4, "train_batch_size": 16, "use_tpu": "True", "num_tpu_cores": 8, "tpu_name": "my_tpu", "tpu_zone": "europe-west4-a", "gcp_project": "my_p", "vocab_size": 50000, "num_train_epochs": 60, "model_dir":"gs://my_bucket/pretraining_data/models/covidECTRA-Small/", "vocab_file":"gs://my_bucket/pretraining_data/vocab.txt"}' 
================================Config: model=ELECTRA-small, trial 1/5==============================
answerable_classifier True
answerable_uses_start_logits True
answerable_weight 0.5
beam_size 20
data_dir gs://my_bucket/pretraining_data/finetuning_data/BC5CDR-chem/
debug False
do_eval True
do_lower_case True
do_train True
doc_stride 128
double_unordered True
embedding_size None
eval_batch_size 32
gcp_project covidectra
init_checkpoint gs://my_bucket/pretraining_data/finetuning_data/BC5CDR-chem/models/ELECTRA-small
iterations_per_loop 1000
joint_prediction True
keep_all_models True
layerwise_lr_decay 0.8
learning_rate 0.0003
log_examples False
max_answer_length 30
max_query_length 64
max_seq_length 128
model_dir gs://my_bucket/pretraining_data/models/ELECTRA-Small/
model_hparam_overrides {}
model_name ELECTRA-small
model_size small
n_best_size 20
n_writes_test 5
num_tpu_cores 8
num_train_epochs 60
num_trials 5
predict_batch_size 32
preprocessed_data_dir gs://my_bucket/pretraining_data/finetuning_data/BC5CDR-chem/models/covidECTRA-small/finetuning_tfrecords/bc5c_tfrecords
qa_eval_file <built-in method format of str object at 0x7fa57ed85030>
qa_na_file <built-in method format of str object at 0x7fa57edf03a0>
qa_na_threshold -2.75
qa_preds_file <built-in method format of str object at 0x7fa57ed850d8>
raw_data_dir <built-in method format of str object at 0x7fa57dba8e00>
results_pkl gs://my_bucket/pretraining_data/finetuning_data/BC5CDR-chem/models/ELECTRA-small/results/bc5c_results.pkl
results_txt gs://my_bucket/pretraining_data/finetuning_data/BC5CDR-chem/models/ELECTRA-small/results/bc5c_results.txt
save_checkpoints_steps 1000000
task_names ['bc5c']
test_predictions <built-in method format of str object at 0x7fa57dba41a0>
tpu_job_name None
tpu_name my_tpu
tpu_zone europe-west4-a
train_batch_size 16
use_tfrecords_if_existing True
use_tpu True
vocab_file gs://my_bucket/pretraining_data/vocab.txt
vocab_size 50000
warmup_proportion 0.1
weight_decay_rate 0.01
write_test_outputs True

Loading dataset bc5c_train
Start training: model=ELECTRA-small, trial 1/5
Training for 0 steps
ERROR:tensorflow:Error recorded from training_loop: Must specify max_steps > 0, given: 0
Traceback (most recent call last):
  File "electra_small/", line 323, in <module>
  File "electra_small/", line 319, in main
    args.model_name, args.data_dir, **hparams))
  File "electra_small/", line 270, in run_finetuning
  File "electra_small/", line 183, in train
    input_fn=self._train_input_fn, max_steps=self.train_steps)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/tpu/", line 3035, in train
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/tpu/", line 136, in raise_errors
    six.reraise(typ, value, traceback)
  File "/usr/local/lib/python3.5/dist-packages/", line 703, in reraise
    raise value
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/tpu/", line 3030, in train
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/", line 358, in train
    'Must specify max_steps > 0, given: {}'.format(max_steps))
ValueError: Must specify max_steps > 0, given: 0

When I check GCS at gs://my_bucket/pretraining_data/finetuning_data/BC5CDR-chem/models/ELECTRA-small/finetuning_tfrecords/bc5c_tfrecords, I find that the tfrecord file is empty: its file size is 0 B.
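A 0 B tfrecord file means zero training examples, which would explain the "Training for 0 steps" line in the log. Below is a minimal sketch of that arithmetic, assuming the step count is derived as examples × epochs // batch size. The helper name `estimate_train_steps` is hypothetical and not part of the ELECTRA code, and the exact formula in the repo may differ.

```python
def estimate_train_steps(num_examples, num_train_epochs, train_batch_size):
    """Hypothetical helper: approximate the number of training steps.

    Assumption: steps = examples * epochs // batch size. This only
    illustrates why an empty dataset yields max_steps == 0.
    """
    return int(num_examples * num_train_epochs // train_batch_size)


# Values taken from the hparams in the command above (60 epochs, batch size 16).
print(estimate_train_steps(0, 60, 16))     # empty tfrecord: no examples
print(estimate_train_steps(1000, 60, 16))  # a non-empty dataset, size chosen for illustration
```

With zero examples the result is 0 regardless of the number of epochs, so the thing to fix is whatever prevents the tfrecords from being written, not the `num_train_epochs` setting.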

bean9610 commented 3 years ago

I had the same problem. In my case the error occurred because the dataset had no sentence separators; the fix was to add a blank line after each sentence. For example:

海 O 域 O 。 O

这 O 座 O
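To show why the blank line matters, here is a rough sketch of a reader for this kind of tagging data. It assumes one "token tag" pair per line, with a blank line ending each sentence. The function `read_sentences` is hypothetical and is not the reader used by ELECTRA.

```python
def read_sentences(lines):
    """Group 'token tag' lines into sentences; a blank line ends a sentence.

    Hypothetical reader for illustration only (assumes one pair per line).
    """
    sentences, tokens, tags = [], [], []
    for line in lines:
        line = line.strip()
        if not line:
            # Blank line: close the current sentence, if any.
            if tokens:
                sentences.append((tokens, tags))
                tokens, tags = [], []
            continue
        token, tag = line.split()
        tokens.append(token)
        tags.append(tag)
    if tokens:
        # Flush a trailing sentence that has no blank line after it.
        sentences.append((tokens, tags))
    return sentences


with_separator = ["海 O", "域 O", "。 O", "", "这 O", "座 O"]
without_separator = ["海 O", "域 O", "。 O", "这 O", "座 O"]

print(len(read_sentences(with_separator)))     # sentences found with the blank line
print(len(read_sentences(without_separator)))  # sentences found without it
```

Without separators, the whole file collapses into a single long sentence in this sketch. A stricter reader that only emits a sentence when it reaches a blank line would return nothing at all, which would be consistent with an empty tfrecord file; which behavior the ELECTRA reader has is not verified here.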

etetteh commented 3 years ago

Okay, thank you. My mistake was that I was passing a different sequence length. The run now starts without the error, but fine-tuning still does not work: for some reason, I get zero for all metrics.

xiangrukui commented 1 year ago

Excuse me, have you solved this problem? Thank you.

xiangrukui commented 1 year ago

How was this problem solved? Thank you. ValueError: Must specify max_steps > 0, given: 0