google-research / uda

Unsupervised Data Augmentation (UDA)
https://arxiv.org/abs/1904.12848
Apache License 2.0

failed at example_parsing_ops.cc:240 #52

Open LaVineChan opened 5 years ago

LaVineChan commented 5 years ago

2019-10-06 15:22:11.087313: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at example_parsing_ops.cc:240 : Invalid argument: Key: input_type_ids. Can't parse serialized Example.
2019-10-06 15:22:11.087317: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at example_parsing_ops.cc:240 : Invalid argument: Key: input_type_ids. Can't parse serialized Example.
2019-10-06 15:22:11.087426: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at example_parsing_ops.cc:240 : Invalid argument: Key: input_type_ids. Can't parse serialized Example.
2019-10-06 15:22:11.087477: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at example_parsing_ops.cc:240 : Invalid argument: Key: input_type_ids. Can't parse serialized Example.
2019-10-06 15:22:11.087437: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at example_parsing_ops.cc:240 : Invalid argument: Key: input_type_ids. Can't parse serialized Example.
2019-10-06 15:22:11.087465: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at example_parsing_ops.cc:240 : Invalid argument: Key: input_type_ids. Can't parse serialized Example.
2019-10-06 15:22:11.087480: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at example_parsing_ops.cc:240 : Invalid argument: Key: input_type_ids. Can't parse serialized Example.
2019-10-06 15:22:11.087536: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at example_parsing_ops.cc:240 : Invalid argument: Key: input_type_ids. Can't parse serialized Example.
2019-10-06 15:22:11.087566: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at example_parsing_ops.cc:240 : Invalid argument: Key: input_ids. Can't parse serialized Example.

tomgoter commented 5 years ago

Hello, I am also receiving the above error when attempting to run the run_base.sh script. I have created a virtual environment with Python 2.7 and TensorFlow 1.13.2. Is there anything else you can suggest? Thanks!

Ritali commented 4 years ago

I had the same error and solved it. Try the following:

  1. Make sure preprocessing and training use the same max_seq_length.
  2. If the error persists after step 1, try reducing max_seq_length. BERT requires a lot of GPU memory; if your GPU's memory is small, reduce max_seq_length or batch_size.

reference: https://github.com/google-research/bert/issues/283
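To see why a max_seq_length mismatch produces exactly this error, here is a minimal sketch (using the feature keys from the log above; the helper name is mine): an Example serialized with 128 tokens per feature fails to parse against a FixedLenFeature spec expecting 512, with the same "Can't parse serialized Example" message.

```python
import tensorflow as tf

def make_example(seq_len):
    # Serialize one Example the way preprocessing would, with seq_len tokens.
    feats = {
        "input_ids": tf.train.Feature(
            int64_list=tf.train.Int64List(value=[0] * seq_len)),
        "input_type_ids": tf.train.Feature(
            int64_list=tf.train.Int64List(value=[0] * seq_len)),
    }
    return tf.train.Example(
        features=tf.train.Features(feature=feats)).SerializeToString()

serialized = make_example(128)  # preprocessing used max_seq_length=128

parse_spec = {  # training expects max_seq_length=512
    "input_ids": tf.io.FixedLenFeature([512], tf.int64),
    "input_type_ids": tf.io.FixedLenFeature([512], tf.int64),
}

try:
    tf.io.parse_single_example(serialized, parse_spec)
except tf.errors.InvalidArgumentError as e:
    print("parse failed:", e.message)
```

Parsing the same serialized Example with `FixedLenFeature([128], ...)` succeeds, which is why keeping the two max_seq_length values consistent makes the warnings go away.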

jes-moore commented 4 years ago

This seems a little odd. The README explicitly states that the max_seq_length values for the data-prep and training steps don't need to be identical.

Longer sequences are disproportionately expensive because attention is quadratic to the sequence length. In other words, a batch of 64 sequences of length 512 is much more expensive than a batch of 256 sequences of length 128. The fully-connected/convolutional cost is the same, but the attention cost is far greater for the 512-length sequences. Therefore, one good recipe is to pre-train for, say, 90,000 steps with a sequence length of 128 and then for 10,000 additional steps with a sequence length of 512. The very long sequences are mostly needed to learn positional embeddings, which can be learned fairly quickly. Note that this does require generating the data twice with different values of max_seq_length.
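The cost argument quoted above can be checked with back-of-the-envelope arithmetic: ignoring constant factors, attention work per batch scales as batch_size × seq_len², while the fully-connected cost scales with the total token count batch_size × seq_len (identical for both batches here).

```python
def attention_cost(batch_size, seq_len):
    # Attention is quadratic in sequence length, linear in batch size.
    return batch_size * seq_len ** 2

long_batch = attention_cost(64, 512)    # 64 sequences of length 512
short_batch = attention_cost(256, 128)  # 256 sequences of length 128

# Same total tokens (64*512 == 256*128 == 32768), so equal FC cost,
# but the 512-length batch does 4x the attention work.
print(long_batch // short_batch)  # → 4
```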

I suspect this is solely an OOM issue on the GPU. Still debugging on my end.