Kyubyong / transformer

A TensorFlow Implementation of the Transformer: Attention Is All You Need
Apache License 2.0

This model cannot handle extremely large dataset #43

Open RayXu14 opened 6 years ago

RayXu14 commented 6 years ago

Just to point out that the pipeline tf.convert_to_tensor -> tf.train.slice_input_producer -> tf.train.shuffle_batch raises an error if the dataset is too large:

ValueError: Cannot create a tensor proto whose content is larger than 2GB.
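The limit comes from how tf.convert_to_tensor works: it bakes the whole array into the graph as a Const node, and a serialized GraphDef is a single protobuf message, which is capped at 2 GB. A minimal sketch of the effect (the array here is a small stand-in for the full training corpus; tensorflow.compat.v1 is used so it runs on both TF 1.x and 2.x):

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

graph = tf.Graph()
with graph.as_default():
    # small array standing in for the full encoded training corpus
    data = np.zeros((1000, 1000), dtype=np.float32)  # ~4 MB
    tf.convert_to_tensor(data)  # baked into the graph as a Const node

# the serialized GraphDef now carries the raw bytes of the array; protobuf
# caps a single message at 2 GB, hence the ValueError for large corpora
print(graph.as_graph_def().ByteSize(), data.nbytes)
```

The graph proto is slightly larger than the array itself, so any corpus over ~2 GB of encoded ids cannot go through this path regardless of available RAM.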

4pal commented 6 years ago

It means that you have very long sentences in your dataset, which consume a lot of memory during batching. You could summarize your dataset line by line using an extractive summarizer like TextRank, which is unsupervised and doesn't require training. Then try the summarized dataset; it can even be 10 GB and you won't have any problem.

xlniu commented 5 years ago

You can use feed_dict or tf.data.
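A minimal sketch of the tf.data route: feed the arrays through placeholders at session run time instead of embedding them in the graph, which sidesteps the 2 GB GraphDef limit. The xs/ys arrays are hypothetical stand-ins for the repo's encoded source/target id matrices:

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

# hypothetical pre-encoded source/target id arrays; any large numpy
# arrays behave the same way
xs = np.random.randint(0, 100, size=(1000, 10)).astype(np.int32)
ys = np.random.randint(0, 100, size=(1000, 10)).astype(np.int32)

# placeholders keep the data out of the GraphDef; it is passed only
# when the iterator is initialized
x_ph = tf.placeholder(tf.int32, shape=xs.shape)
y_ph = tf.placeholder(tf.int32, shape=ys.shape)

dataset = (tf.data.Dataset.from_tensor_slices((x_ph, y_ph))
           .shuffle(buffer_size=128)
           .batch(32))
iterator = tf.data.make_initializable_iterator(dataset)
x_batch, y_batch = iterator.get_next()

with tf.Session() as sess:
    sess.run(iterator.initializer, feed_dict={x_ph: xs, y_ph: ys})
    bx, by = sess.run([x_batch, y_batch])
    print(bx.shape, by.shape)  # (32, 10) (32, 10)
```

For datasets too large for memory altogether, the same pipeline can start from tf.data.TextLineDataset or TFRecord files instead of in-memory arrays.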