Closed: etetteh closed this issue 3 years ago
Hi @etetteh,

could you provide the value for DATA_DIR? :thinking:

E.g. when DATA_DIR is set to gs://tr-electra, then your .tfrecords must be located under gs://tr-electra/output-512/; this depends on the value of pretrain_tfrecords that you've specified in configure_pretraining.py.
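To make the relationship concrete, here is a minimal sketch of how the effective TFRecord location is assembled from DATA_DIR and a relative pretrain_tfrecords pattern. The variable names and the relative pattern "output-512/*" are illustrative assumptions, not the exact code from configure_pretraining.py:

```python
import posixpath

# Illustrative values from this thread (pattern is an assumption)
DATA_DIR = "gs://tr-electra"
pretrain_tfrecords = "output-512/*"

# posixpath keeps forward slashes, which GCS URIs require
effective_pattern = posixpath.join(DATA_DIR, pretrain_tfrecords)
print(effective_pattern)  # -> gs://tr-electra/output-512/*
```

The point is simply that the pretraining job looks for TFRecords under the joined path, so the files in the bucket must sit under that exact prefix.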
I used the following configuration:
Also make sure that you're using a TensorFlow 1.x version (I think I trained the models with 1.15).
Yes, that's true. Thanks so much. The problem was that in configure_pretraining.py I used a max sequence length of 128, but in run_pretraining I was using 512, and that mismatch was what was causing the error.
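This kind of mismatch can be caught early with a small sanity check before launching a long pretraining run. The function below is a hedged sketch; the parameter names are assumptions mirroring the max_seq_length values passed to configure_pretraining.py and run_pretraining, not code from the repo:

```python
def check_seq_length(build_len: int, train_len: int) -> None:
    """Raise if the data-building and pretraining sequence lengths differ."""
    if build_len != train_len:
        raise ValueError(
            f"max_seq_length mismatch: data built with {build_len}, "
            f"pretraining configured for {train_len}"
        )

# The mismatch from this thread: 128 at data-building time vs. 512 at pretraining
try:
    check_seq_length(128, 512)
except ValueError as e:
    print(e)
```

Since the sequence length is baked into the TFRecord features when the data is built, it cannot be changed at pretraining time; the two values simply have to agree.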
Yes. Also, I'm using TensorFlow version 1.15.4, and it's working fine. Thanks a lot.
I have built my pretraining data and stored it, along with the vocab file and config file, in my GCS bucket. But when I run the pretraining step:
I keep getting the following: