[Closed] devinharia closed this issue 5 years ago
I got the same issue
Same for me. Also, model.predict reports almost exactly the same confidence score for every entry, which is quite weird, as if the model is not learning anything at all.
Hi @devinharia, @tienduccao, and @LFavano, to see convergence here you'll need to reduce the learning rate. With 1e-5 I see nice convergence on this task.
@jacobzweig I don't get it; could you be more specific? How do I set the learning rate?
I think there is still some secret trick in the code without which it does not work. I tried it and got only about 0.53 accuracy; it felt like the model did not learn anything during training. Also, the number of parameters does not match, as @devinharia pointed out.
@jacobzweig: your answer is unfortunately not satisfying, and all of us would truly appreciate more elaboration on this if you have time. Thank you very much!
The learning rate of the optimizer needs to be set before the model is compiled:

```python
opt = tf.keras.optimizers.Adam(learning_rate=1e-5)
model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
```
Learning will now converge (although this does not answer the original query re: the difference in the number of parameters).
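To answer the "how to set the learning rate" question more fully: you can also lower the rate on a model that was already compiled with the string shortcut `optimizer='adam'` (whose default rate is 1e-3, too high for BERT fine-tuning). A minimal sketch, assuming TF2's tf.keras; the tiny Dense model here is a stand-in for the notebook's BERT classifier:

```python
import tensorflow as tf

# Stand-in model; in the notebook this would be the BERT-based classifier.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(1, activation='sigmoid', input_shape=(4,)),
])

# Compiling with the string 'adam' uses the default learning rate of 1e-3.
model.compile(loss='binary_crossentropy', optimizer='adam',
              metrics=['accuracy'])

# Lower the rate on the already-compiled model without recompiling:
tf.keras.backend.set_value(model.optimizer.learning_rate, 1e-5)
print(float(tf.keras.backend.get_value(model.optimizer.learning_rate)))
```

Creating the optimizer object with `learning_rate=1e-5` before `compile` (as shown earlier in this thread) is equivalent; this variant is just handy when you want to adjust the rate mid-experiment.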
I ran the code from 'keras-bert.ipynb' as-is and observed that the number of trainable parameters in my run is 22,051,329 instead of the 3,147,009 in your run of the notebook. Also, my accuracy is just about 0.53. Can you please help me out? Thanks!
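One plausible cause of the parameter-count gap (an assumption on my part, not confirmed by the notebook author): in one run the BERT layers are trainable, while in the other they are frozen, so only the small classification head counts toward trainable parameters. A minimal sketch with a stand-in Dense model showing how freezing a layer changes the trainable-parameter count:

```python
import tensorflow as tf

# Stand-in model; the real notebook would have BERT layers plus a head.
base = tf.keras.Sequential([
    tf.keras.layers.Dense(64, input_shape=(32,)),          # "body" layer
    tf.keras.layers.Dense(1, activation='sigmoid'),        # classification head
])

def trainable_count(model):
    # Sum the number of elements in every trainable weight tensor.
    return sum(int(tf.size(w)) for w in model.trainable_weights)

print(trainable_count(base))   # 2177: both layers trainable (2112 + 65)

base.layers[0].trainable = False  # freeze the "body" layer
print(trainable_count(base))   # 65: only the head remains trainable
```

If your run shows 22M trainable parameters where the author's shows 3.1M, it is worth checking whether the notebook sets `trainable = False` (or a similar flag) on the BERT layers and whether that line executed in your environment.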