tensorflow / text

Making text a first-class citizen in TensorFlow.
https://www.tensorflow.org/beta/tutorials/tensorflow_text/intro
Apache License 2.0

Neural machine translation with a Transformer and Keras(tensorflow.ipynb) training query #1205

Closed akashghimireOfficial closed 1 year ago

akashghimireOfficial commented 1 year ago

Hello. First, thank you for putting together this excellent tutorial on machine translation with a Transformer and Keras. It taught me a lot about Transformers in general and introduced me to machine translation. When I worked through the tutorial, it was written for TF 2.9 (this one), and I understood that code well. There is now an updated version, and one difference I found is how the model is trained. The older version did not call model.fit(); it defined a train_step() and trained with that. The updated version trains with the more convenient model.fit(). When I try to train the older version with model.fit(), it does not work. So my question is really about TensorFlow basics: why did the older version (this one) not train with model.fit()? There must be some reason, right?

cantonios commented 1 year ago

This has nothing to do with tf-text.

akashghimireOfficial commented 1 year ago

Hello, thank you for your quick response. The tutorial I asked about lives under https://github.com/tensorflow/text/tree/master/docs/tutorials, so I thought this was the right place for my question. Maybe I am wrong. If my question does not belong here, could you please point me to where I can ask a related question about "text generation with BERT from scratch"?

cantonios commented 1 year ago

You are asking a question about Keras model.fit(). Your best bet for finding out why this doesn't work is to ask there.

The older version of the tutorial just had a simple custom training loop. I don't think there's any particular reason for this. The updated one is more in line with what models look like today. You can override the training step in Keras pretty easily: https://keras.io/guides/customizing_what_happens_in_fit/
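
For readers who want to see roughly what that guide describes, here is a minimal sketch of overriding train_step() so that model.fit() runs custom per-batch logic. It is not code from either version of the tutorial: the TinyRegressor model, the random regression data, and the hyperparameters are all made up for illustration. It also keeps its own loss function and loss tracker instead of relying on the compiled loss, on the assumption that this works the same way across recent Keras versions.

```python
import numpy as np
import tensorflow as tf


class TinyRegressor(tf.keras.Model):
    """Hypothetical toy model; not taken from the translation tutorial."""

    def __init__(self):
        super().__init__()
        self.dense = tf.keras.layers.Dense(1)
        # The model owns its loss and metric, so compile() only needs an optimizer.
        self.loss_fn = tf.keras.losses.MeanSquaredError()
        self.loss_tracker = tf.keras.metrics.Mean(name="loss")

    def call(self, inputs, training=False):
        return self.dense(inputs)

    @property
    def metrics(self):
        # Listing the tracker here lets fit() reset it at the start of each epoch.
        return [self.loss_tracker]

    def train_step(self, data):
        # fit() calls this once per batch; the body is an ordinary custom step.
        x, y = data
        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)
            loss = self.loss_fn(y, y_pred)
        grads = tape.gradient(loss, self.trainable_variables)
        self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
        self.loss_tracker.update_state(loss)
        return {"loss": self.loss_tracker.result()}


# Illustrative data: noiseless linear targets from random features.
rng = np.random.default_rng(0)
features = rng.normal(size=(256, 4)).astype("float32")
targets = features @ np.array([[1.0], [-2.0], [0.5], [3.0]], dtype="float32")

model = TinyRegressor()
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.1))
history = model.fit(features, targets, batch_size=32, epochs=3, verbose=0)
print(history.history["loss"])
```

The body of train_step() has the same shape as a hand-written training loop step (forward pass under a GradientTape, gradient computation, optimizer update), which is why a custom loop like the one in the older tutorial can usually be moved into fit() this way while keeping callbacks and progress reporting.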