Closed rnyak closed 1 year ago
I've narrowed it down to PR #1022, which started causing this error on reload (in a different Python process from the one that saved the model): ValueError: The last dimension of the input shape of a Dense layer should be defined
I get the same error if I run this code as a Python script instead of in a Jupyter notebook.
I debugged the reloading process of the saved transformer model and noticed that, when reloading in the Jupyter notebook, the step that builds the transformer block from its configuration (here) is skipped, so the input shape information is lost. This leads to an input shape of (None, None) being passed to the top MLP layer.
The reason is that the build method of the transformer is skipped because it fails to find the custom object TFXLNetMainLayer, which is a class from the external Hugging Face library transformers.
I managed to avoid the error by importing transformers as well:
import tensorflow as tf
import merlin.models.tf as mm
import transformers  # imported explicitly so TFXLNetMainLayer can be found on reload
model = tf.keras.models.load_model('./saved_model')
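The workaround above depends on remembering the extra import in every process that reloads the model. A minimal sketch of a guard that makes this explicit is shown below. The helper names `ensure_imported` and `load_saved_model` are hypothetical and are not part of Merlin Models or Keras; the only assumption taken from this thread is that `merlin.models.tf` and `transformers` must both be imported before `tf.keras.models.load_model` is called.

```python
import importlib


def ensure_imported(module_names):
    """Try to import each module; return the names that could not be imported."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


def load_saved_model(path, required_modules=("merlin.models.tf", "transformers")):
    """Hypothetical helper: import the required modules first, then load the model."""
    missing = ensure_imported(required_modules)
    if missing:
        # Fail loudly here rather than later with a shape error from a Dense layer.
        raise ImportError(f"Cannot load {path!r}; missing modules: {missing}")
    import tensorflow as tf  # imported lazily, only once the dependencies are present
    return tf.keras.models.load_model(path)
```

With this in place, `model = load_saved_model('./saved_model')` would either load the model with the custom classes already imported or raise an `ImportError` naming the module that is absent.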
It looks like import merlin.models.tf as mm has the side effect of importing the transformers module too, so it should already be loaded when the load_model function is called in this example.
https://github.com/NVIDIA-Merlin/models/blob/729da27c5d208242e06b62b72fc21019d6af3f95/merlin/models/tf/__init__.py#L133
https://github.com/NVIDIA-Merlin/models/blob/729da27c5d208242e06b62b72fc21019d6af3f95/merlin/models/utils/dependencies.py#L44-L49
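To make the distinction concrete: a package being installed ("available") is not the same as it having been imported in the current process. The sketch below shows one common way to check each condition using only the standard library. It is an illustration under that assumption and is not the actual implementation of the check in `dependencies.py`; the function names `is_module_available` and `is_module_imported` are hypothetical.

```python
import importlib.util
import sys


def is_module_available(name):
    """Return True if `name` can be found on the import path, without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for missing parent packages or malformed module entries.
        return False


def is_module_imported(name):
    """Return True if `name` has already been imported in this process."""
    return name in sys.modules
```

Calling `is_module_imported("transformers")` right after `import merlin.models.tf as mm` in the failing process would be one way to confirm whether the side-effect import actually ran.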
Notes based on @oliverholworthy's debugging:
- import transformers as well --> a quick workaround solution.
- is_transformers_available function?
Bug description
I am getting different errors when I try to load back a saved session-based model.
Error 1:
This error goes away if I add import merlin.models.tf as mm after I import tensorflow, but then I get another error:
Error 2:
Steps/Code to reproduce bug
Once the model is saved, please restart the kernel, and load back the model with the following script:
Expected behavior
We should be able to load back the model and then do offline evaluation or predictions, accordingly.
Environment details
Additional context