Closed DavidGeorge528 closed 4 months ago
Hi David, what OS are you using and also which version of tensorflow do you have installed? I believe I have seen this error before but it was because I had an earlier version of tensorflow installed and the file format was not compatible with other versions of tensorflow. I can look into this further to confirm once you answer these questions.
Hi Justin, I've tried it locally on macOS and remotely on AWS with an Ubuntu 20.04 server. On both I've tried the latest tensorflow and 2.13 as per https://github.com/ourresearch/openalex-topic-classification/blob/main/v1/requirements.txt. Neither works for me. Thanks in advance for looking into it.
The error seems to be with h5py, which is what TF is using to load the file. I have 3.11.0 installed, which is what tf 2.13 installed as a dependency.
I have uploaded a new keras file to the zenodo link: https://zenodo.org/records/11221637
Please try that and if that doesn't work, let me know. I do not think that is going to solve the issue but I want to try the easiest thing first before I get further into it.
Hi, thanks for the quick reply. Unfortunately that didn't work either. Perhaps it's the h5py version?
That is potentially an issue; I have h5py==3.10.0 listed in the requirements file. I can look into this soon and get back to you with what I find.
I've tried it with h5py==3.10.0 and I'm still getting the same error, unfortunately.
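For comparing environments in a thread like this, a small report of the OS and installed package versions can help. The following is only a sketch: the helper names `report_versions` and `describe_environment` are made up for illustration, and the package list is an assumption (on some Macs TensorFlow is installed under a different distribution name, in which case `tensorflow` would show as `None`).

```python
import platform
from importlib import metadata


def report_versions(packages=("tensorflow", "keras", "h5py")):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


def describe_environment():
    """Collect OS, Python version, and package versions in one dict."""
    info = {"os": platform.platform(), "python": platform.python_version()}
    info.update(report_versions())
    return info


print(describe_environment())
```

Pasting the printed dict from each machine makes it easier to spot a version difference between a working and a failing setup.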
One other quick question before I look into it: are you trying to set up your own docker container, or are you just taking the predict.py code and trying to run that on its own?
Initially I was trying to run the code as is in docker. But when I hit some of the issues, I simplified the problem to just the create_model function and was trying to run it locally on my Mac (and even in a simpler docker container that just loads the model and nothing else).
So unfortunately I am unable to reproduce your error. I reduced the code down to the minimum needed in order to load the model and I also used the exact file I uploaded to zenodo:
import os
import pickle
import tensorflow as tf

prefix = './model_artifacts/'
model_path = os.path.join(prefix, 'model')

#### Load the needed files
with open(os.path.join(model_path, "target_vocab.pkl"), "rb") as f:
    target_vocab = pickle.load(f)
print("Loaded target vocab")

with open(os.path.join(model_path, "citation_feature_vocab.pkl"), "rb") as f:
    citation_feature_vocab = pickle.load(f)
print("Loaded citation features vocab.")

#### Load the model
def create_model(num_classes, emb_table_size, model_chkpt, topk=5):
    # Function to create full model.
    # Input:
    #   num_classes: number of classes
    #   emb_table_size: size of embedding table
    #   model_chkpt: path to model checkpoint
    #   topk: number of predictions to return
    # Output:
    #   model: full model

    # Inputs
    citation_0 = tf.keras.layers.Input((16,), dtype=tf.int64, name='citation_0')
    citation_1 = tf.keras.layers.Input((128,), dtype=tf.int64, name='citation_1')
    journal = tf.keras.layers.Input((384,), dtype=tf.float32, name='journal_emb')
    language_model_output = tf.keras.layers.Input((512, 768,), dtype=tf.float32, name='lang_model_output')

    # Create a multi-class classification model using functional API
    pooled_language_model_output = tf.keras.layers.GlobalAveragePooling1D()(language_model_output)
    citation_emb_layer = tf.keras.layers.Embedding(input_dim=emb_table_size, output_dim=256, mask_zero=True,
                                                   trainable=True, name='citation_emb_layer')
    citation_0_emb = citation_emb_layer(citation_0)
    citation_1_emb = citation_emb_layer(citation_1)
    pooled_citation_0 = tf.keras.layers.GlobalAveragePooling1D()(citation_0_emb)
    pooled_citation_1 = tf.keras.layers.GlobalAveragePooling1D()(citation_1_emb)
    concat_data = tf.keras.layers.Concatenate(name='concat_data', axis=-1)([pooled_language_model_output, pooled_citation_0,
                                                                            pooled_citation_1, journal])

    # Dense layer 1
    dense_output = tf.keras.layers.Dense(2048, activation='relu', kernel_regularizer='L2', name="dense_1")(concat_data)
    dense_output = tf.keras.layers.Dropout(0.20, name="dropout_1")(dense_output)
    dense_output = tf.keras.layers.LayerNormalization(epsilon=1e-6, name="layer_norm_1")(dense_output)

    # Dense layer 2
    dense_output = tf.keras.layers.Dense(1024, activation='relu', kernel_regularizer='L2', name="dense_2")(dense_output)
    dense_output = tf.keras.layers.Dropout(0.20, name="dropout_2")(dense_output)
    dense_output = tf.keras.layers.LayerNormalization(epsilon=1e-6, name="layer_norm_2")(dense_output)

    # Dense layer 3
    dense_output_l3 = tf.keras.layers.Dense(512, activation='relu', kernel_regularizer='L2', name="dense_3")(dense_output)
    dense_output = tf.keras.layers.Dropout(0.20, name="dropout_3")(dense_output_l3)
    dense_output = tf.keras.layers.LayerNormalization(epsilon=1e-6, name="layer_norm_3")(dense_output)

    output_layer = tf.keras.layers.Dense(num_classes, activation='sigmoid', name='output_layer')(dense_output)
    topk_outputs = tf.math.top_k(output_layer, k=topk)

    model = tf.keras.Model(inputs=[citation_0, citation_1, journal, language_model_output],
                           outputs=topk_outputs)
    model.load_weights(model_chkpt)
    model.trainable = False
    return model

pred_model = create_model(len(target_vocab),
                          len(citation_feature_vocab)+2,
                          os.path.join(model_path, "model_checkpoint/citation_part_only.keras"), topk=3)
pred_model.summary()
With the above code and the file I uploaded to zenodo, the model loaded successfully. So I think that narrows it down to a package issue, I am assuming. The code above was run on an AWS EC2 instance in a conda env (python 3.10).
Hi, thanks for the minimal example. After running it in a fresh EC2 instance it worked fine; I then tried it locally on my Mac and it also worked. So I compared the differences and found that the tweaks I'd made to the imports (to follow coding standards) actually had implementation impacts. Below is my modified version of your code, where I refactored tf.keras.layers to just layers using from tensorflow.keras import layers, which typically wouldn't have any implementation impacts, but in this case it breaks the code.
from pathlib import Path

import tensorflow as tf
from tensorflow.python import keras
from tensorflow.python.keras import layers


def create_model(num_classes: int, emb_table_size: int, model_chkpt: Path, topk: int = 5) -> keras.Model:
    """
    Function to create full model.
    Input:
        num_classes: number of classes
        emb_table_size: size of embedding table
        model_chkpt: path to model checkpoint
        topk: number of predictions to return
    Output:
        model: full model
    """
    # Inputs
    citation_0 = layers.Input((16,), dtype=tf.int64, name="citation_0")
    citation_1 = layers.Input((128,), dtype=tf.int64, name="citation_1")
    journal = layers.Input((384,), dtype=tf.float32, name="journal_emb")
    language_model_output = layers.Input((512, 768), dtype=tf.float32, name="lang_model_output")

    # Create a multi-class classification model using functional API
    pooled_language_model_output = layers.GlobalAveragePooling1D()(language_model_output)
    citation_emb_layer = layers.Embedding(input_dim=emb_table_size, output_dim=256, mask_zero=True, trainable=True, name="citation_emb_layer")
    citation_0_emb = citation_emb_layer(citation_0)
    citation_1_emb = citation_emb_layer(citation_1)
    pooled_citation_0 = layers.GlobalAveragePooling1D()(citation_0_emb)
    pooled_citation_1 = layers.GlobalAveragePooling1D()(citation_1_emb)
    concat_data = layers.Concatenate(name="concat_data", axis=-1)([pooled_language_model_output, pooled_citation_0, pooled_citation_1, journal])

    # Dense layer 1
    dense_output = layers.Dense(2048, activation="relu", kernel_regularizer="L2", name="dense_1")(concat_data)
    dense_output = layers.Dropout(0.20, name="dropout_1")(dense_output)
    dense_output = tf.keras.layers.LayerNormalization(epsilon=1e-6, name="layer_norm_1")(dense_output)

    # Dense layer 2
    dense_output = layers.Dense(1024, activation="relu", kernel_regularizer="L2", name="dense_2")(dense_output)
    dense_output = layers.Dropout(0.20, name="dropout_2")(dense_output)
    dense_output = tf.keras.layers.LayerNormalization(epsilon=1e-6, name="layer_norm_2")(dense_output)

    # Dense layer 3
    dense_output_l3 = layers.Dense(512, activation="relu", kernel_regularizer="L2", name="dense_3")(dense_output)
    dense_output = layers.Dropout(0.20, name="dropout_3")(dense_output_l3)
    dense_output = tf.keras.layers.LayerNormalization(epsilon=1e-6, name="layer_norm_3")(dense_output)

    output_layer = layers.Dense(num_classes, activation="sigmoid", name="output_layer")(dense_output)
    topk_outputs = tf.math.top_k(output_layer, k=topk)

    model = keras.Model(inputs=[citation_0, citation_1, journal, language_model_output], outputs=topk_outputs)
    model.load_weights(model_chkpt.as_posix())
    model.trainable = False
    return model


if __name__ == "__main__":
    model = create_model(4521, 6008, Path("oa_artifacts") / "model_checkpoint" / "citation_part_only.keras")
    print(model.summary())
This results in the above error OSError: Unable to open file (file signature not found)
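A quick way to see what kind of file the loader is being handed is to look at its first bytes. The sketch below is not from the repository; `sniff_format` is a hypothetical helper, and it only checks for the HDF5 signature at offset 0. As far as I know, newer `.keras` files are zip archives, not HDF5 files, so a code path that passes the file straight to h5py could report a missing signature even when the download is intact. That explanation is an assumption and has not been confirmed in this thread.

```python
import zipfile

# The 8-byte signature that starts a standard HDF5 file.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def sniff_format(path):
    """Return 'hdf5', 'zip', or 'unknown' based on the file's contents."""
    with open(path, "rb") as f:
        head = f.read(len(HDF5_SIGNATURE))
    if head == HDF5_SIGNATURE:
        return "hdf5"
    if zipfile.is_zipfile(path):
        return "zip"
    return "unknown"
```

For example, `sniff_format("oa_artifacts/model_checkpoint/citation_part_only.keras")` returning `"zip"` would suggest the file is a valid archive and that the problem lies in how it is being opened, not in the download.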
I wonder if you can reproduce the same error using my code?
If I edit the code back to using tf.keras.layers like below, the error goes away.
from pathlib import Path

import tensorflow as tf


def create_model(num_classes: int, emb_table_size: int, model_chkpt: Path, topk: int = 5) -> tf.keras.Model:
    """
    Function to create full model.
    Input:
        num_classes: number of classes
        emb_table_size: size of embedding table
        model_chkpt: path to model checkpoint
        topk: number of predictions to return
    Output:
        model: full model
    """
    # Inputs
    citation_0 = tf.keras.layers.Input((16,), dtype=tf.int64, name="citation_0")
    citation_1 = tf.keras.layers.Input((128,), dtype=tf.int64, name="citation_1")
    journal = tf.keras.layers.Input((384,), dtype=tf.float32, name="journal_emb")
    language_model_output = tf.keras.layers.Input((512, 768), dtype=tf.float32, name="lang_model_output")

    # Create a multi-class classification model using functional API
    pooled_language_model_output = tf.keras.layers.GlobalAveragePooling1D()(language_model_output)
    citation_emb_layer = tf.keras.layers.Embedding(
        input_dim=emb_table_size, output_dim=256, mask_zero=True, trainable=True, name="citation_emb_layer"
    )
    citation_0_emb = citation_emb_layer(citation_0)
    citation_1_emb = citation_emb_layer(citation_1)
    pooled_citation_0 = tf.keras.layers.GlobalAveragePooling1D()(citation_0_emb)
    pooled_citation_1 = tf.keras.layers.GlobalAveragePooling1D()(citation_1_emb)
    concat_data = tf.keras.layers.Concatenate(name="concat_data", axis=-1)(
        [pooled_language_model_output, pooled_citation_0, pooled_citation_1, journal]
    )

    # Dense layer 1
    dense_output = tf.keras.layers.Dense(2048, activation="relu", kernel_regularizer="L2", name="dense_1")(concat_data)
    dense_output = tf.keras.layers.Dropout(0.20, name="dropout_1")(dense_output)
    dense_output = tf.keras.layers.LayerNormalization(epsilon=1e-6, name="layer_norm_1")(dense_output)

    # Dense layer 2
    dense_output = tf.keras.layers.Dense(1024, activation="relu", kernel_regularizer="L2", name="dense_2")(dense_output)
    dense_output = tf.keras.layers.Dropout(0.20, name="dropout_2")(dense_output)
    dense_output = tf.keras.layers.LayerNormalization(epsilon=1e-6, name="layer_norm_2")(dense_output)

    # Dense layer 3
    dense_output_l3 = tf.keras.layers.Dense(512, activation="relu", kernel_regularizer="L2", name="dense_3")(dense_output)
    dense_output = tf.keras.layers.Dropout(0.20, name="dropout_3")(dense_output_l3)
    dense_output = tf.keras.layers.LayerNormalization(epsilon=1e-6, name="layer_norm_3")(dense_output)

    output_layer = tf.keras.layers.Dense(num_classes, activation="sigmoid", name="output_layer")(dense_output)
    topk_outputs = tf.math.top_k(output_layer, k=topk)

    model = tf.keras.Model(inputs=[citation_0, citation_1, journal, language_model_output], outputs=topk_outputs)
    model.load_weights(model_chkpt.as_posix())
    model.trainable = False
    return model


if __name__ == "__main__":
    model = create_model(4521, 6008, Path("oa_artifacts") / "model_checkpoint" / "citation_part_only.keras")
    print(model.summary())
Strange behaviour. But at least it's solved.
OK, so after a bit more digging, changing
from tensorflow.python import keras
from tensorflow.python.keras import layers
to
import keras
from keras import layers
makes the error go away in my code example above. Not sure why but it fixes the issue while letting me follow import guidelines. Thanks for your time Justin, apologies for the rabbit hole we had to go down.
So I think you actually want to do this:
from tensorflow import keras
from tensorflow.keras import layers
Not sure where you got the "python" but this should work. I don't think you should be importing directly from keras. Everything should be imported from tensorflow.
Hi, the reason I used python in the import is because VS Code's Pylance linter complains if I don't add it, see the screenshot below.
But it's fine, importing directly from keras fixes the issue too.
I'm trying to run the topic classifier myself using the provided code and downloaded model files. However, while everything else loads and runs properly, the load weights function (below) crashes with an error that seems to show that the file being loaded is corrupted. https://github.com/ourresearch/openalex-topic-classification/blob/e91c2f45ef66611f438447ea29ae6b5f03b7d2f6/v1/003_Deployment/model_to_api/container/topic_classifier/predictor.py#L538
I have double-checked that the file being loaded is the correct one and have re-downloaded it multiple times from here to ensure that it downloaded correctly. Every time I get the same error, which, after searching online, suggests that the file is corrupted. Is there any way someone can verify this is the case and provide the uncorrupted file?
Thanks in advance
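One way to rule out a corrupted download without anyone re-uploading the file is to compare a checksum of the local copy against a published one. The sketch below assumes the hosting page lists an MD5 checksum for the file, which is an assumption about the record and not something stated in this thread. `file_md5` is a hypothetical helper name.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large model files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If the digest of the local `citation_part_only.keras` matches the published value, the download is intact and the cause of the error is more likely to be in the loading code or the environment.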