shafiiganjeh / GPT2-plus-EfficientNet-image-captioning

Using GPT2 and any EfficientNet model for image captioning
MIT License

Error creating the model while using tutorial.ipynb #10

Open muhammad-azmain-mahtab opened 7 months ago

muhammad-azmain-mahtab commented 7 months ago
with open("/kaggle/working/tmp/model/config.json", "rb") as read_file:
    hparams = json.load(read_file)

enc, decoder, model_final = create_model(efficient_net = img_enc,
                                         hparams = hparams,
                                         n_spe = 1, # The number of special tokens to add, starting at max_vocab + 1. Useful if you want to use this model for tasks other than captioning.
                                         img_size = (256,256),
                                         emb_train = False, # If you want to also train the decoder-embedding layer (usually you don't)
                                         train = True, # set this to False if you want to load the model for inference.
                                         encoder_layers = 2, # number of encoder layers (min 1)
                                         encoder_head = hparams["n_head"]) # number of encoder heads

/opt/conda/lib/python3.10/site-packages/keras/src/layers/layer.py:1210: UserWarning: Layer 'encoder' looks like it has unbuilt state, but Keras is not able to trace the layer call() in order to build it automatically. Possible causes:

  1. The call() method of your layer may be crashing. Try to __call__() the layer eagerly on some test input first to see if it works. E.g. x = np.random.random((3, 4)); y = layer(x)
  2. If the call() method is correct, then you may need to implement the def build(self, input_shape) method on your layer. It should create all variables used by the layer (e.g. by calling layer.build() on all its children layers).

Exception encoutered: ''Layer.add_weight() got multiple values for argument 'shape'''
  warnings.warn(
/opt/conda/lib/python3.10/site-packages/keras/src/layers/layer.py:359: UserWarning: build() was called on layer 'encoder', however the layer does not have a build() method implemented and it looks like it has unbuilt state. This will cause the layer to be marked as built, despite not being actually built, which may cause failures down the line. Make sure to implement a proper build() method.
  warnings.warn(

TypeError                                 Traceback (most recent call last)
Cell In[9], line 4
      1 with open("/kaggle/working/tmp/model/config.json", "br") as read_file:
      2     hparams = json.load(read_file)
----> 4 enc,decoder,model_final = create_model(efficient_net = img_enc,
      5                                        hparams = hparams,
      6                                        n_spe = 1, #The number of special tokens to add starting at max_vocab + 1. useful if you want to use this model for task other than captioning.
      7                                        img_size = (256,256),
      8                                        emb_train = False, #If you want to also train the decoder-embedding layer (usually you don't)
      9                                        train = True, # set this to False if you want to load the model for inference.
     10                                        encoder_layers = 2, # number of encoder layers (min 1)
     11                                        encoder_head = hparams["n_head"]) #number of encoder heads

File /kaggle/working/tmp/image_model/im_model/create_model.py:25, in create_model(efficient_net, hparams, emb_train, train, n_spe, img_size, encoder_layers, encoder_head)
     20 input("Press Enter to continue...")
     22 enc = md.encoder(n_ctx = enc_inp.shape[1]*enc_inp.shape[2],train = train,n_embd = hparams["n_embd"],
     23                  n_layer = encoder_layers,n_head = encoder_head)
---> 25 enc_outp = enc(enc_inp)
     26 enc = tf.keras.Model(inputs=[inp_img], outputs=[enc_outp])
     28 decoder = md.model(cross = True,emb_train = emb_train, n_spe = n_spe,train = train,
     29                    n_vocab = hparams["n_vocab"],n_ctx = hparams["n_ctx"],
     30                    n_embd = hparams["n_embd"],n_head = hparams["n_head"],
     31                    n_layer = hparams["n_layer"])

File /opt/conda/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py:123, in filter_traceback.<locals>.error_handler(*args, **kwargs)
    120 filtered_tb = _process_traceback_frames(e.__traceback__)
    121 # To get the full stack trace, call:
    122 # `keras.config.disable_traceback_filtering()`
--> 123 raise e.with_traceback(filtered_tb) from None
    124 finally:
    125     del filtered_tb

File /kaggle/working/tmp/image_model/im_model/models.py:139, in encoder.call(self, x)
    135 h = self.pre_emb(h)
    137 print("in call h:", h)
--> 139 h = self.pos(h) + h
    141 for i in range(self.n_layer):
    142     h, present = self._block[i](x = h, past = None,y = None)

File /kaggle/working/tmp/image_model/layers/tf_keras_embedding.py:73, in PositionEmbedding.build(self, input_shape)
     69 weight_sequence_length = self._max_length
     71 print(weight_sequence_length, width)
---> 73 self._position_embeddings = self.add_weight(
     74     "embeddings",
     75     shape=(weight_sequence_length, width),
     76     initializer=self._initializer)
     78 super().build(input_shape)

TypeError: Exception encountered when calling encoder.call().

Layer.add_weight() got multiple values for argument 'shape'

Arguments received by encoder.call(): • args=('<KerasTensor shape=(None, 8, 8, 1280), dtype=float32, sparse=False, name=keras_tensor_515>',) • kwargs=<class 'inspect._empty'>

shafiiganjeh commented 7 months ago

Hi, I could not reproduce an error like this either locally or on Colab. Can you send the full code snippet where this happened?

Edit: The encoder is of class keras.Model; it is not a layer (it does not have a build method). Maybe you meant to call compile on it instead?

muhammad-azmain-mahtab commented 7 months ago

https://www.kaggle.com/code/muhammadazmainmahtab/efficientnetv2-gpt2-captioning

I have attached the Kaggle notebook URL where I ran tutorial.ipynb.

shafiiganjeh commented 7 months ago

The issue is from TensorFlow 2.16 (which Kaggle uses), or more specifically the Keras 3 that ships with it, which broke some naming conventions and other things... As a workaround you can initialize your notebook with TensorFlow 2.15 and make sure that you also use Keras 2.15 (something like `!pip uninstall tensorflow` followed by `!pip install tensorflow==2.15.0` should do). I'll try to fix the issue as soon as possible.
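The TypeError in the traceback fits one of those Keras 3 signature changes: in Keras 2, the first positional parameter of Layer.add_weight() is the weight's name, while in Keras 3 it is shape, so the positional "embeddings" in tf_keras_embedding.py collides with the shape= keyword. A minimal sketch of the collision, using stand-in functions rather than real Keras layers (the signatures below are simplified paraphrases of the two Keras versions):

```python
# Keras 2.x style: add_weight(name=None, shape=None, ...) -- name comes first.
def add_weight_v2(name=None, shape=None, initializer=None):
    return (name, shape)

# Keras 3.x style: add_weight(shape=None, ..., name=None) -- shape comes first.
def add_weight_v3(shape=None, initializer=None, name=None):
    return (name, shape)

# Passing the name positionally (as PositionEmbedding.build does) works under 2.x:
assert add_weight_v2("embeddings", shape=(8, 64)) == ("embeddings", (8, 64))

# Under the 3.x signature, the positional "embeddings" binds to `shape`,
# which then collides with the explicit shape= keyword:
try:
    add_weight_v3("embeddings", shape=(8, 64))
except TypeError as e:
    assert "multiple values" in str(e)

# Passing the name as a keyword is valid under both signatures:
assert add_weight_v3(name="embeddings", shape=(8, 64)) == ("embeddings", (8, 64))
```

So, independently of pinning TensorFlow 2.15, changing the call in PositionEmbedding.build() to self.add_weight(name="embeddings", shape=..., initializer=...) should make it work under both Keras 2 and Keras 3.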