pythonlessons / mltu

Machine Learning Training Utilities (for TensorFlow and PyTorch)
MIT License
161 stars 101 forks

How to change the decoder to any transformer architecture? #4

Closed deshwalmahesh closed 1 year ago

deshwalmahesh commented 1 year ago

If you want to use a pre-trained Transformer for the same task, how would you use it instead of the LSTM here? For example, if I want to use a lightweight BERT model, what changes would be needed to the lines at the end? I am trying to understand the architecture.

    squeezed = layers.Reshape((x7.shape[-3] * x7.shape[-2], x7.shape[-1]))(x7)

    blstm = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(squeezed)

    output = layers.Dense(output_dim + 1, activation='softmax', name="output")(blstm)

    model = Model(inputs=inputs, outputs=output)

pythonlessons commented 1 year ago

If it's about images, you would need to use an Image Transformer. To answer your question properly, I would need to create a separate tutorial; I can't give you a quick answer without trying it myself.
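
For readers looking for a starting point, here is a minimal sketch of what swapping the BiLSTM for a Transformer encoder could look like in Keras. It is not the maintainer's method and has not been tuned or trained. Several parts are assumptions made only for illustration: the two-layer CNN is a stand-in for the real feature extractor (the layers that produce `x7` are not shown in the question), and `AddPositionEmbedding`, `transformer_encoder_block`, `build_model`, and all hyperparameters are hypothetical names and values. Note also that pre-trained BERT checkpoints expect text token IDs as input, so they cannot simply be dropped in on top of CNN features; the sketch uses a small Transformer encoder trained from scratch instead.

```python
from tensorflow.keras import layers, Model


class AddPositionEmbedding(layers.Layer):
    """Hypothetical helper: adds a learned position vector to every time step."""

    def build(self, input_shape):
        # One trainable vector per sequence position: (seq_len, features).
        self.pos = self.add_weight(
            name="pos",
            shape=(input_shape[1], input_shape[2]),
            initializer="random_normal",
            trainable=True,
        )

    def call(self, x):
        return x + self.pos  # broadcasts over the batch dimension


def transformer_encoder_block(x, num_heads=4, key_dim=16, ff_dim=128, dropout=0.1):
    """One post-norm Transformer encoder block (self-attention + feed-forward)."""
    attn = layers.MultiHeadAttention(
        num_heads=num_heads, key_dim=key_dim, dropout=dropout
    )(x, x)  # query and value are the same tensor: self-attention
    x = layers.LayerNormalization(epsilon=1e-6)(layers.Add()([x, attn]))
    ff = layers.Dense(ff_dim, activation="relu")(x)
    ff = layers.Dense(x.shape[-1])(ff)
    ff = layers.Dropout(dropout)(ff)
    return layers.LayerNormalization(epsilon=1e-6)(layers.Add()([x, ff]))


def build_model(input_shape, output_dim, d_model=64, num_blocks=2):
    inputs = layers.Input(shape=input_shape, name="input")

    # ASSUMPTION: stand-in CNN; replace with the real layers that produce x7.
    x = layers.Conv2D(16, 3, strides=2, padding="same", activation="relu")(inputs)
    x7 = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(x)

    # Same flattening step as in the question: (H, W, C) -> (H * W, C).
    squeezed = layers.Reshape((x7.shape[-3] * x7.shape[-2], x7.shape[-1]))(x7)

    # Project to the Transformer width and add position information.
    seq = layers.Dense(d_model)(squeezed)
    seq = AddPositionEmbedding()(seq)

    # These blocks take the place of the Bidirectional LSTM layer.
    for _ in range(num_blocks):
        seq = transformer_encoder_block(seq)

    # Output layer unchanged from the question.
    output = layers.Dense(output_dim + 1, activation="softmax", name="output")(seq)
    return Model(inputs=inputs, outputs=output)


model = build_model(input_shape=(32, 128, 3), output_dim=10)
print(model.output_shape)
```

Because the output keeps the same (batch, time steps, classes) layout as the LSTM version, the loss and decoding code built around that layout should presumably still apply, though the learning rate and other training settings would likely need to be revisited.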