keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

[Request] A more comprehensive guide/API for creating seq2seq models #12144

Closed MrinalJain17 closed 3 years ago

MrinalJain17 commented 5 years ago

Hey guys,

I've been using Keras for quite a while now, and since most of my work has involved unstructured data (images, videos), I am familiar with how to use the Keras API (both Sequential and functional) for convolutional neural networks in general.

Recently, I've started working on time series data, and I am trying to make use of the seq2seq model architecture for it. I have gone through a lot of blog posts, tutorials, and code, but none of these resources explain how to actually create a reusable, modular seq2seq model compatible with the standard Keras workflow.

I have also looked at some promising libraries based on Keras (like seq2seq by @farizrahman4u ), but it seems like they are not in active development (just based on the last commit).

Also, most of the available resources take NLP tasks as the base to elaborate upon the theme of seq2seq models, where the workflow is a bit different than it would be for time series data.

I think some example/tutorial in the docs explaining the usage/caveats of implementing such an architecture in Keras would be really helpful. From what I have learned so far, I think the Model sub-classing API could be the key for implementing the seq2seq models. If yes, how can one most efficiently and correctly make a generic framework for seq2seq models?

Up until now, this is what I had in my mind -

  1. Have an Encoder class
  2. A Decoder class
  3. A high-level Seq2Seq class that encapsulates the encoder and the decoder, implements various training techniques (like teacher forcing), and handles inference of the output.
import keras

class Encoder(keras.Model):
    def __init__(self, input_shape, hidden_units, depth, cell_type="GRU"):
        super().__init__()
        self.input_shape = input_shape
        self.hidden_units = hidden_units
        self.depth = depth  # No. of LSTM/GRU cells/layers
        self.cell_type = cell_type

    def build(self):
        # Build the model
        pass

class Decoder(keras.Model):
    def __init__(self, input_shape, hidden_units, depth, initial_state, cell_type="GRU"):
        super().__init__()
        self.input_shape = input_shape
        self.hidden_units = hidden_units
        self.depth = depth  # No. of LSTM/GRU cells/layers
        self.initial_state = initial_state  # The last state of the encoder
        self.cell_type = cell_type

    def build(self):
        # Build the model
        pass

class Seq2Seq(keras.Model):
    def __init__(self, encoder, decoder):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder

    def build(self):
        # Build the model
        pass

The benefit of such an API is that one could build different decoders (e.g., one with attention) and just pass the instance to the high-level Seq2Seq model. This could also serve as a solid example of how to use the model sub-classing API (although I might be wrong).
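For illustration, a minimal runnable version of this three-class sketch might look like the following. All names, shapes, and hyperparameters here are hypothetical, and it assumes tf.keras with GRU cells, a single recurrent layer, and teacher forcing (the decoder is fed the shifted target sequence during training):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

class Encoder(keras.Model):
    """Consumes the source sequence and returns only its final state."""
    def __init__(self, hidden_units):
        super().__init__()
        self.rnn = layers.GRU(hidden_units, return_state=True)

    def call(self, inputs):
        _, state = self.rnn(inputs)
        return state

class Decoder(keras.Model):
    """Runs on the (teacher-forced) target sequence, seeded with the
    encoder's final state, and projects to the output dimension."""
    def __init__(self, hidden_units, output_dim):
        super().__init__()
        self.rnn = layers.GRU(hidden_units, return_sequences=True)
        self.out = layers.Dense(output_dim)

    def call(self, inputs, initial_state):
        x = self.rnn(inputs, initial_state=initial_state)
        return self.out(x)

class Seq2Seq(keras.Model):
    """Encapsulates any encoder/decoder pair with compatible states."""
    def __init__(self, encoder, decoder):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder

    def call(self, inputs):
        source, target = inputs  # target is the shifted ground truth
        state = self.encoder(source)
        return self.decoder(target, initial_state=state)

# Hypothetical usage with dummy time-series batches:
model = Seq2Seq(Encoder(hidden_units=8), Decoder(hidden_units=8, output_dim=3))
source = np.zeros((2, 5, 4), dtype="float32")   # (batch, src_steps, features)
target = np.zeros((2, 6, 3), dtype="float32")   # (batch, tgt_steps, features)
outputs = model([source, target])
```

Because `Seq2Seq` only touches the encoder's returned state and the decoder's `initial_state` argument, swapping in a different decoder (e.g., one with attention) only requires it to accept the same two inputs.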

This request might be unconventional, but any help is appreciated. I am really struggling to figure out how to do such a thing.

Thanks.

Related issues -

#5738

#4885

gabrieldemarmiesse commented 5 years ago

Have you checked out these resources?

https://blog.keras.io/a-ten-minute-introduction-to-sequence-to-sequence-learning-in-keras.html
https://github.com/keras-team/keras/blob/master/examples/lstm_seq2seq.py

We'll add the files present in the examples directory to the Keras docs in the future.
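The training-time pattern in those resources boils down to one functional-API model with two inputs: the source sequence for the encoder, and the shifted target sequence for the decoder (teacher forcing). A hedged sketch of that pattern, adapted from NLP one-hot tokens to continuous time-series features (the feature counts and latent size below are hypothetical), using tf.keras:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

latent_dim = 16        # hypothetical size of the LSTM state
num_enc_features = 4   # hypothetical source feature count
num_dec_features = 3   # hypothetical target feature count

# Encoder: consume the source sequence, discard the outputs,
# keep only the final hidden and cell states.
encoder_inputs = keras.Input(shape=(None, num_enc_features))
_, state_h, state_c = layers.LSTM(latent_dim, return_state=True)(encoder_inputs)
encoder_states = [state_h, state_c]

# Decoder: run on the shifted target sequence (teacher forcing),
# seeded with the encoder's final states.
decoder_inputs = keras.Input(shape=(None, num_dec_features))
decoder_lstm = layers.LSTM(latent_dim, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
decoder_outputs = layers.Dense(num_dec_features)(decoder_outputs)

model = keras.Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer="adam", loss="mse")
```

At inference time the linked examples build separate encoder and decoder models that reuse these trained layers and feed the decoder's own predictions back in step by step; that sampling loop is the part that differs most between the NLP examples and a time-series setup.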