It also seems that this implementation is written specifically for images, because I saw that the constructor of MPS has the following operation:
if label_site is None:
label_site = input_dim // 2
assert label_site >= 0 and label_site <= input_dim
Let me know if I'm wrong
Yes, a recurrent module is currently in development and should be included in the master branch soon. In the meantime, the TI_MPS class currently included in the dynamic_capacity branch uses many copies of a single repeated core tensor to evaluate an input sequence, and should provide the recurrent functionality you're looking for.
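For reference, a minimal usage sketch of the recurrent TI_MPS class described above (the import path and the constructor keywords input_dim, output_dim, and bond_dim are assumptions here, so check the class definition in the repo for the actual signature):

```python
import torch
from torchmps import TI_MPS  # import path is an assumption

# Hypothetical constructor call; the real keyword names may differ.
ti_mps = TI_MPS(input_dim=16, output_dim=5, bond_dim=20)

# A single repeated core tensor is applied at every position of the sequence,
# so sequences of different lengths can share the same parameters.
seq_batch = torch.randn(8, 50, 16)   # (batch_size, seq_length, input_dim)
output = ti_mps(seq_batch)           # expected shape: (batch_size, output_dim)
```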
The input data to an MPS instance can be anything, as long as it has a fixed size. The original code was written with image data in mind, but there's nothing in the code which specializes to that case.
if label_site is None:
    label_site = input_dim // 2
assert label_site >= 0 and label_site <= input_dim
The input is processed in a linear fashion (since MPS is a linear data structure), and that code block just handles where in that linear sequence the output is generated. This output placement won't fundamentally change the expressivity of the MPS model, but could bias the model to be more sensitive to certain regions of the input neighboring the output site.
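As a concrete sketch of this point (label_site comes straight from the quoted constructor code; the bond_dim keyword and import path are assumptions beyond that snippet):

```python
import torch
from torchmps import MPS  # import path is an assumption

# Any fixed-size input works, e.g. a flattened 28x28 image or a generic
# 784-dimensional feature vector.
mps_middle = MPS(input_dim=28 * 28, output_dim=10, bond_dim=20)
# Place the output core at the start of the chain instead of the middle.
mps_start = MPS(input_dim=28 * 28, output_dim=10, bond_dim=20, label_site=0)

batch = torch.randn(32, 28 * 28)   # (batch_size, input_dim)
print(mps_middle(batch).shape)     # both give (batch_size, output_dim);
print(mps_start(batch).shape)      # only the output placement differs
```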
Just a follow-up that the recurrent TI_MPS class has been cleaned up, and can now be found in the master branch. Matching documentation should be uploaded soon!
Great. I will try it out.
If I understand the code correctly, this class takes input of size (batch_size, seq_length, input_dim) and gives output of size (batch_size, output_dim). Is it possible to have another class with output size (batch_size, seq_length, output_dim)?
Good question! That would technically be possible, but it doesn't mix well with the structure of tensor networks. Although MPS are sequential models, the fact that all the operations are (multi-)linear makes their evaluation much more flexible than that of traditional RNNs. For example, MPS are highly parallelizable and can be evaluated in a depth that is only logarithmic in the sequence length, something that becomes impossible once we start requiring copies of the hidden state vectors at each step of the evaluation (this is loosely related to the no-cloning theorem of quantum mechanics).
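To make the log-depth point concrete, here is a small illustration (plain PyTorch, not TorchMPS code) of contracting a chain of transfer matrices pairwise instead of left to right:

```python
import torch

def chain_contract(mats):
    """Multiply a list of square matrices.

    An RNN-style left-to-right pass needs depth O(n), but because matrix
    multiplication is associative we can contract neighboring pairs in
    parallel, giving depth O(log n).
    """
    while len(mats) > 1:
        paired = []
        for i in range(0, len(mats) - 1, 2):
            paired.append(mats[i] @ mats[i + 1])  # all pairs are independent
        if len(mats) % 2 == 1:
            paired.append(mats[-1])               # carry the leftover matrix
        mats = paired
    return mats[0]

# Toy example: 8 bond_dim x bond_dim matrices contract in 3 parallel rounds.
cores = [torch.randn(4, 4) for _ in range(8)]
result = chain_contract(cores)
print(result.shape)  # torch.Size([4, 4])
```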
Because the underlying contraction methods assume this type of flexibility, it would require some significant changes to the code to implement the class you're talking about. I would like to rewrite the contraction engine once I have a more complete picture of what tasks users are applying TorchMPS towards, but right now I don't see a sequence-to-sequence model being developed anytime soon. Sorry!
Thanks. I have to learn more about this.
Do you plan to make this module recurrent, like torch.nn.LSTM? Just wondering if we can apply this to RNN tasks.