The main goal of this work is to introduce techniques for learning high-quality chord embedding vectors from sequences of polyphonic music. We aim to achieve this by finding chord representations that are useful for predicting the neighboring chords in a musical piece.
Please refer to the written report for the notation and further details.
This model assumes conditional independence between the notes in the context chord c given a chord d, so the probability of c given d factorizes into a product of per-note probabilities.
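To make the independence assumption concrete, here is a minimal sketch. It assumes a chord is encoded as a binary vector over a small note vocabulary and that the per-note probabilities p(c_i = 1 | d) are already available. The encoding, the function name `log_prob_independent`, and the numbers are illustrative assumptions, not taken from the report.

```python
import math

def log_prob_independent(context, note_probs):
    """log p(c | d) = sum_i log p(c_i | d) for a binary chord vector c.

    note_probs[i] stands in for p(c_i = 1 | d); how it is computed from d
    is left open here (assumption: some model has already produced it).
    """
    total = 0.0
    for c_i, p_i in zip(context, note_probs):
        # Each note contributes independently of the other notes in c.
        total += math.log(p_i if c_i == 1 else 1.0 - p_i)
    return total

# Hypothetical 4-note vocabulary and made-up per-note probabilities.
note_probs = [0.9, 0.2, 0.7, 0.1]
context = [1, 0, 1, 0]
log_p = log_prob_independent(context, note_probs)
```

Because the notes are treated independently, the log-probability of a chord is just a sum of per-note terms, which keeps evaluation cheap but cannot capture interactions between the notes of c.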
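This model decomposes the probability distribution of the context chord c given a chord d according to the chain rule, so that each note of c is conditioned on d and on the notes of c that precede it.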
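The chain-rule decomposition can be sketched as follows. The example again assumes a binary chord vector with a fixed note ordering, and `note_on_prob` is a toy logistic conditional with made-up weights standing in for p(c_i = 1 | d, c_1, ..., c_{i-1}). It is a hypothetical placeholder, not the parametrization used in the report.

```python
import math
from itertools import product

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def note_on_prob(d, prefix, bias=-0.5, w_input=1.0, w_prev=0.8):
    """Toy stand-in for p(c_i = 1 | d, c_1..c_{i-1}), with i = len(prefix)."""
    i = len(prefix)
    prev = prefix[-1] if prefix else 0
    return sigmoid(bias + w_input * d[i] + w_prev * prev)

def log_prob_chain(c, d):
    """log p(c | d) = sum_i log p(c_i | d, c_1..c_{i-1})."""
    total = 0.0
    for i, c_i in enumerate(c):
        p_on = note_on_prob(d, c[:i])  # condition on the notes decided so far
        total += math.log(p_on if c_i == 1 else 1.0 - p_on)
    return total

d = [1, 0, 1, 0]
# Sanity check: the probabilities of all 2^4 possible context chords sum to 1.
total_mass = sum(
    math.exp(log_prob_chain(list(c), d)) for c in product([0, 1], repeat=len(d))
)
```

Since every conditional is itself a normalized distribution over {0, 1}, the product defines a valid distribution over chords while still allowing each note to depend on the earlier ones.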
Sequence-to-sequence models learn a mapping from input sequences of varying length (a chord) to output sequences, also of varying length (a neighboring chord). This model uses a neural network architecture known as the RNN Encoder-Decoder. It estimates the conditional probability of a context chord c given an input chord d by first obtaining a fixed-length vector representation v of the input chord (given by the last state of the LSTM encoder) and then computing the probability of c with the LSTM decoder.
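The encode-then-decode flow described above can be sketched in plain Python. To keep the example short, a simple tanh recurrent cell is used in place of the LSTM cell, the weights are random and untrained, and chords are assumed to be sequences of note tokens terminated by an end-of-chord token. All names and sizes (`VOCAB`, `HIDDEN`, `encode`, `log_prob_context`) are hypothetical.

```python
import math
import random

VOCAB = 5    # assumption: 4 note tokens (0..3) plus one end-of-chord token
EOS = 4      # end-of-chord token, also reused as the decoder start symbol
HIDDEN = 3   # size of the fixed-length representation v

rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def one_hot(token):
    return [1.0 if i == token else 0.0 for i in range(VOCAB)]

def rnn_step(w_in, w_rec, token, h):
    """One tanh recurrent step (a simplification standing in for an LSTM cell)."""
    a = matvec(w_in, one_hot(token))
    b = matvec(w_rec, h)
    return [math.tanh(x + y) for x, y in zip(a, b)]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

# Untrained, randomly initialised parameters (illustration only).
enc_in, enc_rec = rand_matrix(HIDDEN, VOCAB), rand_matrix(HIDDEN, HIDDEN)
dec_in, dec_rec = rand_matrix(HIDDEN, VOCAB), rand_matrix(HIDDEN, HIDDEN)
dec_out = rand_matrix(VOCAB, HIDDEN)

def encode(d):
    """Read the input chord note by note; the last state is the vector v."""
    h = [0.0] * HIDDEN
    for token in d:
        h = rnn_step(enc_in, enc_rec, token, h)
    return h

def log_prob_context(c, d):
    """log p(c | d): decode c token by token, starting from the encoder state v."""
    h = encode(d)
    prev = EOS
    total = 0.0
    for token in list(c) + [EOS]:
        h = rnn_step(dec_in, dec_rec, prev, h)
        probs = softmax(matvec(dec_out, h))  # distribution over the next token
        total += math.log(probs[token])
        prev = token
    return total

v = encode([0, 2])                           # a two-note input chord
log_p = log_prob_context([1, 3], [0, 2, 3])  # chords of different lengths
```

Whatever the length of the input chord, `encode` returns a vector of size `HIDDEN`, and the decoder can score context chords of any length because it emits one token at a time until the end-of-chord token.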