This PR implements a baseline siamese style architecture for sentence selection.
We basically encode the question and each candidate sentence with some encoder, then take the cosine similarity and normalize to get a probability distribution over candidate sentences.
This PR implements a baseline siamese style architecture for sentence selection.
We basically encode the question and each candidate sentence with some encoder, then take the cosine similarity and normalize to get a probability distribution over candidate sentences.