WillKoehrsen closed this issue 6 years ago
Yeah, this is sloppily formulated. Here's the full(er) story. Originally we wanted to create a question-answering model like the one you are suggesting, but we couldn't quite make it work with a model the size of a notebook (or even beyond that). So we settled for something less ambitious: finding similar questions. We do this by learning the association between a title and a body, which gives you an embedding space for the titles, and you can then search that space for nearest neighbors. Sorry about the confusion!
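For anyone reading along, the nearest-neighbor step can be sketched roughly as below. This is only an illustration, not the code from the book: the function name `nearest_titles`, the tiny hand-written vectors, and the choice of cosine similarity are all assumptions, and in practice the vectors would come from the trained title encoder.

```python
import numpy as np

def nearest_titles(query_vec, title_vecs, k=3):
    """Return indices and cosine similarities of the k titles closest to query_vec."""
    # Normalize so that a dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    t = title_vecs / np.linalg.norm(title_vecs, axis=1, keepdims=True)
    sims = t @ q
    # Sort by descending similarity and keep the top k.
    top = np.argsort(-sims)[:k]
    return top, sims[top]

# Placeholder embeddings (hypothetical); a real run would use the model's output.
title_vecs = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

# Query with the first title's own vector: it should match itself first,
# followed by the title whose vector points in nearly the same direction.
idx, scores = nearest_titles(title_vecs[0], title_vecs, k=2)
```

With thousands of titles, a brute-force scan like this may still be fast enough; an approximate nearest-neighbor index would be the usual next step if it is not.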
No problem! I enjoyed the example and found it useful; I was just a little confused by the wording. Thanks for explaining the full backstory (maybe you could add the explanation to the book).
It would be an interesting project to take on that question-answering model!
In the code for Chapter 6, it seems to me that the model learns to associate the title of a question with its body, not with the answer. Is this the correct interpretation? I'm a little confused because in numerous places the text appears to state that the model learns to associate questions with answers, such as here:
This is somewhat confusing because in most cases, the body is itself a question. I think the terms title and body should be used instead of question and answer.
It seems like it would be possible to associate the questions with the accepted answers since these are also included in the dataset.