Open SAIVENKATARAJU opened 3 years ago
Unsupervised training does not yet work well. Unsupervised learning is still an active research area, and its performance is not yet as good as that of the pre-trained models.
There are several pre-trained models that work quite well for most use cases. If you need better performance, I recommend creating training data and doing supervised training on it.
But when I check the documentation, the data preparation looks the same for both approaches, especially for semantic similarity. I'm just confused about what makes the difference between these two approaches.
For the supervised approach, you have some labeled or structured data that you exploit, like (question, answer) pairs.
For unsupervised approaches, you just have text without any labels or structure.
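To make the distinction concrete, here is a minimal sketch of what the two kinds of input might look like. The example sentences, the names `supervised_pairs`, `unlabeled_sentences`, `delete_words` and `make_denoising_pairs`, and the word-deletion trick are all illustrative assumptions, not the library's API. Word deletion is just one common way an unsupervised method can derive a training signal from plain text; the actual unsupervised methods may work differently.

```python
import random
from typing import List, Tuple

# Supervised: the structure itself is the label.
# Each (question, answer) pair says "these two texts belong together".
# (Hypothetical examples, not taken from any real dataset.)
supervised_pairs: List[Tuple[str, str]] = [
    ("What is the warranty period?", "The warranty lasts two years."),
    ("How do I reset the device?", "Hold the power button for ten seconds."),
]

# Unsupervised: only plain sentences, with no labels and no pairing.
unlabeled_sentences: List[str] = [
    "The warranty lasts two years.",
    "Hold the power button for ten seconds.",
    "The device ships with a charging cable.",
]


def delete_words(sentence: str, ratio: float, rng: random.Random) -> str:
    """Randomly drop roughly `ratio` of the words, keeping at least one."""
    words = sentence.split()
    if not words:
        return sentence
    kept = [w for w in words if rng.random() > ratio]
    if not kept:
        kept = [rng.choice(words)]
    return " ".join(kept)


def make_denoising_pairs(
    sentences: List[str], ratio: float = 0.6, seed: int = 0
) -> List[Tuple[str, str]]:
    """Build (noisy, original) pairs so the text supervises itself."""
    rng = random.Random(seed)
    return [(delete_words(s, ratio, rng), s) for s in sentences]


if __name__ == "__main__":
    for question, answer in supervised_pairs:
        print("supervised  :", question, "->", answer)
    for noisy, original in make_denoising_pairs(unlabeled_sentences):
        print("unsupervised:", repr(noisy), "->", repr(original))
```

So both approaches do take sentences as input, but in the supervised case someone has already told you which texts go together, while in the unsupervised case the training objective has to invent that signal from the raw text.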
Hi,
I have a bunch of PDFs and I am building a QnA system from them. Currently, I am using the deepset/haystack repo for this task.
My question is: if I want to generate embeddings for my text, which training should I do? What is the difference, given that both approaches mostly take sentences as input?