mrenlivex closed 1 week ago
Hello!
Dense NLP models remain black boxes that can make mistakes, especially when the inputs differ substantially from those seen during training. https://huggingface.co/cross-encoder/quora-distilroberta-base was trained on Quora questions, so it's not too surprising that it doesn't do well with an input that isn't a question at all:
"FAQ"
Perhaps you'll have better luck with another cross-encoder, but there will likely always be outliers where a model gives odd results. I'll close this for now, as I don't think we can fix this issue outright.
similarity between
"What are the differences between iHealth Nexus Wireless Body Composition Scale (HS2S) (Nexus/Fit) and iHealth Nexus Pro Wireless Body Composition Scale (HS2S Pro) (NEXUS PRO)"
and
"FAQ"
0.9063025712966919
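
For anyone who wants to check the pair above against other cross-encoders, here is a minimal sketch. It assumes `sentence-transformers` is installed and that the model's `predict` returns one score per pair. `load_cross_encoder` and `compare_scorers` are hypothetical helper names used for illustration only; they are not part of the library.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Pair = Tuple[str, str]
# A scorer maps a list of (text_a, text_b) pairs to one score per pair.
Scorer = Callable[[List[Pair]], Sequence[float]]


def load_cross_encoder(
    model_name: str = "cross-encoder/quora-distilroberta-base",
) -> Scorer:
    """Hypothetical helper: return the predict method of a CrossEncoder.

    Imported lazily because it needs sentence-transformers and downloads
    the model weights on first use.
    """
    from sentence_transformers import CrossEncoder

    return CrossEncoder(model_name).predict


def compare_scorers(pair: Pair, scorers: Dict[str, Scorer]) -> Dict[str, float]:
    """Hypothetical helper: score one pair with every scorer, keyed by name."""
    return {name: float(scorer([pair])[0]) for name, scorer in scorers.items()}


# Example usage (not run here, since it downloads a model):
#   question = "What are the differences between iHealth Nexus ... (NEXUS PRO)"
#   scorers = {"quora-distilroberta-base": load_cross_encoder()}
#   print(compare_scorers((question, "FAQ"), scorers))
```

Adding more entries to the `scorers` dict (other cross-encoder checkpoints you trust) makes it easy to see whether a high score like this one is specific to a single model. Note that different models may use different score ranges, so compare with care.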