dbiir / UER-py

Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo
https://github.com/dbiir/UER-py/wiki
Apache License 2.0
3.01k stars, 525 forks

Why does cosine similarity come out negative? Calling uer/sbert-base-chinese-nli with sentence_transformers' util.cos_sim() #344

Open peter65374 opened 2 years ago

peter65374 commented 2 years ago

While running a unit test, an example sentence pair produced a negative cosine similarity. What is going on here? Is this a problem with the util.cos_sim function?

        # Import needed by the two fragments below
        from sentence_transformers import SentenceTransformer, util

        try:
            logger.info("START - loading Sen-SIMILARITY model")
            # model = SentenceTransformer('distiluse-base-multilingual-cased-v2')
            model = SentenceTransformer('uer/sbert-base-chinese-nli')  # the uer model performs much better on Chinese
            logger.info("FINISH - loading Sen-SIMILARITY model")
        except Exception as e:
            # Pass the exception through a %s placeholder so logging can format it
            logger.warning("Exception thrown while initialising the pretrained model: %s", e)

        try:
            # Compute an embedding for each input
            embedding1 = model.encode(sentence1)
            embedding2 = model.encode(sentence2)

            # Compute cosine similarities (util.cos_sim returns a torch.Tensor)
            simcos = util.cos_sim(embedding1, embedding2)

            return simcos
        except Exception as e:
            logger.warning("Exception thrown while computing similarity: %s", e)
            return None
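
For reference on the question above: cosine similarity is the dot product of two vectors divided by the product of their norms, so it can fall anywhere in [-1, 1]. A negative score is therefore not in itself evidence of a bug in util.cos_sim. The sketch below is a plain-Python illustration and not the library's implementation. The helper name `cosine_similarity` and the toy vectors are made up for this example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (hypothetical helper)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy vectors, not real sentence embeddings.
same = cosine_similarity([1.0, 2.0], [2.0, 4.0])        # same direction
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])  # right angle
opposed = cosine_similarity([1.0, -0.5], [-1.0, 0.2])   # mostly opposite directions

print(same, orthogonal, opposed)
```

The third pair has a negative dot product, so its score is below zero. Sentence embeddings can contain negative components as well, so the same can happen for two sentences whose vectors point in largely opposite directions.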