Incorrect values for sentences from get_word_distance and get_nn

pommedeterresautee / fastrtext

R wrapper for fastText


Hello, first of all, thank you for this package. I'm interested in cosine similarities between sentences, or between a word and a sentence. I believe the following code produces correct results:

pv <- get_sentence_representation(mod, c("she was", "and to") )
pv <- t(pv)

# using lsa package
lsa::cosine(pv)

# manual
v1 <- as.numeric(pv[,1])
v2 <- as.numeric(pv[,2])
sum(v1*v2) / ( sqrt(sum(v1*v1)) * sqrt(sum(v2*v2)) )

The manual computation and lsa::cosine produce the same result. However, I get a different value when I use get_word_distance (its similarity score matches the one reported by get_nn):

1 - get_word_distance(mod, "she was", "and to")
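For reference, the manual computation can be wrapped in a small helper. This is only a minimal sketch: `cosine_sim` is a hypothetical name, not a fastrtext function, and the toy vectors stand in for two rows of the matrix returned by get_sentence_representation.

```r
# Hypothetical helper (not part of fastrtext): cosine similarity of two numeric vectors.
cosine_sim <- function(a, b) {
  a <- as.numeric(a)
  b <- as.numeric(b)
  stopifnot(length(a) == length(b))
  sum(a * b) / (sqrt(sum(a * a)) * sqrt(sum(b * b)))
}

# Toy vectors standing in for two sentence embeddings.
u <- c(1, 0, 1)
v <- c(1, 1, 0)
sim <- cosine_sim(u, v)  # 1 / (sqrt(2) * sqrt(2)) = 0.5

# With a loaded model `mod` (assumed here), the same helper would be applied as:
# pv <- get_sentence_representation(mod, c("she was", "and to"))
# cosine_sim(pv[1, ], pv[2, ])
```

The helper only reproduces the manual formula, so it should agree with lsa::cosine on the same pair of vectors.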

Is it correct that get_word_distance does not work with sentences? If so, it would be very helpful to get an error message rather than a silently returned value.
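To illustrate the kind of check I have in mind, here is a minimal sketch of an input guard. `assert_single_word` is a hypothetical function, not part of fastrtext, and treating any whitespace as "this is a sentence" is my own assumption about what should count as invalid input.

```r
# Hypothetical guard (not part of fastrtext): stop if any input contains whitespace,
# i.e. looks like a sentence rather than a single word.
assert_single_word <- function(x) {
  stopifnot(is.character(x))
  bad <- grepl("[[:space:]]", x)
  if (any(bad)) {
    stop("expected single words, got: ", paste(shQuote(x[bad]), collapse = ", "))
  }
  invisible(x)
}

assert_single_word("she")  # passes silently

# Capture the error message raised for a multi-word input.
msg <- tryCatch(
  {
    assert_single_word(c("she was", "and"))
    "no error"
  },
  error = function(e) conditionMessage(e)
)

# With a loaded model `mod` (assumed here), the guard could run before the call:
# assert_single_word(c(w1, w2)); get_word_distance(mod, w1, w2)
```

A check like this inside get_word_distance and get_nn would turn the silent result into an explicit error.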

Thank you, Luca

Incorrect values for sentences from get_word_distance and get_nn #30