Open tomazweiss opened 2 years ago
We can definitely change course, but the original example worked and seemed clear to me.

```r
library(tidyverse)
library(huggingfaceR)

sentences <- c(
  "Baby turtles are so cute!",
  "He walks as slowly as a turtle.",
  "The lake is cold today.",
  "I enjoy swimming in the lake."
)

model <- hf_load_sentence_model('paraphrase-MiniLM-L6-v2')
embeddings <- model$encode(sentences)
embeddings

# Pairwise distances between sentence embeddings, in long format
embeddings %>%
  dist() %>%
  as.matrix() %>%
  as.data.frame() %>%
  setNames(sentences) %>%
  mutate(`sentence 1` = sentences) %>%
  pivot_longer(
    cols = -`sentence 1`,
    names_to = 'sentence 2',
    values_to = 'distance'
  ) %>%
  filter(distance > 0)

# Project the sentences onto the first two principal components and plot them
embeddings %>%
  t() %>%
  prcomp() %>%
  pluck('rotation') %>%
  as.data.frame() %>%
  mutate(sentence = sentences) %>%
  ggplot(aes(PC1, PC2)) +
  geom_label(aes(PC1, PC2, label = sentence, vjust = "inward", hjust = "inward")) +
  theme_minimal()
```
I ran @samterfa's code above and it succeeded once backticks were in place around "sentence 1". This matches what we have in the example, so it should be good to go.
```r
library(tidyverse)
library(huggingfaceR)

sentences <- c(
  "Baby turtles are so cute!",
  "He walks as slowly as a turtle.",
  "The lake is cold today.",
  "I enjoy swimming in the lake."
)

model <- hf_load_sentence_model('paraphrase-MiniLM-L6-v2')
embeddings <- model$encode(sentences)
embeddings

# Pairwise distances between sentence embeddings, in long format
embeddings %>%
  dist() %>%
  as.matrix() %>%
  as.data.frame() %>%
  setNames(sentences) %>%
  mutate(`sentence 1` = sentences) %>%
  pivot_longer(
    cols = -`sentence 1`,
    names_to = 'sentence 2',
    values_to = 'distance'
  ) %>%
  filter(distance > 0)

# Project the sentences onto the first two principal components and plot them
embeddings %>%
  t() %>%
  prcomp() %>%
  pluck('rotation') %>%
  as.data.frame() %>%
  mutate(sentence = sentences) %>%
  ggplot(aes(PC1, PC2)) +
  geom_label(aes(PC1, PC2, label = sentence, vjust = "inward", hjust = "inward")) +
  theme_minimal()
```
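
For anyone wondering why the backticks around "sentence 1" matter: a column name containing a space is not a syntactic R name, so it only parses when quoted with backticks. The base R sketch below illustrates this. The small `embeddings` matrix is a made-up stand-in for `model$encode(sentences)`, and `df` and `x` inside the parsed strings are placeholder names, not anything from the package.

```r
sentences <- c("Baby turtles are so cute!", "The lake is cold today.")

# Assumption: a tiny hand-made matrix standing in for model$encode(sentences),
# with one row per sentence.
embeddings <- matrix(c(0, 0, 3, 4), nrow = 2, byrow = TRUE)

# Same first steps as the example: distance matrix -> data frame with
# one column per sentence.
distances <- as.data.frame(as.matrix(dist(embeddings)))
names(distances) <- sentences

# Backticks allow a column name that contains a space.
distances$`sentence 1` <- sentences

# Without backticks the same name does not even parse.
unquoted_ok <- tryCatch({
  parse(text = "mutate(df, sentence 1 = x)")
  TRUE
}, error = function(e) FALSE)

# With backticks the expression parses fine.
quoted_ok <- tryCatch({
  parse(text = "mutate(df, `sentence 1` = x)")
  TRUE
}, error = function(e) FALSE)

print(c(unquoted = unquoted_ok, quoted = quoted_ok))
```

This also explains why the backticks can go missing when the code is pasted into a comment inside single-backtick inline code: the inner backticks are consumed as markdown delimiters.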
@tomazweiss's example is really close to this too. @tomazweiss, could you point us to the error you are getting?
It looks like @jpcompartir changed the example here. Maybe he was seeing an error?
@farach, I was correcting the example in this file: https://github.com/farach/huggingfaceR/blob/main/R/sentence-transformers.R , which is different from what @samterfa pasted above.
There is a typo (embddings), and the example updates the embeddings object but then uses the previous version in the plot.
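
To make the failure mode concrete, here is a minimal base R sketch of how a misspelled assignment target leaves the original object unchanged, so later code silently works on stale data. This is a hypothetical reconstruction of the pattern, not the actual code from `sentence-transformers.R`; the random matrix and the `t()` update step are assumptions chosen only for illustration.

```r
set.seed(42)

# Assumption: a random matrix standing in for model$encode(sentences).
embeddings <- matrix(rnorm(4 * 6), nrow = 4)
original <- embeddings

# Buggy update: the misspelled target creates a second object, `embddings`,
# and leaves `embeddings` untouched. R raises no error or warning here.
embddings <- t(embeddings)

# Anything that reads `embeddings` afterwards (for example a plot) still
# sees the old, un-updated object.
plot_input_is_stale <- identical(embeddings, original)

# Fixed update: assign to the correctly spelled name.
embeddings <- t(embeddings)
plot_input_is_updated <- identical(embeddings, t(original))

print(c(stale = plot_input_is_stale, updated = plot_input_is_updated))
```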
> It looks like @jpcompartir changed the example here. Maybe he was seeing an error?
This just looks like me being careless. By the by, the examples (across the package) were temporarily removed to speed up running check() (I also didn't feel like adding them to .Rbuildignore). They're likely to be put back in as usage, or to figure in vignettes. So I may do the same here until we're ready to go with the release and tests etc. have been added appropriately.
Previous example didn't work.