AsmaZbt closed 6 years ago
Sorry for the late reply. Could you please provide a URL to this line of code?
Hello! No problem, thank you for replying ^_^ This is the function from the previous version:
def similar_top_opt3(vec, words, topn=200, nthreads=12, freq=None):
    vec.init_sims()
    indices = [vec.vocab[w].index for w in words if w in vec.vocab]
    vecs = vec.syn0norm[indices]
    dists = np.dot(vecs, vec.syn0norm.T)
    if freq is not None:
        dists = dists * np.log(freq)
    if nthreads==1:
        res = dists2neighbours(vec, dists, indices, topn)
    else:
        batchsize = int(ceil(1. * len(indices) / nthreads))
        print >> stderr, "dists2neighbours for %d words in %d threads, batchsize=%d" % (len(indices), nthreads, batchsize)
        def ppp(i):
            return dists2neighbours(vec, dists[i:i+batchsize], indices[i:i+batchsize], topn)
        lres = parallel_map(ppp, range(0,len(indices),batchsize), threads=nthreads)
        res = OrderedDict()
        for lr in lres:
            res.update(lr)
    return res
Thank you so much (y)
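For readers following along, the core of the quoted function (a dot product between unit-length vectors, then keeping the best matches) can be sketched in plain NumPy. This is only an illustration under assumptions: `top_neighbours` is a hypothetical helper, not the repository's `dists2neighbours`, the vectors are random stand-ins for `vec.syn0norm`, and skipping the query word itself is an assumption about the desired behaviour.

```python
import numpy as np

def top_neighbours(norm_vectors, query_indices, topn=3):
    """Hypothetical helper: indices and scores of the topn most similar rows."""
    queries = norm_vectors[query_indices]
    # Rows have unit length, so the dot product equals cosine similarity.
    sims = queries @ norm_vectors.T
    # Assumption: a word should not be listed as its own neighbour.
    sims[np.arange(len(query_indices)), query_indices] = -np.inf
    # Sort each row by descending similarity and keep the first topn columns.
    order = np.argsort(-sims, axis=1)[:, :topn]
    return order, np.take_along_axis(sims, order, axis=1)

# Random stand-in for a normalized embedding matrix (6 words, 4 dimensions).
rng = np.random.default_rng(0)
vectors = rng.normal(size=(6, 4))
norm_vectors = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)

neighbour_ids, neighbour_sims = top_neighbours(norm_vectors, [0, 2], topn=3)
```

Each row of `neighbour_ids` lists the nearest rows for one query word, and `neighbour_sims` holds the matching cosine similarities in descending order.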
Sorry for such a late answer: this part of the code was provided by a contributor and I was not sure about it. In fact, this code has since been removed from the repository because we now use Facebook's FAISS library to compute nearest neighbors instead of this numpy code.
I'm not sure if I can ask here about the previous version of sensegram, so excuse me if this is not the right place. I think the new version is too advanced for me, so I prefer starting from the beginning.
In the function similar_top_opt3(...), you compute the distances and then combine them with the array of frequencies. The distances are computed like this:
vecs = vec.syn0norm[indices]
dists = np.dot(vecs, vec.syn0norm.T)
I do not understand why you multiplied the distances by the log of the frequencies. Could you please explain?
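The thread does not record the contributor's reasoning, so the following is only a sketch of what the multiplication does numerically, not a statement of intent. With made-up similarities and made-up corpus counts (both are assumptions for illustration), weighting by `np.log(freq)` pushes frequent words up the ranking and rare words down:

```python
import numpy as np

# Made-up cosine similarities of one query word to three candidate words.
sims = np.array([[0.90, 0.85, 0.60]])
# Made-up corpus frequencies of the same three candidates.
freq = np.array([3.0, 500.0, 20000.0])

# Same operation as in the quoted function: freq is broadcast across the
# columns, so every candidate's score is scaled by the log of its frequency.
weighted = sims * np.log(freq)

rank_before = np.argsort(-sims[0]).tolist()     # order by raw similarity
rank_after = np.argsort(-weighted[0]).tolist()  # order after weighting
```

Here the rare candidate ranks first on raw similarity but last after weighting, while the most frequent candidate moves to the top. One side effect worth noting: a word with frequency 1 gets a score of zero, because log(1) = 0.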