import numpy as np
import torch
np.random.seed(4)
#define a vector
np_test_vector = np.random.uniform(0,1, (3,10)) #1e-14 * np.ones((3,10))
print("shape of np test vector:", np_test_vector.shape)
torch_test_vector = torch.tensor(np_test_vector)
print("shape of torch test vector:", torch_test_vector.shape)
#From (line 184) https://github.com/UKPLab/sentence-transformers/blob/master/sentence_transformers/SentenceTransformer.py
original_norm = torch.nn.functional.normalize(torch_test_vector, p=2, dim=1)
print("shape of torch normalized:", original_norm.shape)
print("sum of squared is:", torch.sum(original_norm[0]**2))
#From (line 292) https://github.com/Pandora-Intelligence/fast-sentence-transformers/blob/main/fast_sentence_transformers/FastSentenceTransformer.py
#Incorrect Shape
new_norm = np.linalg.norm(np_test_vector, ord=2, axis=1, keepdims=False)
print("shape of current numpy code:", new_norm.shape)
print("we need to divide the original by these values:", new_norm)
#Fixed code
fixed_norm = (np_test_vector/np.linalg.norm(np_test_vector, ord=2, axis=1, keepdims=True))
print("shape from corrected code:", fixed_norm.shape)
#To verify
print("sum absolute difference to verify:", torch.sum(torch.abs(original_norm - torch.tensor(fixed_norm))))
output:
shape of np test vector: (3, 10)
shape of torch test vector: torch.Size([3, 10])
shape of torch normalized: torch.Size([3, 10])
sum of squared is: tensor(1., dtype=torch.float64)
shape of current numpy code: (3,)
we need to divide the original by these values: [2.10508935 1.95157797 1.89443896]
shape from corrected code: (3, 10)
sum absolute difference to verify: tensor(3.6776e-16, dtype=torch.float64)
Minimum reproducible example
The following code
has the output
when it should be
Fix
The code in line 292 of FastSentenceTransformer.py is
I think it should be either
or
so that the output is same as the original SentenceTransformer.py.
Supporting code and output
code:
output:
Other notes
if the eps=1e-12 in the original documentation is required the following can be used.
We can confirm that it is working by using
np_test_vector = 1e-14 * np.ones((3,10))
in the supporting code