fozziethebeat / S-Space

The S-Space repository, from the AIrhead-Research group
GNU General Public License v2.0
203 stars 106 forks

Euclidean distance is calculated incorrectly for sparse vectors #40

Closed lubomirkrcmar closed 11 years ago

lubomirkrcmar commented 11 years ago

Hi, I believe there is a small bug in edu.ucla.sspace.Similarity, in the method "public static double euclideanDistance(DoubleVector a, DoubleVector b)": the sqrt of the sum before the return is missing in one branch of the method:

    if (a instanceof SparseVector && b instanceof SparseVector) {
        SparseVector svA = (SparseVector)a;
        SparseVector svB = (SparseVector)b;
        int[] aNonZero = svA.getNonZeroIndices();
        int[] bNonZero = svB.getNonZeroIndices();
        HashSet sparseIndicesA = new HashSet(aNonZero.length);
        double sum = 0;
        for (int nonZero : aNonZero) {
            sum += Math.pow((a.get(nonZero) - b.get(nonZero)), 2);
            sparseIndicesA.add(nonZero);
        }
        for (int nonZero : bNonZero)
            if (!sparseIndicesA.contains(nonZero))
                sum += Math.pow(b.get(nonZero), 2);
        return sum;
    }
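To make the expected behaviour concrete, here is a minimal, self-contained sketch of what the sparse branch should compute once the square root is applied. It is not the S-Space code: the class name SparseEuclideanSketch is hypothetical, and a plain Map from index to value stands in for SparseVector so that the example runs without the library.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; a Map<Integer, Double> of non-zero
// entries stands in for S-Space's SparseVector.
public class SparseEuclideanSketch {

    /** Euclidean distance between two sparse vectors given as index -> value maps. */
    public static double euclideanDistance(Map<Integer, Double> a,
                                           Map<Integer, Double> b) {
        double sum = 0;
        // Indices where a is non-zero; b contributes 0 when it has no entry.
        for (Map.Entry<Integer, Double> e : a.entrySet()) {
            double diff = e.getValue() - b.getOrDefault(e.getKey(), 0.0);
            sum += diff * diff;
        }
        // Indices where only b is non-zero.
        for (Map.Entry<Integer, Double> e : b.entrySet()) {
            if (!a.containsKey(e.getKey())) {
                sum += e.getValue() * e.getValue();
            }
        }
        // The step missing from the reported branch: take the square root.
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        Map<Integer, Double> a = new HashMap<>();
        a.put(0, 3.0);
        Map<Integer, Double> b = new HashMap<>();
        b.put(1, 4.0);
        System.out.println(euclideanDistance(a, b)); // prints 5.0
    }
}
```

For a = (3, 0) and b = (0, 4) the sum of squared differences is 25, so the distance should be 5; returning the sum without the sqrt, as in the branch above, would give 25 instead.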

davidjurgens commented 11 years ago

Thanks for the bug report! You are correct that the branch was missing the sqrt. I've updated the master branch, so if you do a "git pull", it will pick up the correction.

I've also integrated a lot of other minor changes we were slow in getting out. :)

Cheers, David

In reply to lubomirkrcmar's report of Fri, May 17, 2013, 12:03 PM.