GoogleCodeExporter closed 9 years ago
The simplest way is just to write an n-squared loop over a vector store in RAM
to give the pairwise similarities.
The problem is usually space: 10000 vectors of (say) 250 real dimensions
stored as 4-byte floats take 10000 * 250 * 4 bytes = 10 MB, whereas the full
matrix of pairwise similarities for that many vectors would take
10000 * 10000 * 4 bytes = 400 MB. Naturally you could try to reduce space
consumption by discarding small values and using a sparse matrix
representation.
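
A minimal sketch of that idea, not a definitive implementation. It assumes
cosine similarity as the measure (the thread only says "similarities"), and the
names cosine, sparse_pairwise_similarities and threshold are hypothetical. Since
cosine similarity is symmetric, the loop visits each unordered pair once rather
than all n-squared ordered pairs, and a plain dictionary stands in for a sparse
matrix.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def sparse_pairwise_similarities(vectors, threshold=0.1):
    """Return {(i, j): similarity} for i < j, dropping values below threshold.

    `threshold` is a hypothetical cutoff; small similarities are discarded
    so that only the significant entries occupy memory.
    """
    sims = {}
    n = len(vectors)
    for i in range(n):
        # Start at i + 1: the matrix is symmetric and the diagonal is trivial.
        for j in range(i + 1, n):
            s = cosine(vectors[i], vectors[j])
            if abs(s) >= threshold:
                sims[(i, j)] = s
    return sims


# Toy in-memory vector store with three 3-dimensional vectors.
vectors = [
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
sims = sparse_pairwise_similarities(vectors, threshold=0.1)
```

The running time is still quadratic in the number of vectors; only the storage
shrinks, and by how much depends on how many similarities fall below the cutoff.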
Original comment by dwidd...@gmail.com
on 6 Mar 2014 at 5:20
I'm going to close this for now, pending a clearer specification of what we
mean by a matrix. (It's clear as a mathematical abstraction but not as an
output format specification: there are many options, and some would not scale
well.)
Original comment by dwidd...@gmail.com
on 19 Nov 2014 at 7:37
Original issue reported on code.google.com by
rohitdee...@gmail.com
on 6 Mar 2014 at 11:12