Closed ashvardanian closed 3 days ago
@ashvardanian , I will take a shot at trying to fix this as well :)
Cool, looking forward to your solution, @GoWind 🤗
Hi @ashvardanian , apologies for taking a bit of your time
I read through the implementation of jensen shannon in simsimd and compared it to the implementation in scipy.spatial.distance
.
It looks like the bug is we are missing a *result = sqrt(*result)
in the SimSIMD implementation, is that right ?
The code in probability.h
is the following:
#define SIMSIMD_MAKE_JS(name, input_type, accumulator_type, load_and_convert, epsilon) \
SIMSIMD_PUBLIC void simsimd_js_##input_type##_##name(simsimd_##input_type##_t const *a, \
simsimd_##input_type##_t const *b, simsimd_size_t n, \
simsimd_distance_t *result) { \
simsimd_##accumulator_type##_t d = 0; \
for (simsimd_size_t i = 0; i != n; ++i) { \
simsimd_##accumulator_type##_t ai = load_and_convert(a + i); \
simsimd_##accumulator_type##_t bi = load_and_convert(b + i); \
simsimd_##accumulator_type##_t mi = (ai + bi) / 2; \
d += ai * SIMSIMD_LOG((ai + epsilon) / (mi + epsilon)); \
d += bi * SIMSIMD_LOG((bi + epsilon) / (mi + epsilon)); \
} \
*result = (simsimd_distance_t)d / 2; \
}
Checking the scipy implementation, the return value is return np.sqrt(js / 2.0)
For a platform independent square root, there seems to be a simsimd_approximate_square_root
fn:
types.h
491:SIMSIMD_INTERNAL simsimd_f32_t simsimd_approximate_square_root(simsimd_f32_t number) {
but it only takes and returns an f32_t
. Do you think the PR should introduce other variants of this fn, or is there a macro or fn that already exists that can take other num types as input ?
If you could also add a separate PR with a more accurate f64 square root approximation, happy to accept 🤗
I will do that as well !
Describe the bug
There is a discrepancy between SimSIMD and SciPy implementations of Jensen-Shannon divergence, where our kernel doesn't take the square root of the result. So our output is analogous to:
Now that we have platform-specific square roots in SimSIMD, one can make SimSIMD compatible with SciPy.
Steps to reproduce
You can run the test replacing the
baseline_jensenshannon
definition at the top.Expected behavior
Identical to SciPy.
SimSIMD version
v5.5.0
Operating System
Ubuntu 22.04
Hardware architecture
x86
Which interface are you using?
C implementation
Contact Details
No response
Are you open to being tagged as a contributor?
.git
history as a contributorIs there an existing issue for this?
Code of Conduct