Anush008 / fastembed-rs

Library for generating vector embeddings, reranking in Rust
https://docs.rs/fastembed
Apache License 2.0
264 stars, 36 forks

fix: Splade sparse vectors should exclude zeros #96

Closed: timonv closed this 2 months ago

timonv commented 2 months ago

I made an oopsie. `is_sign_positive` also returns true for +0.0, which is incorrect here and resulted in max-size (30522) vectors. This fixes the issue.
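For readers following along, here is a minimal sketch of the pitfall described above. It is not the actual fastembed-rs code: the function names `to_sparse_sign_positive` and `to_sparse` are hypothetical, and the dense input is assumed to be a plain `&[f32]` of per-vocabulary scores. It only illustrates that `f32::is_sign_positive` is true for `+0.0`, so a filter based on it keeps every zero entry, while `> 0.0` drops them.

```rust
/// Hypothetical illustration of the bug: `is_sign_positive` is true for +0.0,
/// so zero scores survive the filter and the "sparse" vector stays dense.
fn to_sparse_sign_positive(dense: &[f32]) -> (Vec<usize>, Vec<f32>) {
    dense
        .iter()
        .enumerate()
        .filter(|&(_, v)| v.is_sign_positive())
        .map(|(i, &v)| (i, v))
        .unzip()
}

/// Hypothetical illustration of the fix: keep only strictly positive scores.
fn to_sparse(dense: &[f32]) -> (Vec<usize>, Vec<f32>) {
    dense
        .iter()
        .enumerate()
        .filter(|&(_, v)| *v > 0.0)
        .map(|(i, &v)| (i, v))
        .unzip()
}
```

With an input such as `[0.0, 1.5, 0.0, -0.0, 0.25]`, the first variant keeps four entries (only `-0.0` is dropped), while the second keeps just the two non-zero scores at indices 1 and 4.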

Anush008 commented 2 months ago

I was just trying to debug this. Haha. Thanks again.

timonv commented 2 months ago

np, I was implementing it and had a hard facepalm. I thought the maximum number of values/indices would equal the maximum number of tokens in the batch, but I get varying numbers. Added an assertion that the vector should be small and that all of the values should be > 0.
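A check along those lines could look roughly like the sketch below. This is an assumption about the shape of such an assertion, not the test that was actually added in this PR; the helper name `is_valid_sparse` and the `max_len` bound are hypothetical.

```rust
/// Hypothetical sanity check for a sparse embedding:
/// indices and values line up, the vector is short, and every value is > 0.
fn is_valid_sparse(indices: &[usize], values: &[f32], max_len: usize) -> bool {
    indices.len() == values.len()
        && indices.len() <= max_len
        && values.iter().all(|&v| v > 0.0)
}
```

In a test, this could be used as `assert!(is_valid_sparse(&indices, &values, max_len))`, with `max_len` chosen well below the vocabulary size (30522 in the case discussed here).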

timonv commented 2 months ago

Actually, just doing > 0 is probably better than the weird min value.

timonv commented 2 months ago

@Anush008 Give me a sec, I'll rebase and fix it.

timonv commented 2 months ago

Should be alright now. I can check later tonight if needed.

Anush008 commented 2 months ago

Looks like we're green and valid!

github-actions[bot] commented 2 months ago

:tada: This issue has been resolved in version 3.14.1 :tada:

The release is available on:

Your semantic-release bot :package::rocket: