oborchers / Fast_Sentence_Embeddings

Compute Sentence Embeddings Fast!
GNU General Public License v3.0
619 stars 83 forks source link

Convert iterating approach for sample_size to closed form expression #38

Closed grantmwilliams closed 3 years ago

grantmwilliams commented 3 years ago

I thought it would be easier to understand the code to get the sample_size if the code just used an algebraic expression instead of the iteration based approach. Plus the closed-form expression should be more efficient.

I also updated vstack([vectors[i] for i in sample_indices]) to just use a logical index from the sample indices instead: vectors[sample_indices, :]. I think it's a little cleaner, but it really comes down to personal choice. 🤷‍♂️

grantmwilliams commented 3 years ago

(If this PR gets merged it might be a good idea to squash & merge since I made a few dumb commits as a part of this).

lgtm-com[bot] commented 3 years ago

This pull request fixes 1 alert when merging 38c716bfbe93edb48679ff02f2c58da93727ad74 into b6563eb309ea98b67f03184745e197a8fe430ee0 - view on LGTM.com

fixed alerts:

oborchers commented 3 years ago

Hi @grantmwilliams ! Thanks for contributing! I've checked the code and all is fine! This library hasn't seen work in a way too long time and needs a rather intensive overhaul