tonytonyjan / jaro_winkler

Ruby & C implementation of Jaro-Winkler distance algorithm which supports UTF-8 string.
MIT License

Allow compiling the gem with MSVC (which is not C99 compliant) #43

Open jmarrec opened 4 years ago

jmarrec commented 4 years ago

I realize this is probably a corner case (using MSVC to compile a Ruby gem), but that's the situation I am in: I have to use MSVC to compile this gem because we have a C++ application that has to be built with MSVC, and one thing it does is create a CLI based on Ruby that embeds some gems.

The fun part is that MSVC (including 2019 16.5.1, which I use and which is fully up to date) isn't C99 compliant; it's C89 compliant. So, to be able to build the C extension in this repo, I had to make the following changes:

Edit: Well actually, what a coincidence, I see this PR actually fixes an open issue. Fix #41

jmarrec commented 4 years ago

Benchmarked on Linux to see the impact of these changes:

develop: 0.097458
First pass: 0.144719
Second pass: 0.099472

jaro_winkler (9e1d7a0) 0.097458 0.000000 0.097458 ( 0.097460)
jaro_winkler (a742b86) 0.147391 0.000000 0.147391 ( 0.147397)
jaro_winkler (22a60aa) 0.099472 0.000000 0.099472 ( 0.099475)

Had to remove 22a60aa (which was a commit where I tried to remove the memset call) as tests were failing.
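The first-pass code is not shown in this thread. As a rough sketch, assuming the change swapped stack arrays for per-call heap buffers, the pattern below illustrates why the memset call cannot simply be dropped: malloc returns uninitialized memory, so the flag bytes have to be zeroed before they are read. The function `count_matches_heap` and its parameters are hypothetical names chosen for illustration, not the gem's actual code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not taken from the gem: greedily pairs each entry of
 * short_codes with a not-yet-used equal entry of long_codes and returns the
 * number of pairs, or -1 if the allocation fails. */
int count_matches_heap(const unsigned *short_codes, size_t short_len,
                       const unsigned *long_codes, size_t long_len)
{
    /* C89 style: all declarations come first, and no variable-length array. */
    char *long_codes_flag;
    size_t i, j;
    int matches = 0;

    if (short_len == 0 || long_len == 0)
        return 0;

    long_codes_flag = (char *)malloc(long_len);
    if (long_codes_flag == NULL)
        return -1;

    /* malloc does not clear memory; without this the flags start as garbage. */
    memset(long_codes_flag, 0, long_len);

    for (i = 0; i < short_len; i++) {
        for (j = 0; j < long_len; j++) {
            if (!long_codes_flag[j] && short_codes[i] == long_codes[j]) {
                long_codes_flag[j] = 1; /* mark this entry as consumed */
                matches++;
                break;
            }
        }
    }

    free(long_codes_flag);
    return matches;
}
```

A malloc/free pair on every call is also a plausible reason for the slower first-pass timing, though nothing in the thread confirms that.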

jmarrec commented 4 years ago

Hmm, I guess an alternative is to declare the arrays with a fixed size that is large enough.

char short_codes_flag[1000];
char long_codes_flag[1000];
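A minimal sketch of how the fixed-size alternative could look, assuming the inputs are arrays of code points and that anything longer than the buffers is rejected rather than truncated. The array names and the size 1000 come from the comment above; `MAX_CODES`, `count_matches_fixed`, and the parameters are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define MAX_CODES 1000 /* size suggested in the comment above */

/* Hypothetical helper, not taken from the gem: greedily pairs equal entries
 * of the two inputs and returns the number of pairs, or -1 if either input
 * would not fit in the fixed-size flag buffers. */
int count_matches_fixed(const unsigned *short_codes, size_t short_len,
                        const unsigned *long_codes, size_t long_len)
{
    char short_codes_flag[MAX_CODES];
    char long_codes_flag[MAX_CODES];
    size_t i, j;
    int matches = 0;

    /* A fixed buffer needs an explicit bound check before any write. */
    if (short_len > MAX_CODES || long_len > MAX_CODES)
        return -1;

    /* Stack arrays are uninitialized too, so clear the part that is used. */
    memset(short_codes_flag, 0, short_len);
    memset(long_codes_flag, 0, long_len);

    for (i = 0; i < short_len; i++) {
        for (j = 0; j < long_len; j++) {
            if (!long_codes_flag[j] && short_codes[i] == long_codes[j]) {
                short_codes_flag[i] = 1;
                long_codes_flag[j] = 1;
                break;
            }
        }
    }

    /* Count how many entries of the shorter input found a partner. */
    for (i = 0; i < short_len; i++)
        matches += short_codes_flag[i];

    return matches;
}
```

The trade-off is that the size limit becomes part of the function's behavior: inputs longer than the buffers need either an error path, as here, or a fallback allocation.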

jmarrec commented 4 years ago

alloca seems to work, but I haven't tested it on MSVC.

jaro_winkler (12dc642) 0.108995 0.000000 0.108995 ( 0.108998)

jmarrec commented 4 years ago

Ok, used alloca instead, back to normal in terms of performance:

before: jaro_winkler (9e1d7a0) 0.097458 0.000000 0.097458 ( 0.097460)
after:  jaro_winkler (f1ca425) 0.097654 0.000000 0.097654 ( 0.097674)
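For reference, a hedged sketch of what an alloca-based buffer might look like when it also has to compile with MSVC. It assumes MSVC's `_alloca` from `<malloc.h>` and `<alloca.h>` elsewhere; the `STACK_ALLOC` macro and `count_matches_alloca` are hypothetical names, and the commit's actual code is not shown in this thread.

```c
#include <stddef.h>
#include <string.h>

#ifdef _MSC_VER
#include <malloc.h>             /* MSVC declares _alloca here */
#define STACK_ALLOC(n) _alloca(n)
#else
#include <alloca.h>             /* glibc, macOS; some BSDs use <stdlib.h> */
#define STACK_ALLOC(n) alloca(n)
#endif

/* Hypothetical helper, not taken from the gem: same greedy pairing as a
 * heap-based version would do, but the flag buffer lives on the stack and
 * is released automatically when the function returns. */
int count_matches_alloca(const unsigned *short_codes, size_t short_len,
                         const unsigned *long_codes, size_t long_len)
{
    char *long_codes_flag;
    size_t i, j;
    int matches = 0;

    if (short_len == 0 || long_len == 0)
        return 0;

    /* No NULL check and no free: alloca has no failure return value, and
     * an oversized request overflows the stack instead. */
    long_codes_flag = (char *)STACK_ALLOC(long_len);
    memset(long_codes_flag, 0, long_len); /* alloca memory is uninitialized */

    for (i = 0; i < short_len; i++) {
        for (j = 0; j < long_len; j++) {
            if (!long_codes_flag[j] && short_codes[i] == long_codes[j]) {
                long_codes_flag[j] = 1; /* mark this entry as consumed */
                matches++;
                break;
            }
        }
    }
    return matches;
}
```

Because the buffer size follows the input length, very long strings could exhaust the stack, so a length cap or a heap fallback above some threshold may be worth considering.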