Closed brainliu81 closed 5 years ago
@brainliu81 Do you have a test case that I can use to debug/fix this? Thanks. I will take a look into this. Apologies for the late response.
This function is used in the `jaro` and `jaroWinkler` implementations.
Are those implementations not providing you a correct score for a pair of known values?
For example, given the two strings "MARTHA" and "MARHTA", the Jaro score should be 0.944 and the Jaro-Winkler score ought to be 0.961. I double-checked most of my test cases using this site: https://asecuritysite.com/forensics/simstring
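For reference, here is how those two numbers fall out of the standard definitions. This is a worked sketch, assuming the usual Jaro formula with m matched characters and t transpositions, and the common Jaro-Winkler defaults (prefix scale p = 0.1, prefix length capped at 4). For "MARTHA" / "MARHTA", all six characters match (m = 6), the T/H swap counts as one transposition (t = 1), and the shared prefix "MAR" has length 3:

```latex
\begin{aligned}
\mathrm{jaro} &= \frac{1}{3}\left(\frac{m}{|s_1|} + \frac{m}{|s_2|} + \frac{m - t}{m}\right)
             = \frac{1}{3}\left(\frac{6}{6} + \frac{6}{6} + \frac{5}{6}\right)
             = \frac{17}{18} \approx 0.944 \\
\mathrm{jaroWinkler} &= \mathrm{jaro} + \ell \, p \, (1 - \mathrm{jaro})
             = \frac{17}{18} + 3 \cdot 0.1 \cdot \frac{1}{18}
             \approx 0.961
\end{aligned}
```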
```scala
def getCommonChars(s1: String, s2: String, halfLen: Int): String = {
  val commonChars = new StringBuilder()
  // Mutable copy of s2 so that matched characters can be blanked out.
  val strCopy = new StringBuilder(s2)
  val m = s2.length
  s1.zipWithIndex.foreach { case (ch, chIndex) =>
    var foundIt = false
    // Only look within halfLen positions of chIndex.
    var j = math.max(0, chIndex - halfLen)
    while (!foundIt && j <= math.min(chIndex + halfLen, m - 1)) {
      if (strCopy(j) == ch) {
        foundIt = true
        commonChars.append(ch)
        // Mark this slot as used ('\0' is rejected by newer Scala compilers).
        strCopy.setCharAt(j, '\u0000')
      }
      j += 1
    }
  }
  commonChars.toString
}
```
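To make the "MARTHA" / "MARHTA" check reproducible, here is a minimal, self-contained sketch of how `getCommonChars` could feed a Jaro and Jaro-Winkler score. The `JaroSketch` object, its `jaro` and `jaroWinkler` helpers, and the `halfLen = max(|s1|, |s2|) / 2 - 1` window are assumptions for illustration and may not match the library's actual implementations.

```scala
object JaroSketch {
  // Same windowed matching as getCommonChars above, repeated so this compiles on its own.
  def getCommonChars(s1: String, s2: String, halfLen: Int): String = {
    val commonChars = new StringBuilder()
    val strCopy = new StringBuilder(s2)
    val m = s2.length
    s1.zipWithIndex.foreach { case (ch, chIndex) =>
      var foundIt = false
      var j = math.max(0, chIndex - halfLen)
      while (!foundIt && j <= math.min(chIndex + halfLen, m - 1)) {
        if (strCopy(j) == ch) {
          foundIt = true
          commonChars.append(ch)
          strCopy.setCharAt(j, '\u0000') // mark the slot as consumed
        }
        j += 1
      }
    }
    commonChars.toString
  }

  // Hypothetical helper: Jaro similarity built on getCommonChars.
  def jaro(s1: String, s2: String): Double = {
    // Assumed match window: half the longer length, minus one, never negative.
    val halfLen = math.max(0, math.max(s1.length, s2.length) / 2 - 1)
    val c1 = getCommonChars(s1, s2, halfLen)
    val c2 = getCommonChars(s2, s1, halfLen)
    if (c1.isEmpty || c1.length != c2.length) 0.0
    else {
      // Each out-of-order pair is counted twice, hence the division by 2.
      val transpositions = c1.zip(c2).count { case (a, b) => a != b } / 2
      val m = c1.length.toDouble
      (m / s1.length + m / s2.length + (m - transpositions) / m) / 3.0
    }
  }

  // Hypothetical helper: Jaro-Winkler with prefix scale p and a prefix cap of 4.
  def jaroWinkler(s1: String, s2: String, p: Double = 0.1): Double = {
    val j = jaro(s1, s2)
    val prefix = s1.zip(s2).takeWhile { case (a, b) => a == b }.length.min(4)
    j + prefix * p * (1.0 - j)
  }
}
```

With this sketch, `JaroSketch.jaro("MARTHA", "MARHTA")` comes out at roughly 0.944 and `JaroSketch.jaroWinkler("MARTHA", "MARHTA")` at roughly 0.961. If the library's numbers differ, a pair of inputs like this would make a good test case.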