Closed amathur2k closed 4 years ago
Hi Abhi, in these examples, since 1 out of the 2 words in each name matches, you should get a 50% match.
So, for example, if these names are part of a list of names:
List<String> sourceString = Arrays.asList("A Mathur", "ABhishek Mathur", "Donald Trump", "D Trump");
We just need to feed the library a Document with an Element of type NAME for each name.
AtomicInteger idCount = new AtomicInteger();
List<Document> sourceDoc = sourceString.stream().map(name -> {
    return new Document.Builder(idCount.incrementAndGet() + "")
            .addElement(new Element.Builder().setType(NAME).setValue(name).createElement())
            .setThreshold(0.4)
            .createDocument();
}).collect(Collectors.toList());
Map<String, List<Match<Document>>> result = matchService.applyMatchByDocIdOld(sourceDoc);
Note that each document needs a key; you can supply your own unique key for these.
Also, we need to reduce the Document threshold a little, since by default a document only counts as a match when its score is greater than 0.5, and these pairs score exactly 0.5.
You should be able to see the match results by printing them to the console:
result.entrySet().forEach(entry -> {
    entry.getValue().forEach(match -> {
        System.out.println("Data: " + match.getData() + " Matched With: " + match.getMatchedWith() + " Score: " + match.getScore().getResult());
    });
});
Result
Data: {[{'A Mathur'}]} Matched With: {[{'ABhishek Mathur'}]} Score: 0.5
Data: {[{'ABhishek Mathur'}]} Matched With: {[{'A Mathur'}]} Score: 0.5
Data: {[{'Donald Trump'}]} Matched With: {[{'D Trump'}]} Score: 0.5
Data: {[{'D Trump'}]} Matched With: {[{'Donald Trump'}]} Score: 0.5
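The 0.5 scores above are consistent with a simple token-overlap view of the names, which also shows why a threshold of 0.4 lets these pairs through while a "greater than 0.5" rule does not. The following is only a rough sketch under that assumption: `TokenOverlap`, `score`, `tokens`, and `isMatch` are hypothetical names, and this is not the library's actual scoring code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not fuzzy-matcher's implementation.
final class TokenOverlap {

    // Split a name into lower-cased, whitespace-separated tokens.
    static Set<String> tokens(String name) {
        return new HashSet<>(Arrays.asList(name.trim().toLowerCase().split("\\s+")));
    }

    // Shared tokens divided by the size of the larger token set.
    static double score(String a, String b) {
        Set<String> left = tokens(a);
        Set<String> right = tokens(b);
        Set<String> shared = new HashSet<>(left);
        shared.retainAll(right);
        return (double) shared.size() / Math.max(left.size(), right.size());
    }

    // Assumed rule: a pair is a match only when its score is strictly above the threshold.
    static boolean isMatch(double score, double threshold) {
        return score > threshold;
    }

    public static void main(String[] args) {
        double s1 = score("A Mathur", "ABhishek Mathur");
        double s2 = score("Donald Trump", "D Trump");
        System.out.println(s1 + " " + s2);
        System.out.println("threshold 0.5: " + isMatch(s1, 0.5));
        System.out.println("threshold 0.4: " + isMatch(s1, 0.4));
    }
}
```

In this sketch both pairs share one of two tokens, so each scores 0.5; that is rejected at a threshold of 0.5 and accepted at 0.4.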
Thanks, Manish, for the detailed response. However, I fear that reducing the threshold to under 0. will start matching Michael to Mitchell, and D Trump to J Trump. I am looking at the Rosette APIs to see how they handle this, though their code is not open source.
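One way to picture the concern, and a possible alternative to lowering the threshold, is a comparison that treats a single-letter token as an initial. The sketch below is purely illustrative: `InitialAwareMatch`, `tokensAgree`, and `score` are hypothetical names, the position-by-position comparison is an assumption, and nothing here reflects how fuzzy-matcher or Rosette work internally.

```java
// Hypothetical illustration only; not part of fuzzy-matcher or Rosette.
final class InitialAwareMatch {

    // Tokens agree if they are equal ignoring case, or if exactly one of them
    // is a single letter that the other, longer token starts with.
    static boolean tokensAgree(String x, String y) {
        String a = x.toLowerCase();
        String b = y.toLowerCase();
        if (a.equals(b)) return true;
        if (a.length() == 1 && b.length() > 1) return b.startsWith(a);
        if (b.length() == 1 && a.length() > 1) return a.startsWith(b);
        return false;
    }

    // Compare tokens position by position; divide by the longer token count.
    static double score(String left, String right) {
        String[] l = left.trim().split("\\s+");
        String[] r = right.trim().split("\\s+");
        int compared = Math.min(l.length, r.length);
        int agreeing = 0;
        for (int i = 0; i < compared; i++) {
            if (tokensAgree(l[i], r[i])) agreeing++;
        }
        return (double) agreeing / Math.max(l.length, r.length);
    }

    public static void main(String[] args) {
        System.out.println(score("D Trump", "Donald Trump"));
        System.out.println(score("D Trump", "J Trump"));
        System.out.println(score("Michael", "Mitchell"));
    }
}
```

Under these assumptions "D Trump" and "Donald Trump" agree on both tokens, while "D Trump" and "J Trump" agree only on the surname and "Michael" and "Mitchell" do not agree at all, so a threshold above 0.5 could stay in place. The trade-off is that an initial is ambiguous: "D Trump" would agree just as well with any other first name starting with D.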
Hi, I am trying to match names such as A Mathur with ABhishek Mathur, or Donald Trump with D Trump. Is there some simple parameter I can adjust to allow that?
Thanks, Abhi