Closed iDmple closed 9 years ago
The cleaner shouldn't be called at all if the value is null. Can you post a traceback?
I guess the TrimCleaner should be ported to the main Duke codebase as well. I can do that.
OK, in the meantime I have found the problem. I think it would be great to throw an exception or something to avoid that the cleaner is called with an invalid input.
Here is the code that fails:
private void readFields(List<Map<String, Object>> fields, Map<String, Collection
There the fields come directly from the request and could be empty or null.
I replaced it with:
private void readFields(List<Map<String, Object>> fields, Map<String, Collection<String>> props) {
for (Map<String, Object> value : fields) {
String field = (String) value.get("field");
String fieldValue = value.get("value").toString();
for (Cleaner cl : (ArrayList<Cleaner>) entityParams.get(field).get(CLEANERS)) {
if(fieldValue!=null && !fieldValue.equals("")) {
fieldValue = cl.clean(fieldValue);
}
}
props.put(field, Collections.singleton(fieldValue));
}
}
I made lots of other changes in the plugin because it fails from everywhere with NullPointerException and cast problems. Will do a pull request when my project is ready and I find no more errors.
I think that duke API should be used and not call internal functions like this. It would be better to have elasticsearch as a datasource in fact.
As far as I can tell, this is all issues with the plugin, and not with the Duke code, so I am going to close this.
I agree about the data source.
Hi Lars,
I'm using TrimCleaner from Yann's plugin for entity resolution.
public class TrimCleaner implements Cleaner {
public String clean(String value) { value = value.trim(); if (value.equals("")) return null; return value; }
}
Some of my input strings are emtpy "", thus TrimCleaner returns null which provokes a NullPointerException. Why not return "" when an empty string is trimmed?
How can I avoid this problem?
Best regards