dice-group / LIMES

Link Discovery Framework for Metric Spaces.
https://limes.demos.dice-research.org/
GNU Affero General Public License v3.0

How to add new Pre-processing function #162

Closed BilalKoteich closed 6 years ago

BilalKoteich commented 6 years ago

I tried to add a new pre-processing function to the class `Preprocessor.java` that replaces a set of special characters, but when I use it I get the warning "Unknown preprocessing function". I added my function (`normalize`) in the following code:

```java
public class Preprocessor {

    public static final String DATE             = "date";
    public static final String REPLACE          = "replace";
    public static final String NORMALIZE        = "normalize";
    public static final String REG_EX_REPLACE   = "regexreplace";
    public static final String FAHRENHEIT       = "fahrenheit";
    public static final String URI_AS_STRING    = "uriasstring";
    public static final String REMOVE_BRACES    = "removebraces";
    public static final String CELSIUS          = "celsius";
    public static final String REGULAR_ALPHABET = "regularAlphabet";
    public static final String UPPER_CASE       = "uppercase";
    public static final String LOWER_CASE       = "lowercase";
    public static final String CLEAN_IRI        = "cleaniri";
    public static final String AT               = "@";
    public static final String NO_LANG          = "nolang";
    public static final String NUMBER           = "number";
    static Logger logger = LoggerFactory.getLogger(Preprocessor.class.getName());

    public static String process(String entry, String functionChain) {
        String result = entry.split("\\^")[0];
        logger.debug("Function chain = " + functionChain);
        if (functionChain != null) {
            if (!functionChain.equals("")) {
                {
                    String split[] = functionChain.split("->");
                    for (int i = 0; i < split.length; i++) {
                        result = atomicProcess(result, split[i]);
                        logger.debug(result);
                    }
                }
            }
        }
        logger.debug("<"+entry+">" + " -> <" + result+">");
        return result;
    }

    public static String atomicProcess(String entry, String function) {
        logger.debug(entry +" -> "+ function);
        if (function.length() < 2) {
            return entry;
        }
        //remove unneeded xsd information
        //function = function.toLowerCase();
        if (function.startsWith(LOWER_CASE)) {
            logger.debug(entry +" -> Lowercase");
            return entry.toLowerCase();
        }
        if (function.startsWith(UPPER_CASE)) {
            return entry.toUpperCase();
        }
        if (function.startsWith(NORMALIZE)) {
            entry.replaceAll("X1", "Y1");
            entry.replaceAll("X2", "Y2");
            entry.replaceAll("X3", "Y3");
            return entry;
        }
        if (function.startsWith(REPLACE)) {
            //function = function.replaceAll(Pattern.quote("_"), " ");
            logger.debug(">>>"+function);
            //entry.replaceAll("a", "d");
            String replaced = function.substring(8, function.indexOf(","));
            String replacee = function.substring(function.indexOf(",") + 1, function.indexOf(")"));
            logger.debug("<"+replaced + ">, <" + replacee + ">");
            return entry.replaceAll(Pattern.quote(replaced), replacee);
        }

...

        else {
            logger.warn("Unknown preprocessing function " + function);
            return entry;
        }
```

Do we have to add something in other classes?
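A side note on the `normalize` branch, independent of the warning: Java strings are immutable, so `replaceAll` returns a new string rather than modifying `entry`, and the result has to be assigned. The sketch below shows one way this could be written. The class name `NormalizeSketch` and the helper `normalize` are hypothetical and not part of LIMES; the `X1`/`Y1` pairs are the placeholders from the snippet above. It also uses `String.replace` instead of `replaceAll`, on the assumption that the special characters should be matched literally rather than as regular expressions.

```java
// Hypothetical stand-alone helper for illustration; not part of LIMES.
final class NormalizeSketch {

    // Placeholder pairs taken from the question ("X1" -> "Y1", ...).
    private static final String[][] REPLACEMENTS = {
        {"X1", "Y1"}, {"X2", "Y2"}, {"X3", "Y3"}
    };

    static String normalize(String entry) {
        String result = entry;
        for (String[] pair : REPLACEMENTS) {
            // String is immutable: replace() returns a new String,
            // so the result must be reassigned to take effect.
            result = result.replace(pair[0], pair[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(normalize("aX1-bX2-cX3")); // prints "aY1-bY2-cY3"
    }
}
```

Because `replace` treats its arguments literally, characters such as `.` or `(` in the search string need no escaping.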

Kleanthi commented 6 years ago

Hey there, have you recompiled your code after adding the new pre-processing function? Also, note that the change you applied will only exist in your local copy of LIMES.

BilalKoteich commented 6 years ago

I imported the project into Eclipse, modified the code, and then compiled the project. To add a new pre-processing function, do I have to change something in other classes?

dobraczka commented 6 years ago

The pre-processing package was recently refactored to make it more modular. I would recommend using the new architecture, which you can find in the dev branch.

To add a new pre-processing function you should do the following:

Just take a look at the other pre-processing classes and you will see that this is all pretty straightforward.
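As a rough illustration of what a modular pre-processing design can look like in general, here is a minimal, self-contained sketch in which each function is registered under a name and looked up when a chain is applied. Every name in it (`PreprocessingRegistrySketch`, `register`, `process`) is hypothetical and is not the API of the LIMES dev branch; the real interfaces and registration steps should be taken from the existing pre-processing classes there. Only the `->` chain separator and the function names are borrowed from the snippet in the question.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical registry for illustration; these are NOT LIMES's real classes.
final class PreprocessingRegistrySketch {

    private static final Map<String, UnaryOperator<String>> FUNCTIONS = new HashMap<>();

    static {
        register("lowercase", s -> s.toLowerCase(Locale.ROOT));
        register("uppercase", s -> s.toUpperCase(Locale.ROOT));
        // Adding a new function is a single registration, with no if-chain to edit.
        register("normalize", s -> s.replace("X1", "Y1"));
    }

    static void register(String name, UnaryOperator<String> function) {
        FUNCTIONS.put(name, function);
    }

    // Applies a chain such as "normalize->uppercase" from left to right.
    static String process(String entry, String functionChain) {
        String result = entry;
        if (functionChain == null || functionChain.isEmpty()) {
            return result;
        }
        for (String name : functionChain.split("->")) {
            UnaryOperator<String> function = FUNCTIONS.get(name.trim());
            if (function == null) {
                throw new IllegalArgumentException("Unknown preprocessing function " + name);
            }
            result = function.apply(result);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(process("abX1", "normalize->uppercase")); // prints "ABY1"
    }
}
```

In a design like this, an unregistered name fails at lookup time, which makes a missing registration easier to spot than a silent fall-through to a warning.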

BilalKoteich commented 6 years ago

I am using the dev branch now. Thank you! :D