percyliang / sempre

Semantic Parser with Execution

How to use SEMPRE in a Java project? #186

Open dsx4602 opened 5 years ago

dsx4602 commented 5 years ago

Hi, SEMPRE can be tested in an interactive prompt or on a web interface, but I want to receive a natural-language question and get the SPARQL answer in a Java project. Is there any tutorial on how to use SEMPRE from a Java program?

ppasupat commented 5 years ago

Hi! While we don't have an explicit tutorial for embedding SEMPRE in another project, the Builder object should be a self-contained unit for the parser.

Please refer to the handleUtterance method of the interactive session code. It builds an Example object based on the utterance, parses it, and reads out the answers from the Response object.

Note that Example.Builder (for creating Example objects) is not the same as Builder (the builder field supplied in the initializer).

ppasupat commented 5 years ago

Another note: the options are currently global, so you need to set the correct values (e.g., using Builder.opts.inParamsPath = ...) before creating the object (Builder b = new Builder(); b.build();).
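
To illustrate why the order matters, here is a minimal, self-contained sketch. MiniBuilder, its Options class, and the file name params.txt are hypothetical stand-ins written for this example, not SEMPRE's real classes; the sketch only assumes that the object reads its global options once, when build() is called.

```java
// Hypothetical stand-in for a class configured through global (static) options.
// None of these names are SEMPRE's API; they only illustrate the ordering constraint.
final class GlobalOptionsSketch {
    static final class MiniBuilder {
        static final class Options {
            String inParamsPath; // null means "no parameter file configured"
        }

        // One shared, mutable options object for the whole JVM.
        static Options opts = new Options();

        String loadedParamsPath; // what this instance actually picked up

        void build() {
            // The global options are read once, at build time.
            loadedParamsPath = opts.inParamsPath;
        }
    }

    // Correct order: set the global option, then create and build the object.
    static String buildWith(String path) {
        MiniBuilder.opts.inParamsPath = path;
        MiniBuilder b = new MiniBuilder();
        b.build();
        return b.loadedParamsPath;
    }

    public static void main(String[] args) {
        MiniBuilder.opts = new MiniBuilder.Options(); // start from defaults

        // Wrong order: the option is set after build(), so this instance never sees it.
        MiniBuilder late = new MiniBuilder();
        late.build();
        MiniBuilder.opts.inParamsPath = "params.txt";
        System.out.println(late.loadedParamsPath); // prints: null

        System.out.println(buildWith("params.txt")); // prints: params.txt
    }
}
```

Because the options object is shared, two parsers with different settings in the same JVM would presumably need their options set immediately before each one is built.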

dsx4602 commented 5 years ago

Thank you for your advice. Now I can answer "What is three plus four times two?" using Java code. But for KBQA, I'm confused about how to set the mode (e.g. simple-freebase-nocache), sparqlserver (i.e. the SPARQL endpoint URL), SimpleLexicon.inPaths, languageAnalyzer, etc. in Java code. Could you help me?

ppasupat commented 5 years ago

Usually these modes are equivalent to (1) invoking a certain main Java class, and (2) adjusting the options.

If you add -n to the command line, the command will print out the full Java command, which will include the main class, the options and their values. Alternatively, after running a command, look at state/execs/___.exec/options.map, which will list all the options and their values.
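
If you want to inspect those values programmatically, a small reader like the one below may help. This is only a sketch: it assumes each line of options.map holds an option name, a tab, and then the value, which you should check against your own file. OptionsMapReader is a hypothetical helper, not part of SEMPRE, and the sample values are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of SEMPRE) that loads an options.map file into a Map.
final class OptionsMapReader {
    // Assumed format: one option per line, "name<TAB>value".
    // A line without a tab is stored with an empty value; blank lines are skipped.
    static Map<String, String> parse(String text) {
        Map<String, String> options = new LinkedHashMap<>(); // keeps file order
        for (String line : text.split("\\R")) {
            if (line.trim().isEmpty()) continue;
            int tab = line.indexOf('\t');
            String name = (tab < 0) ? line.trim() : line.substring(0, tab).trim();
            String value = (tab < 0) ? "" : line.substring(tab + 1).trim();
            options.put(name, value);
        }
        return options;
    }

    public static void main(String[] args) throws IOException {
        // Pass the path to an options.map file, or run without arguments for a built-in sample.
        String text = (args.length > 0)
            ? new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8)
            : "Grammar.inPaths\tarithmetic-tutorial.grammar\nLearner.maxTrainIters\t3\n";
        parse(text).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Each entry then tells you which static options field to set in your own code before building the parser.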

dsx4602 commented 5 years ago

Thank you very much! Now I can use SEMPRE in my own Java project.

retypepassword commented 5 years ago

I thought I'd share the wrapper I wrote for setting up the parser to use CoreNLPAnalyzer, read some examples, run the learning algorithm, and parse some text. Hope it's helpful to someone else reading this issue.

Parser.java

import edu.stanford.nlp.sempre.*;
import edu.stanford.nlp.sempre.corenlp.CoreNLPAnalyzer;
import fig.basic.Pair;

import java.util.*;
import java.util.stream.Collectors;

public class Parser {
    private Builder builder;
    private Dataset dataset;
    private Grammar grammar;
    private LanguageAnalyzer analyzer;

    Parser(LanguageAnalyzer analyzer) {
        this.builder = new Builder();
        this.dataset = new Dataset();
        this.grammar = new Grammar();
        this.analyzer = analyzer;

        // Equivalent command line option: -languageAnalyzer corenlp.CoreNLPAnalyzer
        // if `this.analyzer` is `new CoreNLPAnalyzer()`
        LanguageAnalyzer.setSingleton(this.analyzer);
    }

    public Parser() {
        this(new CoreNLPAnalyzer());
    }

    // Equivalent command line option: -Grammar.inPaths [grammarPath]
    public void setGrammarPath(String grammarPath) {
        grammar.read(grammarPath);
        builder.grammar = grammar;
    }

    // Equivalent command line option: -Dataset.inPaths train:[examplePath]
    public void setExamplePath(String examplePath) {
        dataset.readFromPathPairs(Collections.singletonList(new Pair<>("train", examplePath)));
    }

    public void initialize() {
        builder.buildUnspecified();
    }

    public void learn() {
        // Equivalent command line option: -FeatureExtractor.featureDomains rule
        FeatureExtractor.Options o = new FeatureExtractor.Options();
        o.featureDomains = Collections.singleton("rule");
        FeatureExtractor.opts = o;
        FeatureExtractor f = new FeatureExtractor(builder.executor);

        // Equivalent command line option: -Learner.maxTrainIters 3
        Learner.opts.maxTrainIters = 3;
        Learner learner = new Learner(builder.parser, builder.params, dataset);
        learner.learn();
    }

    // Parse with SEMPRE. Copied from handleUtterance().
    public Response parse(String query) {
        Example.Builder b = new Example.Builder();
        b.setId("session:1");
        b.setUtterance(query);
        Example ex = b.createExample();
        Response response = new Response(builder);

        ex.preprocess();

        // Parse!
        builder.parser.parse(builder.params, ex, false);
        response.ex = ex;
        response.candidateIndex = 0;

        return response;
    }
}

Usage:

        // We can use SimpleAnalyzer instead of CoreNLPAnalyzer (default when you run
        // the `run` script in SEMPRE is SimpleAnalyzer; default for the sample class above
        // is CoreNLPAnalyzer)
        Parser parser = new Parser(new SimpleAnalyzer());

        // Load grammar
        parser.setGrammarPath("arithmetic-tutorial.grammar");

        // Load training examples
        parser.setExamplePath("arithmetic-tutorial.examples");

        // Must call initialize before learning or parsing
        parser.initialize();

        // Learn from training examples
        parser.learn();

        // Unambiguous query (two plus four means 2 + 4, which is 6, and we expect only 1 prediction)
        Response resp = parser.parse("two plus four");
        assertEquals("(number 6)", resp.getAnswer());

I copied the Response class from edu.stanford.nlp.sempre.Master into its own file in my own package so that the parse(String query) method could return it. The parse method could just as easily have returned an Example, in which case the copy wouldn't have been necessary.

stbusch commented 4 years ago

Thanks, but where does this constructor come from?

Response response = new Response(builder);

Sorry if I'm missing something.

retypepassword commented 4 years ago

Sorry, I forgot that I modified the class a little. Here's my full Response class:

import edu.stanford.nlp.sempre.Builder;
import edu.stanford.nlp.sempre.Derivation;
import edu.stanford.nlp.sempre.Example;

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Copied from edu.stanford.nlp.sempre.Master
public class Response {
    // Example that was parsed, if any.
    public Example ex;
    private Builder builder;

    // Which derivation we're selecting to show
    int candidateIndex = -1;

    // Detailed information
    public Map<String, Object> stats = new LinkedHashMap<>();
    public List<String> lines = new ArrayList<>();

    public String getFormulaAnswer() {
        if (ex.getPredDerivations().size() == 0)
            return "(no answer)";
        else if (candidateIndex == -1)
            return "(not selected)";
        else {
            Derivation deriv = getDerivation();
            return deriv.getFormula() + " => " + deriv.getValue();
        }
    }
    public String getAnswer() {
        if (ex.getPredDerivations().size() == 0)
            return "(no answer)";
        else if (candidateIndex == -1)
            return "(not selected)";
        else {
            Derivation deriv = getDerivation();
            deriv.ensureExecuted(builder.executor, ex.context);
            return deriv.getValue().toString();
        }
    }
    public List<String> getLines() { return lines; }
    public Example getExample() { return ex; }
    public int getCandidateIndex() { return candidateIndex; }

    public Derivation getDerivation() {
        return ex.getPredDerivations().get(candidateIndex);
    }

    public Response(Builder b) {
        this.builder = b;
    }
}

stbusch commented 3 years ago

Thank you!

stbusch commented 3 years ago

What is this referring to:

this.repository = repository;

?

thanks

retypepassword commented 3 years ago

@stbusch Looks like I missed some code removals when I did the copypasta. I've updated my comment to remove that line. It was referring to a Hibernate/JPA repository.

stbusch commented 3 years ago

@retypepassword: Thanks for the quick answer. Could you tell me whether I understand the following points correctly:

thanks

retypepassword commented 3 years ago

@stbusch Yes, I suppose so. To point two, yes, but I think you'd have to change it right before creating the new parser. See comment above: https://github.com/percyliang/sempre/issues/186#issuecomment-424089685

stbusch commented 3 years ago

@retypepassword Have you tried/managed to use CoreNLPAnalyzer instead of SimpleAnalyzer in your Java project?