Open stbusch opened 4 years ago
Hi! The parent abstract class `Formula` defines a few abstract methods: `toLispTree`, `forEach`, `map`, and `mapToList`. These are utility functions for traversing `Formula` objects, which are tree structures. The method `forEach` in particular can be used to perform something recursively in the tree. However, it looks like the `forEach` method is never used by anyone. The more general methods `map` and `mapToList` have been used a few times.
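As an aside, the traversal pattern these utilities implement can be sketched with a toy class (made-up names, not SEMPRE's real `Formula`):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Toy stand-in for a Formula tree (made-up class, not SEMPRE's real
// Formula): each node has a label and children, and mapToList applies
// a visitor function to every node, collecting the non-null results.
class ToyFormula {
    final String label;
    final List<ToyFormula> children = new ArrayList<>();

    ToyFormula(String label, ToyFormula... kids) {
        this.label = label;
        for (ToyFormula k : kids) children.add(k);
    }

    // Analogous to mapToList: visit every node in pre-order and
    // collect all non-null results of func.
    <T> List<T> mapToList(Function<ToyFormula, T> func) {
        List<T> out = new ArrayList<>();
        T res = func.apply(this);
        if (res != null) out.add(res);
        for (ToyFormula child : children) out.addAll(child.mapToList(func));
        return out;
    }
}
```

For example, `f.mapToList(n -> n.children.isEmpty() ? n.label : null)` collects the labels of all leaves of the tree.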
Thanks for the explanations. However, I can't see yet where `map` and `mapToList` are used. My IDE (IntelliJ) actually says they aren't. My approach to understanding the code now is going through the JavaExecutor's `processFormula` with an example `CallFormula`. I basically do get how the formula is recursively processed here, but it's unclear to me which information is carried in the `Evaluation` field of the `Response` object that is returned by that method. Any hints? Thanks!
The Executor can use the `stats` field (an `Evaluation` object) to store arbitrary statistics about the execution (e.g., how much time it takes to execute). See `freebase/SparqlExecutor.java` (line 253) for an example. I think `JavaExecutor` does not use this field. The statistics will be gathered across all executed formulas (`Parser.java` line 356) and reported at the end of the run.
I found `mapToList` used in niche places like `freebase/LexiconFn.java` (line 265) and `overnight/OvernightFeatureComputer.java` (line 142). I think it's safe to ignore these utility methods for now.
Thanks again. I'm at the Batch Learning chapter of the tutorial now, trying to understand the corresponding parts of the SEMPRE code. The classes `FeatureExtractor` and `Rule` refer to rules being "anchored" or "floating". In your paper "Macro Grammars and Holistic Triggering for Efficient Semantic Parsing" you say:

"..., in which logical forms are either triggered by specific phrases (anchored) or can be triggered in any context (floating)."

Does this mean an anchored rule is applicable only if a (sub)span of tokens matches the phrase of the rule's definition exactly (and not just a rhs category), and any rule without that property is floating? Thanks
The anchored vs floating distinction is only applicable to the FloatingParser. Other parsers (e.g., BeamParser) assume that all rules are anchored.
During parsing, the parser builds `Derivation` objects representing complete or partial parses. The `Rule`s in the grammar specify how larger Derivations can be constructed from smaller ones. Each Derivation has a category string (e.g., $TOKEN, $Entity, or $ROOT) and a formula (a `Formula` object). An anchored Derivation is bound to an utterance span (`start` and `end`). A floating Derivation is not bound to a span (`start` and `end` are set to -1) but instead has a `size` (explained below).

An anchored rule like `(rule $foo (biggest $bar) (...) (anchored 1))` can only be applied if $bar is an anchored Derivation whose anchoring span appears right after the query token "biggest". If the $bar Derivation is anchored to span [i, j], and token i-1 is "biggest", the newly constructed $foo would be anchored to span [i-1, j]. A floating rule like `(rule $foo ($bar $baz) (...))` combines any floating Derivations $bar and $baz into a $foo. Note that the (...) part describes how the formula of the $foo Derivation is built from the children's formulas. If this construction fails (e.g., when the formulas do not type check), the parent Derivation is not constructed.

Thanks for your elaborate explanation!
Now I'm trying to track from the `main` method how SEMPRE is set up for interaction. Apparently the fig/... supporting packages are used quite heavily, but these are not part of SEMPRE, and I'm not sure which parts of them are relevant in the particular context of SEMPRE. The `main` method indirectly calls the static method `Execution.runWithObjArray(String[] args, Object[] objects)`, with `objects` containing `("Main", new Main(), Master.getOptionsParser())`. This method starts with the call `init(args, objects)`, which in turn calls the `register` method of the `OptionsParser` class. Here, the interface `OptionSet` is used to get the annotations of the fields of the objects in the array `objects`, but it seems that in SEMPRE this interface (or a class implementing it) is never used. Is this correct, and if so, can I neglect this `register` method in the process of understanding how the setup of SEMPRE for interaction is implemented? Thanks again.
Fields of `Main` that are annotated with `@Option` will be registered. The `Option` annotation (from `fig.basic`) and `OptionSet` don't interact, so I'm not sure why `register` successfully parses these `Option` fields either. In `Master.getOptionsParser`, the `OptionsParser` also registers options from the classes specified in the `module-classes.txt` file (generated during ant compilation). AFAIK this was a hack so that the options of irrelevant packages do not have to be parsed (e.g., the `tables.*` classes will not be processed if the `tables` ant target was unused). In each registered class, the `@Option`-annotated fields under the `opts` object (defined by `public static Options opts = new Options();` in most classes) will be registered.

Thanks again. The background of all my questions is that I would like to create a plugin for an existing NLP Java application that enhances it with SEMPRE functionality. That requires some deeper understanding of the code. As I see it, the real entry point to SEMPRE is not the `main` method, but the Ruby script `run` in combination with `execrunner`. I am completely new to Ruby, and I wonder if a complete understanding of these scripts is necessary for my goal. Are they basically "just" setting up the environment to pass parameters to the Java code from the command line (so that understanding the Java code would probably be sufficient for me)? I have established that `run` basically defines the available modes and then adds them to the global variable `$modes`, and maps the modes to their functionality in `modesMap`, based on the information stored in `$modes`.

The final command `run!(sel(:mode, modesMap))` most likely leads to the environment being set up according to the chosen mode, though I have failed to track that in detail so far. So, the question is: do I need to understand the Ruby scripts in detail? (Learning Ruby is certainly worthwhile, but time pressure forces me to set priorities.) Thanks!
You might not need to understand the Ruby script in detail. The Ruby script is just a tool for creating a Java command (or multiple commands) to run. To any Ruby command in the README or TUTORIAL, you can add `-n` to print the Java command and exit. That Java command can be invoked directly without using the Ruby script.
Thanks again. That will save me some time. Now I'm back at the Batch Learning chapter of the tutorial. There it says:

The rule feature domain tells the feature extractor to increment the feature each time the grammar rule is applied in the derivation.

Could you please tell me if I understand this correctly: `extractRuleFeatures` in class `FeatureExtractor` must play a main role here, but I don't see yet how it counts all rule applications in the derivation tree.
Thanks again!

The `rule` features of a derivation tree are the rules used when building that tree (the feature names = the rules serialized as strings). Different derivations can thus differ in which `rule` features are present. The feature vector is implemented as a Map-like object (from feature names to feature values), so the size can be increased when new features are discovered.

Regarding rule computation: while one could freshly compute the `rule` features for each new derivation, the code does it recursively: to compute the `rule` features for derivation D, combine the rule features of the child derivations of D, then add the rule R applied when constructing D. The `localFeatureVector` field of D only stores this final rule feature R. During scoring or gradient updates, the features will be gathered recursively.
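The recursive gathering described above can be sketched with toy classes (made-up names, not the real `Derivation`):

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch (made-up class, not the real Derivation) of the scheme
// described above: each derivation stores only the rule applied at
// its own root, and the full list of "rule" features for the tree is
// gathered recursively from the children.
class ToyDeriv {
    final String localRule;          // the rule R applied at this node
    final List<ToyDeriv> children;

    ToyDeriv(String localRule, List<ToyDeriv> children) {
        this.localRule = localRule;
        this.children = children;
    }

    // Combine the rule features of the children, then add this
    // node's own rule.
    List<String> gatherRuleFeatures() {
        List<String> feats = new ArrayList<>();
        for (ToyDeriv child : children)
            feats.addAll(child.gatherRuleFeatures());
        feats.add(localRule);
        return feats;
    }
}
```

A tree built from rules R1 and R2 under a root rule R3 thus yields the feature list [R1, R2, R3], even though each node only stored its own rule.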
Thanks again, ppasupat. While I get the idea, I can't seem to find where this recursive gathering of features and calculating of scores is actually located in the code. The class `FeatureExtractor` contains the method `extractLocal`, which apparently calls the respective extraction method for each activated feature. The comment here says "This function is called on every sub-Derivation, ...", but where does that happen? I found that `extractLocal` is used by the `ParserState` class and its subclasses within the method `featurizeAndScoreDerivation`, but I don't see the recursion that takes the whole derivation tree into account. The method `computeScore` in the class `Derivation` does recursively calculate all scores, but why is it used only by the `ReinforcementParser`? I would like to gain some more insight here. Thanks!
Background
The method `parse` in `Parser` is responsible for parsing an utterance. It calls `ParserState state = newParserState(params, ex, computeExpectedCounts)` and then `state.infer()` to construct the Derivations. These two methods should be overridden in subclasses of `Parser`. The `state.infer()` call should populate the `predDerivations` field of `state` with the final Derivations for the example.
(Note: `computeExpectedCounts` means whether to compute the information necessary for gradient updates. The flag is turned on during training and off during evaluation.)
Let's use `BeamParser` as an example. Ignore the `coarse{State,Prune}` stuff (and whenever `BeamParserState.mode == bool`), which is only for speeding up parsing.
The `newParserState` method constructs a new `BeamParserState` object (which is in the same file, `BeamParser.java`). The `infer` method of `BeamParserState` constructs Derivations over utterance spans of increasing sizes using the `build` method, and then calls `setPredDerivations` (defined in the parent class `ChartParserState`) to populate the `predDerivations` field.
Tracing the method calls from `build` leads to the `applyRule(start, end, rule, children)` method, which constructs Derivations from the given children using the given grammar rule. (Ignore the `mode == bool` section.) The produced derivations `newDeriv` are featurized and scored by `featurizeAndScoreDerivation` (defined in `ParserState`). This method calls `extractLocal` to extract features and then `computeScoreLocal` to compute the score.
Feature extraction and scoring
Feature extraction and scoring are actually not done recursively (sorry for my confusion earlier). Rather, it uses dynamic programming. Let's say a derivation A has one child B, A has local features {a1, a2}, and the total score of all features in B (including the features of its descendants) is already computed and stored in B.score. Then the total score of A, A.score, can be computed as A.score = params.getWeight(a1) + params.getWeight(a2) + B.score.
The `extractLocal` method (in `FeatureExtractor`) extracts features only for the current derivation and not the children. For example, the "rule" feature (in `extractRuleFeatures`) will define only one feature for the rule used to build the derivation. The rules used for building the children should have already been extracted when the children were constructed (and `featurizeAndScoreDerivation` was called on them).
The `computeScoreLocal` method (in `Derivation`) computes `score` (the total score of all features) = the total score of the local features + the sum of the `score` fields of the children. Like the features, the `score` field of each child should already have been populated when the child was constructed.
In `ReinforcementParser`, there is no guarantee that the scores of the children derivations have been populated before the parent is constructed (I think), so `computeScore` is used instead of `computeScoreLocal` to recursively score the children.
Note that this dynamic programming method only works for linear models (which SEMPRE uses). A model that considers interactions between features would not work.
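The dynamic-programming scoring described above can be sketched as follows (made-up names, not SEMPRE's actual API):

```java
import java.util.List;
import java.util.Map;

// Toy sketch of the dynamic-programming scoring described above
// (made-up names, not SEMPRE's API). For a linear model, the total
// score of a derivation is the weighted sum of its *local* features
// plus the already-computed total scores of its children.
class DpScore {
    static double scoreLocal(Map<String, Double> weights,
                             List<String> localFeatures,
                             double[] childScores) {
        double score = 0;
        for (String f : localFeatures)
            score += weights.getOrDefault(f, 0.0);  // like params.getWeight(f)
        for (double s : childScores)
            score += s;                             // children already scored
        return score;
    }
}
```

With weights {a1: 1.0, a2: 2.0} and one child whose total score is 3.0, a derivation with local features {a1, a2} gets the total score 6.0, matching the A.score example above.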
Thank you! With your explanations and further studying of the code, I now believe I have understood the following:

The field `chart` of the class `(Beam)ChartParserState` keeps track of the derivations detected so far. It does so by maintaining a two-dimensional map. The two indices stand for `start` and `end` of a (sub)span. The key is the name of a (target) category. The value is a list of the derivations that have been found for that particular span with that particular category.
The `infer` method of `BeamParserState` first finds the "primitive" derivations, which are based only on tokens and phrases. To achieve this, `gatherTokenAndPhraseDerivations` is called for each subspan of the given span. The results are added to `chart`.
`infer` then calls `build` on subspans of increasing length, for all possible start indices within the span, from left to right. For each of these subspans, in that order, `build` calls `applyCatUnaryRules` and `applyNonCatUnaryRules`. These try to apply all available rules to the respective subspan. Every time a rule is applicable to existing derivations in `chart` (which, in the case of `applyCatUnaryRules`, is when its rhs matches the category these derivations are mapped to in `chart` for the respective subspan), `applyRule` is called. This method executes the rule that was found applicable, creates a new derivation, and puts it in `chart`. (OK, for `applyNonCatUnaryRules` it is a bit more complex, but I think I got the principle of how the most important methods interact.) So for each subspan `build` is called on, rules can potentially be applied to the derivations created on previous subspans.
`applyRule` also extracts the features of the newly added derivation. It then computes the "total" score of that derivation as the sum of its weighted local features and the scores of its children.
Finally, `infer` calls `setPredDerivations`. This puts all derivations that cover the whole original span and have category `$ROOT` (and so qualify to be returned to the user) into the field `predDerivations` of `BeamParserState`.
Please correct me if I'm wrong.
Yes, these are all correct. Different parsers might construct logical forms in different orders (e.g., the second stage of FloatingParser does not care about `start` and `end`; the ReinforcementParser learns to pick which partial Derivation to expand; etc.), but the general ideas about applying rules, scoring, and then collecting `predDerivations` should be the same.
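The span enumeration summarized above can be sketched as a toy loop (not the actual BeamParser code):

```java
import java.util.ArrayList;
import java.util.List;

// Toy enumeration (not the actual BeamParser code) of the order in
// which the chart loop visits subspans: increasing length, and for
// each length all start positions from left to right, so that any
// shorter subspan is processed before a longer span that contains it.
class ChartLoop {
    static List<int[]> spanOrder(int numTokens) {
        List<int[]> order = new ArrayList<>();
        for (int len = 1; len <= numTokens; len++)
            for (int start = 0; start + len <= numTokens; start++)
                order.add(new int[] { start, start + len });
        return order;
    }
}
```

For a two-token utterance this yields [0,1], [1,2], [0,2]: both single-token spans are filled before the full span, so `build` on [0,2] can reuse their chart entries.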
Thanks. Next thing will be to investigate the implementation of the math behind it, the actual calculation of parameters and probabilities.
So I'm trying to get to the core of the math behind the learning process. I've brushed up my knowledge of Stochastic Gradient Descent, but I have problems mapping it to the code. The main method here is apparently `learn` in the `Learner` class. It calls `processExamples`, which uses `parseExample`, `checkGradient`, and `updateWeights` (which calls `update` in class `Params`).
The names suggest that the actual updating happens in the last two mentioned. But in the `checkGradient` method there is the line

```java
perturbedParams.getWeights().put(feature, perturbedParams.getWeight(feature) + eps);
```

which looks like the weights are updated here (but with the constant `eps` and not with a calculated SGD step). Also, which parts of the SGD algorithm do the fields `objectiveValue` and `expectedCounts` of the `ParserState` class, which are used by `checkGradient`, correspond to? Could you please provide some hints about the ML implementation? Thanks
Objective function and gradient

The score of a derivation is computed by `Derivation.computeScore(params)`.

How all this is done in the code

In `ParserState.computeExpectedCounts` and `Learner.processExamples`:

`computeExpectedCounts` (boolean) indicates whether the weights should be updated. The `expectedCounts` in the ParserState are gathered in the `counts` map until a certain number of examples (specified by `batchSize`) is processed. Then `updateWeights(counts)` is called on the accumulated expected counts (after which `counts` is reset). `updateWeights(counts)` essentially calls `params.update(counts)`. Ignoring L1-regularization, the function just does `MapUtils.incr(weights, f, stepSize * g);`. This means adding (expected count of f * learning rate) to the weight of f. Regarding the `checkGradient` part: it is used to check if the gradient update is correct. It does not actually update the weights.

A question about how to tell a `FeatureExtractor` which features it is supposed to use: I understand these features are specified on the command line using `-FeatureExtractor.featureDomains ......`. But where in the code are these actually added to the HashSet `featureDomains` in the class `Options` of the `FeatureExtractor`?
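Going back to the gradient update described above, the `params.update(counts)` step amounts to the following sketch (toy types, not SEMPRE's `Params`; the real code uses fig's `MapUtils.incr`):

```java
import java.util.Map;

// Toy sketch of the update step described above: for each feature f
// with accumulated expected count g, add stepSize * g to its weight,
// mirroring fig's MapUtils.incr(weights, f, stepSize * g)
// (L1-regularization ignored; made-up types, not SEMPRE's Params).
class SgdUpdate {
    static void update(Map<String, Double> weights,
                       Map<String, Double> counts, double stepSize) {
        for (Map.Entry<String, Double> e : counts.entrySet())
            weights.merge(e.getKey(), stepSize * e.getValue(), Double::sum);
    }
}
```

For instance, a feature with weight 1.0 and expected count 2.0 moves to weight 2.0 under step size 0.5.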
Is it safe to say that, for parsing relatively simple utterances into logical forms that are meant to be executed by `JavaExecutor`, mostly the `BeamParser` class and the `rule` feature are relevant?
Using the BeamParser should be sufficient for the parser. For the features, the `rule` features alone might be insufficient (since they only count how many times each rule is used, regardless of the input utterance). The geo880 mode (for the Geoquery dataset) lists the following features:

```
rule opCount constant whType span lemmaAndBinaries denotation lexAlign joinPos skipPos
```

(line 1093 in the `run` script). This might be a good set of features to try (though some of them are not guaranteed to work correctly, depending on the grammar and executor used).
Thanks for the hint. I was asking because I'd like to offer the user a reasonable selection of features to choose from via checkboxes in my GUI. I'd also like to add a short description of each feature to the GUI. The "constant" feature seems to be located in the class `ConstantFn` directly. Am I right that the code segment

```java
res.addFeature("constant", ex.phraseString(c.getStart(), c.getEnd()) + " --- " + formula.toString());
```

registers an application of `ConstantFn` per subspan? Also, which part of the code actually triggers that registration? It seems that there is no method in `FeatureExtractor` taking care of the "constant" feature.
Feature firing in SEMPRE is pretty complicated due to legacy reasons.

The class `FeatureExtractor` gives an overview of the 3 ways features can be defined. Features defined in `FeatureExtractor` itself are activated via `FeatureExtractor.featureDomains`. Features defined in a `SemanticFn` (like the example you gave for `ConstantFn`) will be defined once the `SemanticFn` is called (i.e., when building the Derivation). For `ConstantFn`, the feature is defined on every subspan the corresponding rule fires on. For example, `(rule $TypeNP (meeting) (ConstantFn en.meeting))` detects the word "meeting" and creates a Derivation of category `$TypeNP` with formula `en.meeting`. The `res.addFeature(...)` line you mentioned will register a "constant" feature during the creation of the Derivation (`res`). The remaining features are extracted when `ParserState.featurizeAndScoreDerivation(deriv)` is called: the `FeatureExtractor` goes through all features defined locally in the `FeatureExtractor` plus any registered `FeatureComputer` classes.

I'm back at investigating the features. `whType` uses part-of-speech tagging. As far as I managed to trace it, the method `preprocess` in the `Example` class populates the field `posTags` in the field `languageInfo` by applying a `LanguageAnalyzer`. I'd like to know how it is determined which POS tags are actually added here, and whether it is possible to change (add/remove) POS tags manually. Thanks!
The types of POS tags depend on the `LanguageAnalyzer` used (specified by the flag `--LanguageAnalyzer.languageAnalyzer`). The available choices are `SimpleAnalyzer` (a dummy analyzer that only recognizes numbers) and `corenlp.CoreNLPAnalyzer` (which runs CoreNLP).
The POS tags returned by CoreNLP are based on the model used, with the default using Penn Treebank tags. The class CoreNLPAnalyzer postprocesses some of the tags (e.g., marking auxiliary verbs), and more postprocessing could be hacked in there.
Thanks. I will think over whether I need this feature. It's probably more relevant for database queries than for the cases the `JavaExecutor` handles.
Shouldn't it be possible to map strings to categories, e.g.

```
(rule $City (string "hamburg") (IdentityFn))
(rule $City (string "stuttgart") (IdentityFn))
(rule $City (string "munic") (IdentityFn))
(rule $ROOT ($City) (IdentityFn))
```

as a part/base of a more complex grammar? In my application, using these rules, these city names aren't recognized. Is the grammar itself wrong? Otherwise it would mean my application doesn't handle the grammar file correctly. Thanks
Could you try changing the rules to look like:

```
(rule $City (hamburg) (IdentityFn))
```
I had tried this; I then get:

```
ERROR: Composition failed: rule = $City -> hamburg (IdentityFn), children = []
java.lang.ArrayIndexOutOfBoundsException
Exception in thread "Thread-374" java.lang.RuntimeException: java.lang.ArrayIndexOutOfBoundsException
  at edu.stanford.nlp.sempre.BeamParserState.applyRule(BeamParser.java:165)
  at edu.stanford.nlp.sempre.BeamParserState.applyNonCatUnaryRules(BeamParser.java:224)
  at edu.stanford.nlp.sempre.BeamParserState.applyNonCatUnaryRules(BeamParser.java:231)
  at edu.stanford.nlp.sempre.BeamParserState.build(BeamParser.java:123)
  at edu.stanford.nlp.sempre.BeamParserState.infer(BeamParser.java:98)
  at edu.stanford.nlp.sempre.Parser.parse(Parser.java:170)
  ....
```

(The rest of the error message refers to classes in my application.)
What apparently works is:

```
(rule $City (hamburg) (ConstantFn (string hamburg)))
```

but that feels quite cumbersome.
The current grammar looks like this:

```
(rule $Dir (from) (ConstantFn (lambda x (call + (string "place of departure") (string :)(var x)))))
(rule $From ($Dir $City) (JoinFn forward))
(rule $Dir (to) (ConstantFn (lambda x (call + (string "arrival location") (string :)(var x)))))
(rule $To ($Dir $City) (JoinFn forward))
(rule $Part (by) (ConstantFn (lambda x (call + (string "transportation") (string :)(var x)))))
(rule $By ($Part $Transport) (JoinFn forward))
(rule $Transport (plane) (ConstantFn (string plane)))
(rule $Transport (train) (ConstantFn (string train)))
(rule $City (hamburg) (ConstantFn (string hamburg)))
(rule $City (stuttgart) (ConstantFn (string stuttgart)))
(rule $City (frankfurt) (ConstantFn (string frankfurt)))
(rule $City (munic) (ConstantFn (string munic)))
(rule $City (hannover) (ConstantFn (string hannover)))
(rule $ROOT ($From) (IdentityFn))
(rule $ROOT ($To) (IdentityFn))
(rule $ROOT ($By) (IdentityFn))
```

and manages to parse e.g. "from munic" into (string "place of departure:munic"). I also tried

```
(rule $ROOT ($From $By) (ConcatFn ""))
```

but entering "from munic by train" returned (string nullnull). Is there a way? Thanks!

Try

```
(rule $ROOT ($From $By) (lambda x (lambda y (call + (var x) (var y)))))
```

instead. I think ConcatFn can only concatenate raw strings, not results of `(call ...)`.
Thanks. With your hint, I have now set up the grammar this way:

```
(rule $Dir (from) (ConstantFn (lambda x (call + (string "from") (string ": ")(var x)))))
(rule $From ($Dir $City) (JoinFn forward))
(rule $Dir (to) (ConstantFn (lambda x (call + (string "to") (string ": ")(var x)))))
(rule $To ($Dir $City) (JoinFn forward))
(rule $Part (by) (ConstantFn (lambda x (call + (string "by") (string ": ")(var x)))))
(rule $By ($Part $Transport) (JoinFn forward))
(rule $Transport (plane) (ConstantFn (string plane)))
(rule $Transport (train) (ConstantFn (string train)))
(rule $City (hamburg) (ConstantFn (string hamburg)))
(rule $City (stuttgart) (ConstantFn (string stuttgart)))
(rule $City (frankfurt) (ConstantFn (string frankfurt)))
(rule $City (munic) (ConstantFn (string munic)))
(rule $City (hannover) (ConstantFn (string hannover)))
(rule $ROOT ($From) (IdentityFn))
(rule $ROOT ($To) (IdentityFn))
(rule $ROOT ($By) (IdentityFn))
(rule $ROOT (($PHRASE optional) $From ($PHRASE optional) $By ($PHRASE optional)) (lambda x (lambda y (call + (var x) (string ", ") (var y)))))
(rule $ROOT (($PHRASE optional) $From ($PHRASE optional) $To ($PHRASE optional)) (lambda x (lambda y (call + (var x) (string ", ") (var y)))))
(rule $ROOT (($PHRASE optional) $By ($PHRASE optional) $To ($PHRASE optional)) (lambda x (lambda y (call + (var x) (string ", ") (var y)))))
(rule $ROOT (($PHRASE optional) $By ($PHRASE optional) $From ($PHRASE optional)) (lambda x (lambda y (call + (var x) (string ", ") (var y)))))
(rule $ROOT (($PHRASE optional) $To ($PHRASE optional) $From ($PHRASE optional)) (lambda x (lambda y (call + (var x) (string ", ") (var y)))))
(rule $ROOT (($PHRASE optional) $To ($PHRASE optional) $By ($PHRASE optional)) (lambda x (lambda y (call + (var x) (string ", ") (var y)))))
(rule $ROOT (($PHRASE optional) $To ($PHRASE optional) $By ($PHRASE optional) $From ($PHRASE optional)) (lambda x (lambda y (lambda z (call + (var x) (string ", ") (var y) (string ", ") (var z))))))
(rule $ROOT (($PHRASE optional) $To ($PHRASE optional) $From ($PHRASE optional) $By ($PHRASE optional)) (lambda x (lambda y (lambda z (call + (var x) (string ", ") (var y) (string ", ") (var z))))))
(rule $ROOT (($PHRASE optional) $From ($PHRASE optional) $To ($PHRASE optional) $By ($PHRASE optional)) (lambda x (lambda y (lambda z (call + (var x) (string ", ") (var y) (string ", ") (var z))))))
(rule $ROOT (($PHRASE optional) $From ($PHRASE optional) $By ($PHRASE optional) $To ($PHRASE optional)) (lambda x (lambda y (lambda z (call + (var x) (string ", ") (var y) (string ", ") (var z))))))
(rule $ROOT (($PHRASE optional) $By ($PHRASE optional) $To ($PHRASE optional) $From ($PHRASE optional)) (lambda x (lambda y (lambda z (call + (var x) (string ", ") (var y) (string ", ") (var z))))))
(rule $ROOT (($PHRASE optional) $By ($PHRASE optional) $From ($PHRASE optional) $To ($PHRASE optional)) (lambda x (lambda y (lambda z (call + (var x) (string ", ") (var y) (string ", ") (var z))))))
```
The idea is to extract the main info from a passenger's utterance, who wants to order a ticket. There is still room for misunderstanding, e.g. "....preferably not by train.... " would still be parsed as "by train", but I'm planning to take care of this by sempre's learning capabilities.
About learning: the `handleCommand` method in the `Master` class contains this sequence:

```java
if (command.equals("accept") || command.equals("a")) {
  ex.setTargetFormula(response.getDerivation().getFormula());
  ex.setTargetValue(response.getDerivation().getValue());
  ex.setContext(session.getContextExcludingLast());
  addNewExample(ex);
}
```

The format created this way apparently can't be read with the `-Dataset.inPaths` option (calling the `readFromPathPairs` method). Of course I could change the format of the exported examples and re-use them, but it seems that is not the way intended by you SEMPRE authors. So, to re-use learning achievements in future sessions, should I go by re-importing the trained parameters? Thanks!
Importing the trained parameters is the intended way. The params are saved in the experiment directory after training in the normal batch learning mode (where you supply the dataset files).
From the interactive prompt mode (what you get when using the `-interactive` flag), after parsing a few examples and accepting a few candidates, you can also use the `(params FILENAME_HERE)` command to dump the current parameters to a file.
Thanks. In your tutorial, you use the example "What is three plus four times two?" and point out:

There should be two derivations, yielding (number 14) and (number 11), corresponding to either combining three plus four first or four times two first. Note that this is expected because we have not encoded any order of operations anywhere.

Now please assume that I'd like SEMPRE to learn the order of operations (multiplication having precedence over addition). What would be a recommendable feature representation here? The `rule` feature would probably not be sufficient, because that would represent just the number of rule applications but not their order. Or maybe the `rule` feature could work with more sophisticated rules that can distinguish between e.g. sums of products and products of sums, so that learning would have to encourage sums of products? No clear plan yet; inspiration would be appreciated.
To learn the order of operations, I think a more complex feature needs to be defined (e.g., the depth of the derivation tree, or a binary feature indicating whether the "add" rule was used in any child of the "multiply" rule). Such an ambiguity doesn't often occur in factoid questions, so there aren't features dealing with it.
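The second suggestion could be sketched as a hypothetical feature computation over a toy derivation tree (none of these names exist in SEMPRE):

```java
import java.util.List;

// Hypothetical feature sketch (none of these names exist in SEMPRE):
// a binary feature that fires when an "add" rule occurs anywhere
// below a "multiply" rule, letting the learner penalize derivations
// that compute the sum before the product.
class OpOrderFeature {
    static boolean addUnderMultiply(ToyNode d) {
        if (d.rule.equals("multiply") && containsRule(d, "add")) return true;
        for (ToyNode c : d.children)
            if (addUnderMultiply(c)) return true;
        return false;
    }

    // True if any strict descendant of d uses the given rule.
    static boolean containsRule(ToyNode d, String rule) {
        for (ToyNode c : d.children)
            if (c.rule.equals(rule) || containsRule(c, rule)) return true;
        return false;
    }
}

class ToyNode {
    final String rule;
    final List<ToyNode> children;

    ToyNode(String rule, ToyNode... kids) {
        this.rule = rule;
        this.children = List.of(kids);
    }
}
```

For "three plus four times two", the (number 14) parse puts the "add" derivation under the "multiply" one, so the feature fires; the (number 11) parse does not.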
For some of the intended use cases I'd like to create and export training examples (instead of params) in a format that makes them reusable for training. The training example from your tutorial had this format:

```
(example
  (utterance "three and four")
  (targetValue (number 7))
)
```

If I have several examples in the same file, are there additional rules for correct formatting to make them readable, e.g. separators between the examples? Would this format be OK?

```
(example
  (utterance "three times six plus nine")
  (targetValue (number 27))
)
(example
  (utterance "three times five plus seven")
  (targetValue (number 22))
)
(example
  (utterance "three times seven plus eight")
  (targetValue (number 29))
)
```
The additional spaces should be fine. Comments can also be added: begin the comment with `#`.
Thanks for this confirmation. I'm back at the earlier example of a passenger ordering a ticket. I would like to have the parsing result as a list of strings, e.g.

```
(list (string hamburg) (string munic) (string plane))
```

or

```
(list (string "from hamburg") (string "to munic") (string "by plane"))
```

Is there a way (a function) to write rules that create a list from non-list types of values?
Probably I should try to write a SEMPRE function that does this, but I am a bit out of practice when it comes to coding. Would it have to look approximately like this?

```java
public class ListFn extends SemanticFn {
  private ListValue listValue;

  public ListFn(ListValue listValue) {
    this.listValue = listValue;
  }

  @Override
  public DerivationStream call(Example ex, Callable c) {
    return new SingleDerivationStream() {
      @Override
      public Derivation createDerivation() {
        for (int i = 0; i < c.getChildren().size(); i++) {
          listValue.values.add(c.getChildren().get(i).getValue());
        }
        return new Derivation.Builder()
            .withCallable(c)
            .formula(new ValueFormula<>(listValue))
            .createDerivation();
      }
    };
  }
}
```

Also, would I have to add this class to the SEMPRE code (locally), or is there a way SEMPRE could use it from within my application? So far, my application could successfully use all SEMPRE classes it needed, but when I try to use a grammar that contains rules with this new function, I get a ClassNotFoundException. Thanks for any help.
The code you have looks about right. A new SemanticFn should be added to the SEMPRE code locally. SEMPRE uses reflection to resolve the name of the SemanticFn, so the SemanticFn has to be inside the SEMPRE library.
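For context, that reflection step can be sketched like this (a generic illustration, not SEMPRE's actual resolution code):

```java
// Generic illustration (not SEMPRE's actual resolution code) of the
// reflection step: a class name from the grammar file is resolved to
// a class and instantiated via its no-argument constructor, which is
// why the class must be visible on SEMPRE's classpath.
class ReflectionSketch {
    static Object construct(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException("Could not instantiate " + className, e);
        }
    }
}
```

If the named class is not visible to the loading code (as with your ClassNotFoundException), this lookup fails at grammar-loading time.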
Thanks. I'm back at trying to get a better understanding of the features you listed earlier. Probably for simple subjects like the ticket order I won't need them all, but still: what does the `skipPos` feature represent?

Looks like skipPos is defined here. It's only defined on SelectFn, which selects a certain child as the output. It seems like skipPos looks at the children that were not chosen and combines the parts of speech of the tokens under them. Not sure how helpful this feature is in general; it seems pretty task-specific.
Thanks. I'm trying to apply a grammar using the `ListFn` class (which I posted before) in my application. While grammars using only the functions that you had included in SEMPRE work fine, in this case I get:

```
... Caused by: java.lang.InstantiationException: edu.stanford.nlp.sempre.ListFn
... Caused by: java.lang.NoSuchMethodException: edu.stanford.nlp.sempre.ListFn.<init>()
```

So I have to add the `init` method. I tried to understand its purpose by looking at its usage in the other `SemanticFn` subclasses. I changed my code to this:

```java
public class ListFn extends edu.stanford.nlp.sempre.SemanticFn {
  private List list;

  public ListFn(List list) {
    this.list = list;
  }

  @Override
  public DerivationStream call(Example ex, Callable c) {
    return new SingleDerivationStream() {
      @Override
      public edu.stanford.nlp.sempre.Derivation createDerivation() {
        return new Derivation.Builder()
            .withCallable(c)
            .formula(new ValueFormula<>(new ListValue(list)))
            .createDerivation();
      }
    };
  }

  public void init(LispTree tree) {
    super.init(tree);
    list = new ArrayList();
    for (int i = 0; i < tree.children.size(); i++) {
      list.add(tree.child(i).value);
    }
  }
}
```

Does this look reasonable? Thanks
This looks reasonable, though you should start the for loop (in `init`) from 1 instead of 0. The input `tree` looks like `(ListFn foo bar)`, and you don't want to add `"ListFn"` to `list`.
Thanks! I'm back at the features. I'm afraid I could get an intuitive understanding of what they do/represent only for the `rule` and the `constant` features. Is there an official description of the others available, in the way the `rule` feature was explained in the tutorial?

I don't think it's documented anywhere. When I was using SEMPRE (mostly the tables package) I defined my own features and didn't use any of the default ones. That said, I had some partial notes on what each feature does here:
Thanks for explaining! I'm at the moment back at the learning algorithm. When you said here: https://github.com/percyliang/sempre/issues/217#issuecomment-724443027

Define p(deriv) = exp[score(deriv)] / sum_deriv' exp[score(deriv')]

did you mean the "softmax function" (https://en.wikipedia.org/wiki/Softmax_function)?
Yes, that's a softmax over the scores.
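For reference, a minimal sketch of that softmax over derivation scores, with the usual max-subtraction for numerical stability:

```java
// A small sketch of the softmax over derivation scores:
// p(deriv_i) = exp(score_i) / sum_j exp(score_j), computed with
// max-subtraction so that large scores do not overflow Math.exp.
class Softmax {
    static double[] softmax(double[] scores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) max = Math.max(max, s);
        double sum = 0;
        double[] probs = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            probs[i] = Math.exp(scores[i] - max);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) probs[i] /= sum;
        return probs;
    }
}
```

Two derivations with equal scores thus each get probability 0.5, and raising one score shifts probability mass toward that derivation.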
Thanks. In the internal handling of the machine learning algorithms, is there any concept of negative training examples? Or does any result not "accepted" by the user implicitly count as negative?
Hi, I'm trying to get some deeper insight into the code of SEMPRE. For the moment, I'm looking at the simple Java logical forms and their execution by the class JavaExecutor. The class CallFormula features the method `forEach`, and I can't see its exact purpose. Intuitively, I think it manages the recursive decomposition of a more complex CallFormula, and I guess the Boolean returned by `func` says whether the base case of the recursion is reached, but again, I don't get exactly how. Also, it seems that this `forEach` method is not even used in the JavaExecutor's methods `execute` and `processFormula`, which apparently are responsible for the execution. Can someone help me understand the logic behind all this? Thanks!