joshuabowers / pynt

An interactive fiction hub

Command Parser: Complex Sentences #19

Open joshuabowers opened 12 years ago

joshuabowers commented 12 years ago

The current command parser is incredibly simplistic: it only supports an action/object pair, assumes that the action is a single word (rather than something potentially more complex), does not allow the inventory to be used with the actionable object, etc. It would be nice to support a slightly more robust set of commands.

joshuabowers commented 11 years ago

This is a rather fascinating problem, given that it does, to an extent, require a certain amount of natural language processing. I could, of course, hack out something homebrew, which would inevitably be nowhere near correct, robust, or necessarily functional. So, going with some prebuilt libraries and gems would be really nice.

A quick search suggests that either treat or some combination of linguistics, linkparser, and WordNet would be potentially viable options.

Linkparser was (relatively) easy to get installed into the pynt environment. While it is easy enough to use to parse sentences, it seems rather limited, and some of that is due to underlying limitations in link-grammar. Linkparser is robust enough to find the nouns of a given sentence, the (principal) verb of the sentence, and the subject and object of the sentence. Of these, the subject is not useful, as user-entered commands are expected to be in the imperative mood. However, using linkparser to extract a verb and object would be useful. What is more questionable is how to go about extracting adjectives and adverbs for differentiating widgets. The main worry, though, regards extracting indirect objects from a sentence, in the event that the use of such is important for properly handling the action being conducted. (E.g., it is of no use to me to be able to parse "Open the red door with the bronze key." if the object "bronze key" is not easily specifiable.) Linkparser has a concept called "linkages", which is probably where the answer lies.

Linguistics looks interesting, though I have not played with it. It might actually be a good package for supplementing or supplanting some of the inflection-oriented work that ActiveSupport offers. It also makes parsing sentences even easier when paired with Linkparser.

WordNet looks like it might be a goldmine waiting to happen, if properly utilized. In point of fact, it almost looks like it could be used in lieu of #14: utilize WordNet to look up related words for the verbs and nouns extracted via linkparser, which could then make pynt explosively powerful. The only problem, as a minor quibble, is that the linked gem depends upon a SQL library version of WordNet, rather than an online, flatfile, or NoSQL variant. While that does not make this a nonstarter, it does decrease my willingness to utilize it. On the other hand, I could, potentially, build a local lookup model in Mongoid by querying WordNet for the relevant words whenever a room is saved. So, questions!

I have done nothing but a cursory glance at treat. It looks more feature-rich than the other three libraries, but I don't know exactly what its dependencies are.

joshuabowers commented 11 years ago

After spending some more time going over the docs for linkparser, reading a bit of the original paper defining link-grammar, and spending some time looking at the data structures returned by parsing some sample sentences, linkparser makes a lot more sense.

Each parsed sentence has a set of LinkParser::Linkages, which, while important, seem to be slight syntactic sugar around lower level structs representing the actual links formed between the words. My hangup when I was looking at this was how to actually be able to ascertain more information about a sentence beyond its subject, verb, and object.

First up, each sentence (well, more importantly, its first linkage, as sentence delegates to it) has a set of links representing the connections between each word. These range from the banal pairing of a noun with its article, to the slightly more exotic pairing of a noun to an adjectival phrase. Links even exist which showcase adpositional phrases within the sentence.

Some extra work is needed to extract and figure out the relevant pieces of this information. However, it does not seem to be too taxing: each link has a label property, which identifies the kind of link, and an lword and rword, representing the two connected words. (There is even a desc field, which may be used to learn about what the link represents.) The labels are regular and easily matchable. So, a quick approach to handling linkages would be to find the desired word (as an rword for most nouns), and see what links to it with a particular type of label.
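A minimal sketch of that lookup, under some loud assumptions: `Link` is a hand-rolled stand-in for the structs linkparser returns (only `lword`, `rword`, and `label` are modeled), the link list is written out by hand for "Open the red door with the bronze key." rather than produced by a real parse, and `words_linked_to` is a hypothetical helper, not part of linkparser.

```ruby
# Stand-in for linkparser's link structs (assumption: real links expose
# lword, rword, and label under these names).
Link = Struct.new(:lword, :rword, :label)

# Hand-written links for "Open the red door with the bronze key."
LINKS = [
  Link.new("LEFT-WALL", "open.v", "Wi"),
  Link.new("open.v", "with", "MVp"),
  Link.new("open.v", "door.n", "Os"),
  Link.new("the", "door.n", "Ds"),
  Link.new("red.a", "door.n", "A"),
  Link.new("with", "key.n", "Js"),
  Link.new("the", "key.n", "Ds"),
  Link.new("bronze.s", "key.n", "AN")
].freeze

# Return the left-hand words attached to `word` (as the rword) by a link
# whose label matches `pattern`.
def words_linked_to(links, word, pattern)
  links.select { |l| l.rword == word && l.label.match?(pattern) }
       .map(&:lword)
end

p words_linked_to(LINKS, "door.n", /\AA/)  # => ["red.a"]
p words_linked_to(LINKS, "key.n", /\AA/)   # => ["bronze.s"]
p words_linked_to(LINKS, "door.n", /\AD/)  # => ["the"]
```

Matching on a label prefix (here `A` for the modifier links and `D` for determiners) rather than on the full label keeps the lookup tolerant of subscripts such as the `s` in `Ds`.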

Regarding WordNet: Homebrew conveniently has a formula for the WordNet local software. It might be possible to just execute shell commands to the main binary (wn) and parse the output.

Treat is bulky. Installing it required a rather large download of a lot of additional material. It also has some external binary dependencies, which could be problematic. While it does make certain types of manipulations of sentences easy, I'm not sure the benefits outweigh the costs, especially as I better grok what linkparser gives me.

joshuabowers commented 11 years ago

Fascinating! Each linkage is a successful parsing of a sentence in the chosen language. However, given the complexities in attempting to parse context-free grammars, there isn't a single way to do this; hence, there are multiple different ways to successfully link the component words together. Here is an example using the sentence from a previous post:

require 'linkparser'
d = LinkParser::Dictionary.new
s = d.parse("Open the red door with the bronze key.")
# Print the link diagram for every linkage found for the sentence.
s.linkages.each { |l| puts l.diagram }

    +--------------------------Xp-------------------------+
    |        +----------MVp---------+                     |
    |        +-------Os-------+     +--------Js-------+   |
    |        |    +-----Ds----+     |   +------Ds-----+   |
    +---Wi---+    |    +---A--+     |   |      +--AN--+   |
    |        |    |    |      |     |   |      |      |   |
LEFT-WALL open.v the red.a door.n with the bronze.s key.n . 

    +--------------------------Xp-------------------------+
    |        +-------Os-------+     +--------Js-------+   |
    |        |    +-----Ds----+     |   +------Ds-----+   |
    +---Wi---+    |    +---A--+--Mp-+   |      +--AN--+   |
    |        |    |    |      |     |   |      |      |   |
LEFT-WALL open.v the red.a door.n with the bronze.s key.n . 

So, these are link diagrams for two of the possible linkages for this sentence. (Linkparser was able to create 6.) In these two cases, the prepositional phrase "with the bronze key" is attached to the rest of the sentence via two different pathways. The first is via the MVp link:

> s.linkages[0].links[2]
 => #<struct Struct::LinkParserLink lword="open.v", rword="with", length=4, label="MVp", llabel="MV", rlabel="MVp", desc="connects verbs and adjectives to modifying phrases that follow, like adverbs (\"The dog RAN QUICKLY\"), prepositional phrases (\"The dog RAN IN the yard\"), subordinating conjunctions (\"He LEFT WHEN he saw me\"), comparatives, participle phrases with commas, and other things."> 

while the second is done via Mp:

> s.linkages[2].links[5]
 => #<struct Struct::LinkParserLink lword="door.n", rword="with", length=1, label="Mp", llabel="M", rlabel="Mp", desc="connects nouns to various kinds of post-noun modifiers: prepositional phrases (\"The MAN WITH the hat\"), participle modifiers (\"The WOMAN CARRYING the box\"), prepositional relatives (\"The MAN TO whom I was speaking\"), and other kinds.">

That is, in the first case, the link is between the verb and a prepositional phrase, while the second is between a noun and a prepositional phrase. While the exact semantics are a bit different, the net result is something that I might be able to work with.
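As a rough sketch of how both attachment styles might be reduced to the same command, the following walks the links by hand. Everything named here is an assumption for illustration: `SampleLink` stands in for linkparser's link structs, the link lists are copied by hand from the two diagrams above, and `bare`, `modifiers_of`, and `parse_command` are hypothetical helpers. It assumes the direct object hangs off an `O` link, the preposition off either an `MV` link from the verb or an `M` link from the object, and the preposition's noun off a `J` link.

```ruby
# Stand-in for linkparser's link structs; only three fields are modeled.
SampleLink = Struct.new(:lword, :rword, :label)

# Links common to both linkages, hand-copied from the diagrams above.
COMMON_LINKS = [
  SampleLink.new("LEFT-WALL", "open.v", "Wi"),
  SampleLink.new("open.v", "door.n", "Os"),
  SampleLink.new("the", "door.n", "Ds"),
  SampleLink.new("red.a", "door.n", "A"),
  SampleLink.new("with", "key.n", "Js"),
  SampleLink.new("the", "key.n", "Ds"),
  SampleLink.new("bronze.s", "key.n", "AN")
].freeze

# First diagram: the preposition hangs off the verb (MVp).
VERB_ATTACHED = (COMMON_LINKS + [SampleLink.new("open.v", "with", "MVp")]).freeze
# Second diagram: the preposition hangs off the noun (Mp).
NOUN_ATTACHED = (COMMON_LINKS + [SampleLink.new("door.n", "with", "Mp")]).freeze

# Drop the suffix link-grammar appends to words ("door.n" -> "door").
def bare(word)
  word.sub(/\.\w+\z/, "")
end

# Words attached to `noun` by A-type links (A, AN), with suffixes removed.
def modifiers_of(links, noun)
  links.select { |l| l.rword == noun && l.label.start_with?("A") }
       .map { |l| bare(l.lword) }
end

# Reduce a list of links to a verb, its object, and an optional
# prepositional target. Returns nil when no object link is present.
def parse_command(links)
  object_link = links.find { |l| l.label.start_with?("O") }
  return nil unless object_link

  verb = object_link.lword
  object = object_link.rword

  # The preposition may be linked from the verb (MV*) or the object (M*).
  prep_link = links.find do |l|
    (l.lword == verb && l.label.start_with?("MV")) ||
      (l.lword == object && l.label.start_with?("M"))
  end

  target = nil
  if prep_link
    j_link = links.find { |l| l.lword == prep_link.rword && l.label.start_with?("J") }
    target = j_link.rword if j_link
  end

  {
    verb: bare(verb),
    object: bare(object),
    object_modifiers: modifiers_of(links, object),
    preposition: prep_link && bare(prep_link.rword),
    target: target && bare(target),
    target_modifiers: target ? modifiers_of(links, target) : []
  }
end

# Both attachment styles reduce to the same command.
puts parse_command(VERB_ATTACHED) == parse_command(NOUN_ATTACHED)  # prints true
```

Because both pathways end at the same `J` link, the difference between MVp and Mp stops mattering once the preposition has been located, which is why the two linkages yield the same result here.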