mrmechko / pytrips

GNU General Public License v2.0

Overly reliant on string prefixes (Query wordnet mapping without `q::` prefix) #38

Open LouisJenkinsCS opened 4 years ago

LouisJenkinsCS commented 4 years ago

Hello developers,

While using PyTrips, I have repeatedly found that its functionality is overly reliant on strings: the prefixes q::, wn::, ont::, etc. all appear to be required to get the appropriate behavior. This has been a significant source of frustration for me, and I'd like to give an example of why, and of how I hope the PyTrips API can be improved.

In my final project for Natural Language Processing, I have an AMR parse of the following sentence...

Sentence: Hallmark could make a fortune off of this guy.

This produces an AMR parse that looks like the following...

(p / possible
      :domain (m / make-05
            :ARG0 (c / company :name (n / name :op1 "Hallmark"))
            :ARG1 (f / fortune
                  :source (g / guy
                        :mod (t / this)))))

I have been mostly successful in using the TRIPS lexicon (get_word) to obtain the possible ontological mappings for every concept except fortune, which produces what I believe to be a nonsensical ontological type: ont::cookies. It is so far out of left field that it makes me reconsider how to proceed: I can't choose the most likely candidate when the only candidate provided is clearly wrong. I tried to obtain the wordnet mapping (get_wordnet), but it returns an empty list. The definition (get_definition) throws an exception, and lookup requires a pos.

Now after spending more time than I should have, I eventually found that I can obtain the wordnet mappings, but only by invoking make_query('q::fortune') which returns a dictionary of exactly what I want to see...

{'lex': [ont::cookies],
 'wn': [ont::assets, ont::luckiness-scale, ont::situation]}

My issue is: why doesn't get_wordnet return this? I am working with strings, yes, but I feel I shouldn't have to prepend q:: to each query; instead there should be explicit functions and/or methods that produce the same results. I.e., if get_wordnet returned [ont::assets, ont::luckiness-scale, ont::situation], I would be satisfied. I am not certain what it does right now. Also, lookup requires a pos, and I cannot find documentation on what that means or what values it expects.

I am requesting that PyTrips provide some kind of enhanced API that can obtain this type of information without relying on string manipulation.
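A minimal sketch of what such prefix-free helpers could look like. Everything here is an assumption for illustration: `StubOntology` is a hypothetical stand-in that only mimics the `q::` behavior quoted above, it uses plain strings where pytrips returns ontology-type objects, and `query_word`, `wordnet_types`, and `lexicon_types` are made-up names, not part of pytrips.

```python
from typing import Dict, List


class StubOntology:
    """Hypothetical stand-in for the pytrips ontology object.

    It only reproduces the result quoted in this issue for 'q::fortune'.
    """

    _DATA = {
        "fortune": {
            "lex": ["ont::cookies"],
            "wn": ["ont::assets", "ont::luckiness-scale", "ont::situation"],
        }
    }

    def make_query(self, query: str) -> Dict[str, List[str]]:
        if not query.startswith("q::"):
            raise ValueError("this stub only understands 'q::' queries")
        word = query[len("q::"):]
        entry = self._DATA.get(word, {"lex": [], "wn": []})
        # Return copies so callers cannot mutate the stub's data.
        return {key: list(values) for key, values in entry.items()}


def query_word(ont, word: str) -> Dict[str, List[str]]:
    """Run a word query, adding the 'q::' prefix only if it is missing."""
    query = word if word.startswith("q::") else "q::" + word
    return ont.make_query(query)


def wordnet_types(ont, word: str) -> List[str]:
    """Ontology types reached through wordnet mappings."""
    return query_word(ont, word).get("wn", [])


def lexicon_types(ont, word: str) -> List[str]:
    """Ontology types listed directly in the TRIPS lexicon."""
    return query_word(ont, word).get("lex", [])


ont = StubOntology()
print(wordnet_types(ont, "fortune"))
print(lexicon_types(ont, "fortune"))
```

The point of the design is that the prefix is handled in exactly one place (`query_word`), so callers pass plain words and never build query strings themselves.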

LouisJenkinsCS commented 4 years ago

I would also like to request the following enhancement: provide a way to obtain all candidate ontological types for a word, independent of part of speech by default (pos as a default argument), gathering the candidates from both the TRIPS lexicon and the wordnet mappings into a single set.

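def get_word_all(self, word, pos=None):
    # Collect every candidate type once, regardless of where it came from.
    ret = set()
    # make_query returns a dict such as {'lex': [...], 'wn': [...]},
    # so iterate over its values (the lists), not its keys.
    for mapping in self.make_query(word, pos).values():
        for typ in mapping:
            ret.add(typ)
    return list(ret)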

I think this could be useful, given that the algorithm for determining which candidate to use is likely robust enough to handle the additional candidates. This way, get_word_all('fortune') would return [ont::cookies, ont::assets, ont::luckiness-scale, ont::situation], in some order.

mrmechko commented 4 years ago

Hi Louis, thanks for your input. I agree that pytrips has some unusual (annoying) quirks that I’d like to smooth out. The “ont::”, “w::”, and “q::” prefixes reflect how types are represented as package symbols in the Lisp code. I initially wrote pytrips as a convenience for exploring the ontology and mappings in a Python REPL, and a library kind of formed around it. I would love to build a consistent interface for it.

I’d also like to draw your attention to mrmechko/tripscli, which has many parser/parse-level functions that need to be merged into pytrips. If you are/were working on AMR parses in conjunction with TRIPS, I’d like to know more about your project, as I just finished writing some code to align AMR and TRIPS parses.

@bavalpey @hannah0n any input you guys have would be appreciated

mrmechko commented 4 years ago

Oh, and wrt the dict return type for “q::” queries: lexicon mappings and wordnet mappings are treated differently by the parser. Lex mappings have explicitly encoded syntactic templates in the lexicon, whereas for wordnet mappings the syntactic template has to be inferred. If you open a pull request with additional query functions I would be happy to merge it. The __getitem__ hack is very convenient for debugging sessions, so I might leave that interface as is.
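Since the two sources are treated differently, one option for a combined query function is to flatten the dict while keeping track of where each candidate came from. This is only a sketch under assumptions: `Candidate` and `tag_candidates` are hypothetical names, types are shown as plain strings, and the input is the dict shape quoted earlier in this thread.

```python
from typing import Dict, List, NamedTuple


class Candidate(NamedTuple):
    ont_type: str  # e.g. "ont::assets"
    source: str    # "lex" (explicit template) or "wn" (template inferred)


def tag_candidates(result: Dict[str, List[str]]) -> List[Candidate]:
    """Flatten a {'lex': [...], 'wn': [...]} result, keeping each type's source.

    Lexicon entries come first; a type found in both sources is kept once,
    under 'lex'.
    """
    seen = set()
    tagged = []
    for source in ("lex", "wn"):
        for ont_type in result.get(source, []):
            if ont_type not in seen:
                seen.add(ont_type)
                tagged.append(Candidate(ont_type, source))
    return tagged


result = {
    "lex": ["ont::cookies"],
    "wn": ["ont::assets", "ont::luckiness-scale", "ont::situation"],
}
for candidate in tag_candidates(result):
    print(candidate.source, candidate.ont_type)
```

A caller that does not care about the distinction can still take `[c.ont_type for c in tag_candidates(result)]`, while one that does can filter on `source`.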

mrmechko commented 4 years ago

Yet another comment: pos should be a string of one or more of the characters in "nvar", which are the wordnet part-of-speech tags (n = noun, v = verb, a = adjective, r = adverb). I added a docstring reflecting that.
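A small sketch of how a pos argument of this form could be validated before it reaches a query. `normalize_pos` is a hypothetical helper, not part of pytrips, and treating `None` as "all parts of speech" is an assumption taken from the enhancement request above rather than from the library.

```python
from typing import Optional

# WordNet part-of-speech tags, in the order used in this thread ("nvar").
WORDNET_POS = {"n": "noun", "v": "verb", "a": "adjective", "r": "adverb"}


def normalize_pos(pos: Optional[str] = None) -> str:
    """Validate a pos string and return it in canonical 'nvar' order.

    None is treated as "all parts of speech" (an assumption of this sketch).
    """
    if pos is None:
        return "nvar"
    if not pos:
        raise ValueError("pos must contain at least one of 'n', 'v', 'a', 'r'")
    unknown = sorted(set(pos) - set(WORDNET_POS))
    if unknown:
        raise ValueError("unknown part-of-speech tag(s): " + ", ".join(unknown))
    # Drop duplicates and put the tags in a fixed order.
    return "".join(tag for tag in "nvar" if tag in pos)


print(normalize_pos("vn"))
print(normalize_pos())
```

Failing early with a message that lists the accepted tags addresses the documentation gap raised at the start of the thread.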