AnantLabs / link-parser

Automatically exported from code.google.com/p/link-parser

Version difference between AbiWord and Carnegie Mellon University. #17

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I am using Link Grammar 4.3.3 as released by AbiWord, and found that it
produces problematic output for some sentences. I checked these sentences
with version 4.1b released by CMU and found no problems. Looking into the
source, I found further big differences. For example, to use the AbiWord
APIs you need the file "link-includes.h", but for the CMU one it's
"api.h". The specific functions exposed in the lib are also quite
different.
FYI, try the sentence below and compare it with CMU's online demo
or their API. The AbiWord parser cannot correctly identify the 'O' link
between 'gives' and 'book'.

Tom gives the university boy a business book.

Original issue reported on code.google.com by ni...@it.uts.edu.au on 5 Mar 2008 at 12:50

GoogleCodeExporter commented 9 years ago
Yes, sorry. I introduced a regression in the dictionary, somewhere around 
the 4.3.1 time frame, having to do with the capitalization of words at 
the start of sentences (such as "Tom" in the above sentence). 

This is now fixed in the svn source tree, and will be in version 4.3.4.
Would you be able to test the version from svn?

Original comment by linasvep...@gmail.com on 5 Mar 2008 at 2:25

GoogleCodeExporter commented 9 years ago
Thanks for the reply, but I don't know what 'svn' stands for.

Original comment by ni...@it.uts.edu.au on 6 Mar 2008 at 5:22

GoogleCodeExporter commented 9 years ago
svn is "subversion", the source code repository where the code is located.

Rather than having you learn svn, perhaps it would be easier to try
the patch below. You can apply it with the "patch" command, if you 
are familiar with that, or it can even be applied manually: the lines
with "+" in front of them are to be added, and those with "-" are to 
be removed.  The changes are to be made a few lines below line 69 in
the file 4.0.dict. Be sure you keep around an old copy of the file,
in case the patch doesn't work.

Let me know if this fixes all of your problems, or whether there
are others.

Index: 4.0.dict
===================================================================
--- 4.0.dict    (revision 23006)
+++ 4.0.dict    (working copy)
@@ -69,10 +69,18 @@
 <noun-sub-p>: {@M+} & {R+ & Bp+ & {[[@M+]]}} & {@MXp+};

 % Just pure singular entities, no mass nouns 
-<entity-singular>:
+% Note that "CAPITALIZED-WORDS" has special meaning within the link parser.
+% These are the words that can occur at the start of the sentence, and be
+% capitalized ...
+CAPITALIZED-WORDS NAME <entity-singular>:
 ({G-} & {[MG+]} & (({DG- or [[GN-]] or [[{@A-} & {D-}]]} & 
 (({@MX+} & (JG- or <noun-main-s>)) or YS+)) or AN+ or G+));

+% Plural of words that can appear at the start of a sentence.
+PL-CAPITALIZED-WORDS: 
+{G-} & {[MG+]} & (({DG- or [[GN-]] or [[{@A-} & {Dmc-}]]} & 
+(({@MX+} & (JG- or <noun-main-x>)) or YS+ or YP+)) or AN+ or G+); 
+
 % capitalized words ending in s
 % -- hmm .. proper names not used anywhere right now, has slot for plural ... !!??
 <proper-names>:

Original comment by linasvep...@gmail.com on 6 Mar 2008 at 6:13
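
For readers applying the change by hand, the "+" and "-" rule described above can be sketched in a few lines of Python. This is only an illustration of how one unified-diff hunk is read, not a replacement for the "patch" command; the function name `apply_hunk` and the shortened example lines are assumptions made for this sketch.

```python
def apply_hunk(original, hunk_body):
    """Apply one unified-diff hunk body to the old lines it covers.

    `original` is the list of old lines spanned by the hunk, and
    `hunk_body` is the list of hunk lines that follow the "@@" header.
    Returns the resulting new lines. (Hypothetical helper for illustration.)
    """
    result = []
    pos = 0  # index of the next unconsumed line in `original`
    for line in hunk_body:
        tag, text = line[:1], line[1:]
        if tag == "+":
            # Added line: emit it, consume nothing from the old file.
            result.append(text)
        elif tag == "-":
            # Removed line: it must match the old file, and is dropped.
            if original[pos] != text:
                raise ValueError(f"removal mismatch at old line {pos + 1}")
            pos += 1
        else:
            # Context line: must match the old file, and is kept as-is.
            if original[pos] != text:
                raise ValueError(f"context mismatch at old line {pos + 1}")
            result.append(original[pos])
            pos += 1
    if pos != len(original):
        raise ValueError("hunk did not cover all of the old lines")
    return result


# Shortened excerpt of the hunk above (trailing whitespace omitted).
old_lines = [
    "% Just pure singular entities, no mass nouns",
    "<entity-singular>:",
]
hunk = [
    " % Just pure singular entities, no mass nouns",
    "-<entity-singular>:",
    "+CAPITALIZED-WORDS NAME <entity-singular>:",
]
new_lines = apply_hunk(old_lines, hunk)
print("\n".join(new_lines))
```

The mismatch checks are the reason to keep a backup: if the old file does not contain the expected context or "-" lines, the hunk does not belong at that position.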

GoogleCodeExporter commented 9 years ago
Hi linasvepstas,
I tried the patch, but I couldn't make it work; my program just crashed with it.
Rechecking the problem, I found that the problematic sentence is actually the one below:

"Tom gives the black university boy a business book."

It seems that if the indirect object noun (boy) and the direct object noun
(book) have prenoun adjective modifiers (black) as well as prenoun noun
modifiers (university and business), the parser is not able to recognize the
direct object.

Regards
Li

    +---------------------------------Xp---------------------------------+
    |            +---------------Os--------------+                       |
    |            |     +------------Ds-----------+                       |
    |            |     |     +---------A---------+   +-------Ds------+   |
    +--Wd--+--Ss-+     |     |          +---AN---+   |      +---AN---+   |
    |      |     |     |     |          |        |   |      |        |   |
LEFT-WALL Tom gives.v the black.a university.n boy.n a business.n book.n . 

(S (S (NP Tom)
      (VP gives
          (NP the black university boy)))
   a business book .)

Original comment by ni...@it.uts.edu.au on 6 Mar 2008 at 9:19
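
The linkage diagram in the comment above can be checked mechanically. The sketch below transcribes its links by hand into (label, left word, right word) tuples and tests two things: whether any O-type link joins "gives.v" and "book.n", and whether "a business book" is connected to the rest of the sentence at all. The tuple representation and the helper names `has_link` and `reachable` are assumptions made for this illustration; they are not part of the Link Grammar API.

```python
# Links transcribed by hand from the diagram: (label, left word, right word).
LINKS = [
    ("Xp", "LEFT-WALL", "."),
    ("Wd", "LEFT-WALL", "Tom"),
    ("Ss", "Tom", "gives.v"),
    ("Os", "gives.v", "boy.n"),
    ("Ds", "the", "boy.n"),
    ("A", "black.a", "boy.n"),
    ("AN", "university.n", "boy.n"),
    ("Ds", "a", "book.n"),
    ("AN", "business.n", "book.n"),
]


def has_link(links, prefix, left, right):
    """True if a link whose label starts with `prefix` joins left and right."""
    return any(
        label.startswith(prefix) and l == left and r == right
        for label, l, r in links
    )


def reachable(links, start):
    """Set of words connected to `start` through any chain of links."""
    neighbours = {}
    for _, l, r in links:
        neighbours.setdefault(l, set()).add(r)
        neighbours.setdefault(r, set()).add(l)
    seen, stack = {start}, [start]
    while stack:
        for nxt in neighbours.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


connected = reachable(LINKS, "LEFT-WALL")
print("O link gives-book:", has_link(LINKS, "O", "gives.v", "book.n"))
print("O link gives-boy: ", has_link(LINKS, "O", "gives.v", "boy.n"))
print("book.n attached:  ", "book.n" in connected)
```

Under this transcription the only O-type link runs from "gives.v" to "boy.n", and none of the words in "a business book" is reachable from LEFT-WALL, which matches the constituent tree leaving that phrase outside the VP.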

GoogleCodeExporter commented 9 years ago
That's the same parse that I get; it works fine for me.

If your program crashes with the patch, but not without it, 
please double check the patch. Did you remember to remove 
the +'s in front? Did you delete the line with the - in front?

Do you know how to use IRC? We might be able to discuss this
more quickly on #opencog at freenode.net

Original comment by linasvep...@gmail.com on 6 Mar 2008 at 9:26

GoogleCodeExporter commented 9 years ago
Sorry, I am not familiar with IRC. Is it a chatroom?
I re-patched the dict and finally it worked, but the parse is still the same.
FYI, the patched dict is attached. I am not quite sure whether what I have done
is correct.

Thanks.
Li

Original comment by ni...@it.uts.edu.au on 6 Mar 2008 at 9:44

GoogleCodeExporter commented 9 years ago
IRC is "Internet Relay Chat"; it's a very old chat system.

The attached dict file looks good to me.  However, I am now no longer
able to reproduce your original failure, with or without the patch.
(I thought I had reproduced your original failure, but now I am not 
sure).

Can you verify that you don't have some alternate dictionary installed
in some other place, e.g. /usr/share/link-grammar vs. 
/usr/local/share/link-grammar ?  You should probably verify that 
you are not accidentally linking to the wrong shared library, etc.

I am unclear on how to continue debugging at this point, as I can't
reproduce the problem.

Original comment by linasvep...@gmail.com on 6 Mar 2008 at 10:59
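
One way to check for a stray second dictionary, as suggested in the comment above, is to search both install prefixes for copies of the dictionary file. The sketch below does only that; the two directories and the file name 4.0.dict come from this thread, while the function name `find_dictionaries` and the recursive search (in case the file sits in a subdirectory) are assumptions. It does not check which shared library the program links against.

```python
from pathlib import Path

# Install prefixes mentioned in the thread; adjust for your system.
DEFAULT_ROOTS = ("/usr/share/link-grammar", "/usr/local/share/link-grammar")


def find_dictionaries(roots=DEFAULT_ROOTS, name="4.0.dict"):
    """Return every file called `name` found under the given directories."""
    found = []
    for root in roots:
        base = Path(root)
        if base.is_dir():
            # rglob searches the directory and all of its subdirectories.
            found.extend(sorted(base.rglob(name)))
    return found
```

Calling `find_dictionaries()` and printing the result lists each copy; more than one entry suggests that the patched file may not be the one the parser actually loads.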

GoogleCodeExporter commented 9 years ago
Maybe I need to clarify the problem. Initially I said the first sentence caused
the problem:

  (1) "Tom gives the university boy a business book."
  (2) "Tom gives the black university boy a business book."

Unfortunately, I made a mistake: both parsers work well with the first
sentence. It's the second one that makes the difference.
I also tried the API from CMU; it also cannot produce the right parse, although
their online demo can (http://www.link.cs.cmu.edu/link/submit-sentence-4.html).
Sorry about my mistake.

Thanks for your kind reply, I'll re-check all files accordingly.

Original comment by ni...@it.uts.edu.au on 6 Mar 2008 at 11:23

GoogleCodeExporter commented 9 years ago
Both sentences work for me.

Original comment by linasvep...@gmail.com on 6 Mar 2008 at 11:35

GoogleCodeExporter commented 9 years ago
Marking as "fixed".

Original comment by linasvep...@gmail.com on 14 Apr 2008 at 4:55