nytud / hunlp-GATE

Lang_Hungarian - a GATE plugin containing Hungarian NLP tools as GATE processing resources
GNU General Public License v3.0

Installation problems #5

Open DavidNemeskey opened 7 years ago

DavidNemeskey commented 7 years ago

I tried to follow Method 2 (for developers), and ran into the following problems:

  1. I tried to build the plugin (step 1, optional), but the build failed. It only worked after I ran ./complete.sh, as instructed in step 2.
  2. After that, the plugin compiled, but with warnings:
Buildfile: /home/david/Research/hunlp-GATE/Lang_Hungarian/build.xml

init:
    [mkdir] Created dir: /home/david/Research/hunlp-GATE/Lang_Hungarian/bin

compile:
    [javac] Compiling 44 source files to /home/david/Research/hunlp-GATE/Lang_Hungarian/bin
    [javac] warning: [path] bad path element "/home/david/Research/hunlp-GATE/Lang_Hungarian/resources/magyarlanc/lib/anna-3.3.jar": no such file or directory
    [javac] warning: [path] bad path element "/home/david/Research/hunlp-GATE/Lang_Hungarian/resources/magyarlanc/lib/commons-io-2.4.jar": no such file or directory
    [javac] warning: [path] bad path element "/home/david/Research/hunlp-GATE/Lang_Hungarian/resources/magyarlanc/lib/CommonUtil.jar": no such file or directory
    [javac] warning: [path] bad path element "/home/david/Research/hunlp-GATE/Lang_Hungarian/resources/magyarlanc/lib/icu4j_3_6_1.jar": no such file or directory
    [javac] warning: [path] bad path element "/home/david/Research/hunlp-GATE/Lang_Hungarian/resources/magyarlanc/lib/log4j-1.2.9.jar": no such file or directory
    [javac] warning: [path] bad path element "/home/david/Research/hunlp-GATE/Lang_Hungarian/resources/magyarlanc/lib/magyarlanc-resource-2.0.jar": no such file or directory
    [javac] warning: [path] bad path element "/home/david/Research/hunlp-GATE/Lang_Hungarian/resources/magyarlanc/lib/mybatis-3.1.1.jar": no such file or directory
    [javac] warning: [path] bad path element "/home/david/Research/hunlp-GATE/Lang_Hungarian/resources/magyarlanc/lib/rfsa.jar": no such file or directory
    [javac] warning: [path] bad path element "/home/david/Research/hunlp-GATE/Lang_Hungarian/resources/magyarlanc/lib/stanford-corenlp-3.3.1.jar": no such file or directory
    [javac] warning: [path] bad path element "/home/david/Research/hunlp-GATE/Lang_Hungarian/resources/magyarlanc/lib/stanford-postagger-3.3.1.jar": no such file or directory
    [javac] warning: [path] bad path element "/home/david/Research/hunlp-GATE/Lang_Hungarian/resources/magyarlanc/lib/ThreadSafeMorphadorner.jar": no such file or directory
    [javac] warning: [path] bad path element "/home/david/Research/hunlp-GATE/Lang_Hungarian/resources/magyarlanc/lib/whatswrong-0.2.3-standalone.jar": no such file or directory
    [javac] 12 warnings

dist:
      [jar] Building jar: /home/david/Research/hunlp-GATE/Lang_Hungarian/hungarian.jar

BUILD SUCCESSFUL
  3. Then, when I run make GATE_HOME=/your/gate/installation/directory PIPELINE_INPUT=texts/peldak.xml pipeline, I get the following error:
57.41.180  is2.parser.Parser 240:readModel ->          Reading data finnished
57.41.181  is2.parser.Extractor 56:initStat ->         mult  (d4) 
java.lang.StringIndexOutOfBoundsException: String index out of range: 2
    at java.lang.String.charAt(String.java:658)
    at hu.u_szeged.pos.converter.MSDToCoNLLFeatures.parseA(MSDToCoNLLFeatures.java:228)
    at hu.u_szeged.pos.converter.MSDToCoNLLFeatures.convert(MSDToCoNLLFeatures.java:731)
    at hu.u_szeged.dep.parser.MateParserWrapper.parseSentence(MateParserWrapper.java:44)
    at hu.nytud.gate.parsers.MagyarlancDependencyParser.parseSentence(MagyarlancDependencyParser.java:144)
    at hu.nytud.gate.parsers.MagyarlancDependencyParser.execute(MagyarlancDependencyParser.java:94)
    at hu.nytud.gate.pipeline.Pipeline.runPRs(Pipeline.java:202)
    at hu.nytud.gate.pipeline.Pipeline.main(Pipeline.java:226)
  4. Running only with QunToken + HFST (the last two lines of the config), it runs, but I see nothing in the result that indicates that morphological analysis has taken place.
  5. Magyarlanc, on the other hand, does tag the text with MSD tags, which is NOT the tagset used on e-magyar.hu, or indeed the tagset that should be used (this could be a separate issue, I think).
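
The 12 "bad path element" warnings in the build log all name jars under resources/magyarlanc/lib that javac could not find. A minimal sketch of a check that could list them before building, assuming the jar names are exactly those printed in the log and that it is run from the repository root; the class and method names (MissingJarCheck, missing) are made up for illustration and are not part of the plugin:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MissingJarCheck {
    // Jar names copied from the javac warnings in the build log.
    static final List<String> EXPECTED = Arrays.asList(
        "anna-3.3.jar", "commons-io-2.4.jar", "CommonUtil.jar",
        "icu4j_3_6_1.jar", "log4j-1.2.9.jar", "magyarlanc-resource-2.0.jar",
        "mybatis-3.1.1.jar", "rfsa.jar", "stanford-corenlp-3.3.1.jar",
        "stanford-postagger-3.3.1.jar", "ThreadSafeMorphadorner.jar",
        "whatswrong-0.2.3-standalone.jar");

    // Returns the names from jarNames that are not regular files in libDir.
    public static List<String> missing(Path libDir, List<String> jarNames) {
        List<String> result = new ArrayList<>();
        for (String name : jarNames) {
            if (!Files.isRegularFile(libDir.resolve(name))) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumption: default path is relative to the repository root.
        Path libDir = Paths.get(args.length > 0
            ? args[0] : "Lang_Hungarian/resources/magyarlanc/lib");
        for (String name : missing(libDir, EXPECTED)) {
            System.out.println("missing: " + libDir.resolve(name));
        }
    }
}
```

An empty output would mean all jars listed in the warnings are present in that directory.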

I don't know if the problems above are in fact bugs, or stem from inaccurate documentation (I suppose a bit of both), but it would be nice to see them resolved.
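
The exception in the stack trace is thrown by String.charAt with index 2, so the string handed to MSDToCoNLLFeatures.parseA must have had fewer than three characters. What parseA does with that character is not visible from the trace, so the following is only a sketch of a defensive accessor, not the actual magyarlanc code. It assumes that a missing position can be treated like the '-' filler that MSD codes use for an unspecified attribute; MsdSafeAccess, attributeAt, and the sample code "Afp-sn" are hypothetical and for illustration only:

```java
public class MsdSafeAccess {
    // Assumption: '-' stands for "attribute not specified" in an MSD code.
    static final char UNSPECIFIED = '-';

    // Returns the character at position i of the MSD code, or UNSPECIFIED
    // when the code is null or too short, instead of throwing
    // StringIndexOutOfBoundsException as a bare charAt(i) would.
    public static char attributeAt(String msd, int i) {
        if (msd == null || i < 0 || i >= msd.length()) {
            return UNSPECIFIED;
        }
        return msd.charAt(i);
    }

    public static void main(String[] args) {
        System.out.println(attributeAt("Afp-sn", 2)); // prints p
        System.out.println(attributeAt("Af", 2));     // prints - (code too short)
    }
}
```

Whether silently padding a short code is the right behaviour, or whether the short tag itself points to a tagger problem upstream, would need to be decided by someone who knows the converter.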

sassbalint commented 7 years ago

Thanks for the thorough description. The issues will be examined.

sassbalint commented 7 years ago

For the second part I created a separate issue; see Issue #6 "Pipeline is not working".