masashi-y / depccg

A* CCG Parser with a Supertag and Dependency Factored Model
MIT License

IndexError is thrown when using --input-format partial #11

Closed kovvalsky closed 4 years ago

kovvalsky commented 5 years ago

I installed the latest version of depccg. With the prolog output format, the derivation is printed as expected (FYI, I am running depccg from the build directory rather than installing it):

$ echo "この T シャツ" | ext/depccg_portable/build/scripts-3.6/depccg_ja --silent -f prolog --pre-tokenized

:- op(601, xfx, (/)).
:- op(601, xfx, (\)).
:- multifile ccg/2, id/2.
:- discontiguous ccg/2, id/2.

ccg(1,
 fa(np:nc,
  t((np:X1/np:X1), 'XX', 'XX', 'XX/XX/XX/XX', 'XX', 'XX'),
  fa(np:nc,
   t((np:X1/np:X1), 'XX', 'XX', 'XX/XX/XX/XX', 'XX', 'XX'),
   t(np:nc, 'XX', 'XX', 'XX/XX/XX/XX', 'XX', 'XX')))).

But the same command with --input-format partial throws the following error:

$ echo "この| T| シャツ|" | ext/depccg_portable/build/scripts-3.6/depccg_ja --silent -f prolog --input-format partial --pre-tokenized

Traceback (most recent call last):
  File "/net/gsb/lib/python3.6/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/net/gsb/lib/python3.6/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/net/gsb/depccg_portable/build/lib.linux-x86_64-3.6/depccg/__main__.py", line 262, in <module>
    args.func(args)
  File "/net/gsb/depccg_portable/build/lib.linux-x86_64-3.6/depccg/__main__.py", line 115, in main
    constraints=constraints)
  File "parser.pyx", line 253, in depccg.parser.EnglishCCGParser.parse_doc
  File "/net/gsb/depccg_portable/build/lib.linux-x86_64-3.6/depccg/ja_lstm_parser_bi.py", line 114, in predict_doc
    res.extend(self._predict(doc[i:i + batchsize]))
  File "/net/gsb/depccg_portable/build/lib.linux-x86_64-3.6/depccg/ja_lstm_parser_bi.py", line 99, in _predict
    xs = [self.extractor.process(x, self.xp) for x in xs]
  File "/net/gsb/depccg_portable/build/lib.linux-x86_64-3.6/depccg/ja_lstm_parser_bi.py", line 99, in <listcomp>
    xs = [self.extractor.process(x, self.xp) for x in xs]
  File "/net/gsb/depccg_portable/build/lib.linux-x86_64-3.6/depccg/ja_lstm_parser_bi.py", line 39, in process
    c[0, 0] = self.start_char
IndexError: index 0 is out of bounds for axis 1 with size 0
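
The message says that the array `c` has zero columns (axis 1 has size 0) at the moment `c[0, 0]` is assigned. The following is a minimal NumPy sketch that reproduces the same message in isolation. It is not depccg's code: `char_matrix`, `START_CHAR`, and the `max_len` parameter are hypothetical names introduced only for illustration, and the sketch makes no claim about why the array ends up empty when the partial input format is used.

```python
import numpy as np

START_CHAR = 1  # hypothetical id for a start-of-word marker


def char_matrix(num_words, max_len):
    """Hypothetical stand-in for a character-feature extractor.

    Allocates a (num_words, max_len) matrix and writes a start marker
    into its first cell, as the failing line in the traceback does.
    """
    c = np.zeros((num_words, max_len), dtype=np.int32)
    c[0, 0] = START_CHAR  # fails when max_len == 0
    return c


# With at least one column, the assignment succeeds.
ok = char_matrix(num_words=3, max_len=4)

# With zero columns, NumPy raises the IndexError seen in the traceback.
try:
    char_matrix(num_words=3, max_len=0)
except IndexError as err:
    message = str(err)
    print(message)
```

If this is what happens inside `process`, checking the shape of `c` (or the token lengths it is derived from) just before line 39 of `ja_lstm_parser_bi.py` should show whether the tokens reach the extractor as empty strings.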