gossebouma / lassy2ud

Lassy Small to Universal Dependencies Conversion
BSD 2-Clause "Simplified" License

Conversion fails on just a few LassyKlein sentences #4

Open JessedeDoes opened 3 years ago

JessedeDoes commented 3 years ago

(This may be an input data problem.)

I ran the script https://github.com/gossebouma/lassy2ud/blob/master/universal_dependencies.xq on all sentences of a LassyKlein version. It works very well, but it fails on a few sentences: problem_sentences.zip

Treebank/WR-P-E-I-0000020972/WR-P-E-I-0000020972.p.4.s.192.xml
Treebank/WR-P-E-I-0000020972/WR-P-E-I-0000020972.p.4.s.177.xml
Treebank/WR-P-E-I-0000020972/WR-P-E-I-0000020972.p.4.s.215.xml
Treebank/WR-P-E-I-0000020972/WR-P-E-I-0000020972.p.4.s.143.xml
Treebank/WR-P-E-I-0000020972/WR-P-E-I-0000020972.p.4.s.200.xml
Treebank/WR-P-E-I-0000020972/WR-P-E-I-0000020972.p.4.s.164.xml

Error on line 350 column 23 of universal_dependencies.xq:
  XQDY0025  Cannot create an element having two attributes with the same name: @begin
     invoked by function call at file:/home/jesse/workspace/lassy2ud/universal_dependencies.xq#353
     invoked by function call at file:/home/jesse/workspace/lassy2ud/universal_dependencies.xq#353
     invoked by function call at file:/home/jesse/workspace/lassy2ud/universal_dependencies.xq#353
     invoked by function call at file:/home/jesse/workspace/lassy2ud/universal_dependencies.xq#353
     invoked by function call at file:/home/jesse/workspace/lassy2ud/universal_dependencies.xq#1563
     invoked by unknown caller (class net.sf.saxon.value.MemoClosure)
     invoked by function call at file:/home/jesse/workspace/lassy2ud/universal_dependencies.xq#1579
Query failed with dynamic error: Cannot create an element having two attributes with the same name: @begin
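
For context, XQDY0025 is the XQuery dynamic error raised when an element constructor receives two attribute nodes with the same name. The snippet below is a minimal sketch of how that can happen; the `$node` element and its attribute values are hypothetical and are not taken from the treebank or from `universal_dependencies.xq`, so this illustrates the error class rather than the actual cause at line 350.

```xquery
xquery version "3.1";

(: Hypothetical input node, made up for illustration only. :)
let $node := <node begin="0" end="1" rel="hd"/>
return (
  (: Fails with XQDY0025: all attributes are copied, including @begin,
     and then a second @begin is added to the same element. :)
  (: element node { $node/@*, attribute begin { "5" } }, :)

  (: One way to avoid the clash: leave out the existing @begin
     before adding the new one. :)
  element node {
    $node/(@* except @begin),
    attribute begin { "5" }
  }
)
```

With the commented-out constructor enabled, a processor such as Saxon should stop with the same "two attributes with the same name" message as in the trace above.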

gossebouma commented 3 years ago

These sentences look familiar! I committed a newer version of the conversion script in which the problem is fixed. PS: We switched to a Go implementation for the conversion, see https://github.com/rug-compling/alud, so this package is no longer actively maintained.