Closed by sb-b 6 years ago
Hi! This is CoNLL-U format; the parser only handles CoNLL format. Please see the Universal Dependencies scripts.
Miguel
Hi,
I couldn't find an appropriate script for converting CoNLL-U files to CoNLL files. I would be grateful if you could suggest a script for this task.
Thanks,
Betul
It worked, thank you!
On Wed, Feb 14, 2018 at 8:57 PM, Miguel Ballesteros wrote:
I believe this is the one: https://github.com/UniversalDependencies/tools/blob/f21108176ff431ebbab4c9414d6e0345e62d3995/conllu_to_conllx.pl
Hi,
I am trying to train this parser on the Turkish UD Treebank. When I run this command:
java -jar ParserOracleArcStdWithSwap.jar -t -1 -l 1 -c training.conll > trainingOracle.txt
I got the following error:
The CoNLL-U parse on which the LSTM parser gives the error is the one below:
The word 'parçacıklarsa' is a multiword token, so it is numbered '2-3'. Does the LSTM parser have a mechanism for dealing with multiword tokens? How can I solve this issue?
Thanks,
Betul
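For readers who hit the same error, here is a minimal Python sketch of the kind of preprocessing that gets round the multiword-token problem. It is not a reimplementation of conllu_to_conllx.pl, and it makes its own assumptions about what the parser's CoNLL input needs: no comment lines, no range IDs such as '2-3', no empty-node IDs such as '8.1', and an underscore in the last two columns. The function names and the file paths in the usage note are hypothetical.

```python
def conllu_to_conllx_lines(lines):
    """Yield CoNLL-X-style lines from an iterable of CoNLL-U lines.

    Assumption (not taken from the thread): dropping comments, multiword
    token ranges and empty nodes, and blanking columns 9 and 10, is enough
    for a parser that expects the older CoNLL format.
    """
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            continue  # sentence-level comment, e.g. "# text = ..."
        if not line.strip():
            yield ""  # blank line separates sentences; keep it
            continue
        cols = line.split("\t")
        token_id = cols[0]
        # A range ID ("2-3") marks a multiword token; a decimal ID ("8.1")
        # marks an empty node. Neither exists in CoNLL-X, so skip both.
        if "-" in token_id or "." in token_id:
            continue
        if len(cols) == 10:
            # DEPS and MISC have no CoNLL-X counterpart; blank them out.
            cols[8] = "_"
            cols[9] = "_"
        yield "\t".join(cols)


def convert_file(src_path, dst_path):
    """Convert a CoNLL-U file to a CoNLL-X-style file (hypothetical helper)."""
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for out_line in conllu_to_conllx_lines(src):
            dst.write(out_line + "\n")
```

Calling convert_file("training.conllu", "training.conll") would write a file in which only the syntactic words (here, the rows numbered 2 and 3) remain, and the surface form on the '2-3' row is discarded. For real data, the Universal Dependencies script linked above is the safer choice.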