Closed by kajad 5 years ago
> some sentence-final tokens now lose the `SpaceAfter=No` info in MISC, e.g. ssj2.2.11 (@TomazErjavec, is this expected?)

Yes, this is the change introduced in 65d7d34. (Arguably) no sentence should end with `SpaceAfter=No`.

> If this is still problematic on @TomazErjavec's side,

Not problematic, both scripts work as expected.
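For readers unfamiliar with the behaviour discussed in this reply, here is a minimal sketch of dropping `SpaceAfter=No` from the MISC column of each sentence's last token in a CoNLL-U file. It is an illustration only, not the code from 65d7d34: the function name `strip_final_space_after` and the sample sentence (`demo.1`) are hypothetical, and the sketch assumes well-formed 10-column token lines.

```python
def strip_final_space_after(conllu_text: str) -> str:
    """Remove SpaceAfter=No from MISC on the last token of every sentence.

    Hypothetical helper for illustration; assumes sentences are separated
    by blank lines and token lines have exactly 10 tab-separated columns.
    """
    sentences = []
    for block in conllu_text.strip().split("\n\n"):
        lines = block.split("\n")
        # Token lines are the non-empty lines that are not "#" comments.
        token_idxs = [i for i, line in enumerate(lines)
                      if line and not line.startswith("#")]
        if token_idxs:
            last = token_idxs[-1]
            cols = lines[last].split("\t")
            if len(cols) == 10:
                # MISC is the 10th column; its attributes are "|"-separated.
                misc = [m for m in cols[9].split("|") if m != "SpaceAfter=No"]
                cols[9] = "|".join(misc) if misc else "_"
                lines[last] = "\t".join(cols)
        sentences.append("\n".join(lines))
    # Every CoNLL-U sentence is terminated by a blank line.
    return "\n\n".join(sentences) + "\n\n"


# Made-up two-token sentence; both tokens start with SpaceAfter=No.
sample = (
    "# sent_id = demo.1\n"
    "# text = Da.\n"
    "1\tDa\tda\tPART\t_\t_\t0\troot\t_\tSpaceAfter=No\n"
    "2\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\tSpaceAfter=No\n"
    "\n"
)
cleaned = strip_final_space_after(sample)
```

In this sketch only the sentence-final token is touched; `SpaceAfter=No` on sentence-internal tokens is left as it was.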
I have now updated the `convert_dependencies.py` script in 1c3b305, so as to conform to the new morphological input. The resulting treebank (UDv2.4) is identical to UDv2.3, except for:

- `Polarity=Neg` is attributed to all 'ne' particles (this is expected)
- `DET`, e.g. 'obilo' (this is expected)
- `# text` comment lines involving different types of punctuation have been corrected, e.g. ssj27.156.638 (this is expected)
- some sentence-final tokens now lose the `SpaceAfter=No` info in MISC, e.g. ssj2.2.11 (@TomazErjavec, is this expected?)

I have also re-introduced the `encoding=utf8` declarations for writing and reading, as their absence inhibited testing on the Windows command line. The script also works on Linux (tantra). If this is still problematic on @TomazErjavec's side, we need to find a universally acceptable solution.
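As a rough illustration of the encoding point: when `open()` is called without an explicit `encoding`, Python falls back to a platform-dependent default, which on Windows is typically a legacy code page rather than UTF-8. The sketch below shows explicit declarations for both writing and reading. The helper names `write_text` and `read_text` and the sample line are hypothetical and are not taken from `convert_dependencies.py`.

```python
import os
import tempfile


def write_text(path: str, text: str) -> None:
    # Explicit encoding, so the output does not depend on the OS locale.
    with open(path, "w", encoding="utf8") as handle:
        handle.write(text)


def read_text(path: str) -> str:
    # Read with the same explicit encoding that was used for writing.
    with open(path, "r", encoding="utf8") as handle:
        return handle.read()


# Made-up line with Slovenian characters that a legacy code page may mangle.
sample_line = "1\tčaša\tčaša\tNOUN\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.conllu")
    write_text(demo_path, sample_line)
    round_trip = read_text(demo_path)
```

Passing the same encoding on both sides keeps the round trip lossless regardless of whether the script runs on Windows or Linux.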