spyysalo / annodoc

Annodoc annotation documentation support system
http://spyysalo.github.io/annodoc
MIT License

Preventing whitespace insertion after tokens #19

Open heacu opened 7 years ago

heacu commented 7 years ago

Is there any plan to support CoNLL-U's SpaceAfter=No attribute, which goes in the MISC column? I am writing annotation guidelines for Tibetan, which, like Chinese, does not put whitespace between words. Consider an example such as the following:

~~~ conllu
1       བཅོམ་ལྡན་འདས་     བཅོམ་ལྡན་འདས་     NOUN    n.count Number=Sing     _       _       _       SpaceAfter=No
2       ཤཱཀྱ་སེང་གེ་        ཤཱཀྱ་སེང་གེ་        PROPN   n.prop  _       _       _       _       SpaceAfter=No
3       ལ་      ལ་√case ADP     case.all        _       2       case    _       SpaceAfter=No
4       ཕྱག་     ཕྱག་     NOUN    n.count Number=Sing     _       _       _       SpaceAfter=No
5       འཚལ་    _       VERB    v.fut.v.pres    Tense=Fut,Pres  _       _       _       SpaceAfter=No|[འཚལ་√1][འཚལ་√2]
6       ལོ       འོ་√cv   PART    cv.fin  Mood=Ind        5       discourse       _       SpaceAfter=No
7       །       །       PUNCT   punc    _       5       punct   _       _
~~~

There should be no whitespace between words when rendering Tibetan. Unfortunately, Annodoc renders this example with a space after every token, which makes it look unusable for Tibetan.

Perhaps, though, I am missing an easy way to fix this with CSS?
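To make the request concrete, here is a minimal Python sketch of what honoring SpaceAfter=No would mean when rebuilding the surface text from CoNLL-U token lines. It is not Annodoc code: the function names `space_after` and `detokenize` are hypothetical, it assumes tab-separated ten-column lines, and it simply skips multiword-token ranges and empty nodes.

```python
def space_after(misc: str) -> bool:
    """Return False when the MISC field contains SpaceAfter=No."""
    if misc == "_":
        return True
    # MISC holds "|"-separated items; look for an exact SpaceAfter=No item.
    return "SpaceAfter=No" not in misc.split("|")


def detokenize(conllu: str) -> str:
    """Rebuild the surface text of one sentence from CoNLL-U token lines."""
    pieces = []
    for line in conllu.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comment lines carry no tokens
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # assumption: silently ignore malformed lines
        tok_id, form, misc = cols[0], cols[1], cols[9]
        if "-" in tok_id or "." in tok_id:
            continue  # simplification: skip multiword ranges and empty nodes
        pieces.append(form)
        if space_after(misc):
            pieces.append(" ")
    # Drop the space that may follow the final token.
    return "".join(pieces).rstrip()
```

With a MISC value of SpaceAfter=No on every token but the last, as in the example above, `detokenize` returns the forms joined without any whitespace.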

heacu commented 7 years ago

Fortunately, however, the .ann standoff format preserves the integrity of the text, so I can use that in Annodoc instead.
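For anyone taking the same route, the following Python sketch shows one way to derive the text and brat-style text-bound annotation lines from a token list so that no extra whitespace is introduced. The helper name `to_standoff` and the `(form, label, space_after)` tuple layout are assumptions made for illustration, and offsets are counted in Unicode code points, as Python strings do.

```python
def to_standoff(tokens):
    """Build (text, annotation_lines) from (form, label, space_after) tuples.

    Each annotation line follows the text-bound standoff layout:
    ID <TAB> LABEL START END <TAB> COVERED_TEXT
    """
    text_parts = []
    ann_lines = []
    offset = 0
    for i, (form, label, space_after) in enumerate(tokens, start=1):
        start, end = offset, offset + len(form)
        ann_lines.append(f"T{i}\t{label} {start} {end}\t{form}")
        text_parts.append(form)
        offset = end
        if space_after:
            # Only emit a space when the token does not suppress it.
            text_parts.append(" ")
            offset += 1
    # A trailing space after the last token is dropped from the text;
    # this does not affect any recorded offsets.
    return "".join(text_parts).rstrip(), ann_lines
```

Passing `space_after=False` for every Tibetan token yields a contiguous text whose offsets line up with the annotations.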