saulalbert / unixclan

Utility scripts for TalkBank's CLAN

CHAT2CAlite converts any 'ˌ' symbol to a normal period '.' #17

Closed by saulalbert 6 years ago

saulalbert commented 6 years ago

CHAT doesn't allow mid-TCU periods, because a period is used as the utterance terminator.

So we'll introduce a 'ˌ' character (U+02CC: MODIFIER LETTER LOW VERTICAL LINE) to use in place of mid-TCU periods '.'

So all 'ˌ' marks should be converted to '.' like so:

helloˌ how are you

should be converted into

hello. how are you
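A minimal sketch of that substitution in Python, for illustration only. This is not the actual CHAT2CAlite code, and the function name `convert_mid_tcu_periods` is a hypothetical one chosen for this example. It assumes each transcript line is already available as a Unicode string.

```python
# Hypothetical helper, not taken from the unixclan scripts.
LOW_VERTICAL_LINE = "\u02cc"  # 'ˌ' U+02CC MODIFIER LETTER LOW VERTICAL LINE


def convert_mid_tcu_periods(line: str) -> str:
    """Replace every 'ˌ' (U+02CC) in the line with a normal period '.'."""
    return line.replace(LOW_VERTICAL_LINE, ".")


if __name__ == "__main__":
    print(convert_mid_tcu_periods("helloˌ how are you"))
```

Because the replacement is keyed on the character itself, it does not need to know where in the turn the mark occurs.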

saulalbert commented 6 years ago

Again - rather than messing around with positions (i.e. checking whether we are at the end of the turn), we'll just use a new Unicode symbol.