saulalbert / unixclan

Utility scripts for TalkBank's CLAN

Convert CHAT <doubtful material> [?] to CA (doubtful material) #12

Closed: saulalbert closed this issue 6 years ago

saulalbert commented 6 years ago

In CA, when a transcriber can't hear something clearly, they use single parentheses:

(doubtful material)

In CHAT, they use the following syntax:

<doubtful material> [?]

CHAT2CAlite should convert <doubtful material> [?] into (doubtful material).

mumair01 commented 6 years ago

The original converter addresses this issue to an extent. Firstly, as far as I can tell, the CHAT format does not have < text > [?]; instead, it has examples of text [?]. Secondly, when the converter encounters this marker, it wraps only the LAST word before [?] in parentheses, producing ( last word ). Do you think this functionality needs to be modified in some way?

saulalbert commented 6 years ago

Yes. There are two ways to do this in CHAT. One targets only the word just before the [?]. The other, according to the CHAT manual (https://talkbank.org/manuals/CHAT.pdf, section 10.3), encloses the text before the [?] in <angled brackets>, and everything inside those brackets ought to go into the (parentheses).
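
To make the two cases concrete, here is a rough Python sketch of the conversion. It is not the CHAT2CAlite code: the function name `convert_doubtful` and both regular expressions are hypothetical, and it assumes the scoped text contains no nested angle brackets and that a single "word" is any run of non-whitespace characters.

```python
import re

# Scoped form: <several words> [?]  ->  (several words)
# Assumption: no nested angle brackets inside the scope.
SCOPED = re.compile(r"<([^<>]+)>\s*\[\?\]")

# Single-word form: word [?]  ->  (word)
# Assumption: a "word" is any run of non-whitespace characters.
SINGLE = re.compile(r"(\S+)\s*\[\?\]")


def convert_doubtful(line: str) -> str:
    """Rewrite CHAT [?] markers on one line as CA single parentheses."""
    # Handle the scoped form first so the single-word rule never sees
    # the closing angle bracket of a scope.
    line = SCOPED.sub(lambda m: f"({m.group(1).strip()})", line)
    # Whatever [?] markers remain apply only to the preceding word.
    line = SINGLE.sub(r"(\1)", line)
    return line


if __name__ == "__main__":
    print(convert_doubtful("*A: I saw <the big dog> [?] yesterday ."))
    print(convert_doubtful("*A: I saw the dog [?] yesterday ."))
```

Running the scoped rule before the single-word rule keeps the two cases from interfering; angle-bracket scopes followed by other markers (for example [/]) are left untouched by this sketch.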