amrisi / amr-guidelines

Segmenting utterances for speech AMR #252

Open luciaelizabeth opened 2 years ago

luciaelizabeth commented 2 years ago

Hello, is there a convention for segmenting continuous speech into utterances that can then be mapped onto AMRs? Relatedly, is there a consistent practice for how AMR treats fillers and disfluencies?

Some specific examples: if there is a pause of 2+ seconds in the middle of a "sentence", should it be split into two AMRs? (E.g., "put the block ... behind the other one.")

Or, if someone says something like "Put that there yes" (without a pause), where the final "yes" confirms that the action was performed correctly (and so serves a different function), should these be two different AMRs?

Thank you!