zihaohe123 / speak-turn-emb-dialog-act-clf


Mapping from label to SWDA Acts #3

Closed kstats closed 2 years ago

kstats commented 2 years ago

Hi - super cool work and thank you for releasing your code!

I was wondering if I could get some clarification on how the speech acts were converted from the original SWDA tags to integer labels (i.e., the table in Section 3.1 here: https://compprag.christopherpotts.net/swda.html). The integer labels don't seem to align with the ordering of the original SWDA table.

While I can extrapolate the larger classes by frequency in the dataset, an explicit mapping of this would be super useful, especially for the less-common classes! Thank you!

zihaohe123 commented 2 years ago

Hi -- Thanks for your interest. Actually, I wasn't paying attention to the ordering when converting the acts to integer labels; I just used a random ordering. Can you explain why this matters? It seems to me that the ordering doesn't matter when training the model.

This is the mapping I used: {'qw^d': 0, '^2': 1, 'b^m': 2, 'qy^d': 3, '^h': 4, 'bk': 5, 'b': 6, 'fa': 7, 'sd': 8, 'fo_o_fw_"_by_bc': 9, 'ad': 10, 'ba': 11, 'ng': 12, 't1': 13, 'bd': 14, 'qh': 15, 'br': 16, 'qo': 17, 'nn': 18, 'arp_nd': 19, 'fp': 20, 'aap_am': 21, 'oo_co_cc': 22, 'h': 23, 'qrr': 24, 'na': 25, 'x': 26, 'bh': 27, 'fc': 28, 'aa': 29, 't3': 30, 'no': 31, '%': 32, '^g': 33, 'qy': 34, 'sv': 35, 'ft': 36, '^q': 37, 'bf': 38, 'qw': 39, 'ny': 40, 'ar': 41, '+': 42}

kstats commented 2 years ago

Thanks so much for the quick response! I'm interested in running prediction with the SWDA model on my own dataset. The explicit label mapping will help me interpret the results!
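For anyone with the same use case, here is a minimal sketch of how the mapping in this thread could be inverted to turn predicted integer labels back into SWDA tags. The names `TAG_TO_ID`, `ID_TO_TAG`, and `decode` are hypothetical and not part of the repository, and only a handful of entries from the posted dictionary are copied here to keep the example short; the full 43-entry dictionary would be used in practice.

```python
# Excerpt of the tag-to-id mapping posted in this thread (assumption: the
# remaining entries would be pasted in the same way).
TAG_TO_ID = {"b": 6, "sd": 8, "%": 32, "qy": 34, "sv": 35, "+": 42}

# Invert the mapping so that integer predictions can be looked up by id.
ID_TO_TAG = {idx: tag for tag, idx in TAG_TO_ID.items()}


def decode(pred_ids):
    """Map predicted integer labels to SWDA tags; unknown ids get a placeholder."""
    return [ID_TO_TAG.get(i, "<unknown>") for i in pred_ids]


if __name__ == "__main__":
    # Hypothetical model output for four utterances.
    print(decode([8, 6, 34, 35]))
```

The placeholder for unknown ids is a design choice for this sketch: it makes a missing entry visible in the output instead of raising a `KeyError` halfway through a batch.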