RowitZou / topic-dialog-summ

AAAI-2021 paper: Topic-Oriented Spoken Dialogue Summarization for Customer Service with Saliency-Aware Topic Modeling.
MIT License

output issue #29

Closed wac81 closed 2 years ago

wac81 commented 2 years ago

I have a question about this output:

ID : 6
ORIGIN :
(0) 【客服】 一 楼 楼 层 较 低 , 东 户 的 采 光 被 旁 边 的 巨 鼎 峰 集 团 挡 了 一 些 。
(1) 【客服】 嗯 。
(2) 【客服】 呃 那 个 房 子 是 八 十 七 。
(3) 【客户】 八 十 七 户 型 是 啥 样 的 ?
GOLD : 客 户 询 问 一 楼 的 面 积 , 销 售 回 答 八 十 七 平 。
DOC_GEN : 客 户 询 问 八 十 七 平 户 型 的 价 格 , 销 售 回 复 八 十 九 万
DOC_GEN bleu & rouge-f 1/2/l: 0.3309 & 0.6842/0.4737/0.5789
EXT_GOLD: [0,2]
EXT_PRED: [0,3]
EXT_SCORE P/R/F1: 0.5000/0.5000/0.5000

Three questions:

  1. Does doc_gen only "see" these 4 sentences (0, 1, 2, 3)?
  2. Does ext_pred extract only from these 4 sentences?
  3. EXT_PRED seems useless. What is it for?
RowitZou commented 2 years ago

It is a two-stage process. The first stage extracts the important sentences from the original dialogue. The second stage generates a summary from the sentences extracted in the first stage.

  1. doc_gen can only see sentences 0 and 3, which are the ones extracted by the first stage.
  2. Yes.
  3. ext_pred lists the sentences that are fed into the second stage.
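
To make the answers above concrete, here is a minimal sketch of how the extracted indices gate what the generator sees, and of how an EXT_SCORE line like the one in the log could be computed. It is not taken from the repository: the function names `select_for_generation` and `extraction_prf` are hypothetical, the sentence strings are placeholders, and the set-overlap definition of P/R/F1 is an assumption that happens to agree with the 0.5000/0.5000/0.5000 reported for EXT_GOLD [0,2] and EXT_PRED [0,3].

```python
def select_for_generation(sentences, ext_pred):
    """Return only the sentences whose indices were picked by stage one.

    Hypothetical helper: stage two (the generator) would receive this list
    instead of the full dialogue.
    """
    return [sentences[i] for i in ext_pred]


def extraction_prf(ext_gold, ext_pred):
    """Set-based precision, recall and F1 between two lists of sentence indices."""
    gold, pred = set(ext_gold), set(ext_pred)
    hits = len(gold & pred)
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Placeholder dialogue with four utterances, indexed like the log above.
dialogue = ["utterance 0", "utterance 1", "utterance 2", "utterance 3"]

stage_two_input = select_for_generation(dialogue, [0, 3])
print(stage_two_input)  # ['utterance 0', 'utterance 3']

p, r, f1 = extraction_prf([0, 2], [0, 3])
print(f"{p:.4f}/{r:.4f}/{f1:.4f}")  # 0.5000/0.5000/0.5000
```

Under this reading, only index 0 is shared between the gold and predicted sets, so one of two predictions is correct and one of two gold sentences is recovered.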
wac81 commented 2 years ago

EXT_GOLD: [0,2]. How are the ext_gold sentences extracted? I know preprocessing can extract them, but which method does it use?

RowitZou commented 2 years ago

See def greedy_selection(...) in src/prepro/data_builder.py.
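
The body of that function is not shown in this thread, so the following is only a sketch of the general greedy oracle-labeling idea that the name suggests: repeatedly add the dialogue sentence that most improves overlap with the reference summary, and stop when nothing improves it. The names `unigram_f1` and `greedy_select`, the unigram-F1 scoring, and the toy tokens are all assumptions for illustration. The repository's own function may use a different metric, signature, or stopping rule.

```python
from collections import Counter


def unigram_f1(candidate_tokens, reference_tokens):
    """Unigram-overlap F1, used here as a simple stand-in for a ROUGE-style score."""
    cand, ref = Counter(candidate_tokens), Counter(reference_tokens)
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def greedy_select(sentences, summary, max_sentences):
    """Greedily pick sentence indices whose combined tokens best match the summary.

    sentences: list of token lists; summary: token list.
    Returns the selected indices in ascending order.
    """
    selected = []
    best_score = 0.0
    for _ in range(max_sentences):
        best_idx = None
        for i in range(len(sentences)):
            if i in selected:
                continue
            # Score the current selection plus candidate sentence i.
            tokens = [tok for j in selected + [i] for tok in sentences[j]]
            score = unigram_f1(tokens, summary)
            if score > best_score:
                best_score, best_idx = score, i
        if best_idx is None:
            break  # no remaining sentence improves the score
        selected.append(best_idx)
    return sorted(selected)


# Toy data (made up): sentences 0 and 2 together cover the summary exactly.
toy_sentences = [["a", "b"], ["x"], ["c", "d"], ["y", "z"]]
toy_summary = ["a", "b", "c", "d"]
print(greedy_select(toy_sentences, toy_summary, max_sentences=3))  # [0, 2]
```

In this sketch the selected indices play the role of EXT_GOLD: they are labels derived from the reference summary during preprocessing, not predictions made by the model.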