output issue - Githubissues

wac81 commented 2 years ago

I have an issue with this output:

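ID : 6
ORIGIN :
(0) 【客服】一楼楼层较低，东户的采光被旁边的巨鼎峰集团挡了一些。 ([Agent] The first floor is fairly low, and the east-side unit's daylight is partly blocked by the neighboring 巨鼎峰集团.)
(1) 【客服】嗯。 ([Agent] Mm.)
(2) 【客服】呃那个房子是八十七。 ([Agent] Uh, that unit is eighty-seven.)
(3) 【客户】八十七户型是啥样的？ ([Customer] What is the eighty-seven layout like?)
GOLD : 客户询问一楼的面积，销售回答八十七平。 (The customer asks about the area of the first-floor unit; the salesperson answers 87 square meters.)
DOC_GEN : 客户询问八十七平户型的价格，销售回复八十九万 (The customer asks about the price of the 87-square-meter layout; the salesperson replies 890,000.)
DOC_GEN bleu & rouge-f 1/2/l: 0.3309 & 0.6842/0.4737/0.5789
EXT_GOLD: [0,2]
EXT_PRED: [0,3]
EXT_SCORE P/R/F1: 0.5000/0.5000/0.5000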

Two questions and a remark:

1. Did DOC_GEN only "see" the 4 sentences 0, 1, 2, 3?
2. Does EXT_PRED extract only from these 4 sentences?
3. EXT_PRED seems useless.

RowitZou commented 2 years ago

It is a two-stage process. The first stage extracts important sentences from the original dialogue. The second stage generates a summary from the sentences extracted in the first stage.

1. DOC_GEN can only see sentences 0 and 3, which were extracted by the first stage.
2. Yes.
3. EXT_PRED lists the sentences that are fed into the second stage.
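
To make the relationship between the two stages concrete, here is a minimal Python sketch. It is not the repository's code: the helper names ext_score and select_sentences are hypothetical, and it assumes EXT_SCORE is a plain set overlap between EXT_GOLD and EXT_PRED, which is consistent with the 0.5000/0.5000/0.5000 reported in the output above.

```python
def ext_score(gold, pred):
    """Set-overlap precision, recall and F1 between two lists of sentence indices.

    Assumption: this mirrors how EXT_SCORE P/R/F1 could be computed; the
    repository's own scoring code may differ.
    """
    gold_set, pred_set = set(gold), set(pred)
    hits = len(gold_set & pred_set)
    precision = hits / len(pred_set) if pred_set else 0.0
    recall = hits / len(gold_set) if gold_set else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def select_sentences(dialogue, indices):
    """Return only the utterances that the second (generation) stage receives."""
    return [dialogue[i] for i in sorted(indices)]


# Placeholder utterances standing in for sentences (0)-(3) of the example.
dialogue = ["utterance 0", "utterance 1", "utterance 2", "utterance 3"]
ext_gold = [0, 2]
ext_pred = [0, 3]

# Stage 2 input: only the sentences picked by stage 1.
stage_two_input = select_sentences(dialogue, ext_pred)
print(stage_two_input)               # ['utterance 0', 'utterance 3']
print(ext_score(ext_gold, ext_pred))  # (0.5, 0.5, 0.5)
```

Only sentence 0 is shared between EXT_GOLD and EXT_PRED, so one of two predicted sentences is correct and one of two gold sentences is recovered, which gives 0.5 for all three scores.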

wac81 commented 2 years ago

EXT_GOLD: [0,2]. How are the EXT_GOLD sentences extracted? I know preprocessing can extract them, but which method does it use?

RowitZou commented 2 years ago

See def greedy_selection(...) in src/prepro/data_builder.py.
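
The function itself is not reproduced in this thread. As a rough illustration of what a greedy oracle selection generally looks like, the sketch below repeatedly adds the sentence that most improves the overlap with the gold summary and stops when no sentence helps. Everything here is an assumption for illustration: the names greedy_oracle and unigram_f1 are hypothetical, unigram-overlap F1 stands in for whatever metric the repository actually uses, and the real signature and stopping rule may differ.

```python
from collections import Counter


def unigram_f1(candidate_tokens, reference_tokens):
    """Unigram-overlap F1 between two token lists (stand-in scoring function)."""
    cand, ref = Counter(candidate_tokens), Counter(reference_tokens)
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def greedy_oracle(sentences, summary, max_sentences=3):
    """Greedily pick sentence indices whose union best matches the summary.

    sentences: list of token lists, one per dialogue utterance.
    summary:   token list of the gold summary.
    """
    selected = []
    best_score = 0.0
    for _ in range(max_sentences):
        best_idx = None
        for i in range(len(sentences)):
            if i in selected:
                continue
            # Score the already selected sentences plus candidate i.
            candidate = [tok for j in selected + [i] for tok in sentences[j]]
            score = unigram_f1(candidate, summary)
            if score > best_score:
                best_score, best_idx = score, i
        if best_idx is None:
            break  # no remaining sentence improves the score
        selected.append(best_idx)
    return sorted(selected)


# Toy English tokens loosely modeled on the example dialogue.
sentences = [
    ["floor", "one", "is", "low"],
    ["mm"],
    ["the", "unit", "is", "eighty", "seven"],
    ["what", "layout"],
]
summary = ["customer", "asks", "floor", "one", "area", "eighty", "seven"]
print(greedy_oracle(sentences, summary))  # [0, 2]
```

On this toy input the selection picks sentence 0 first, then sentence 2, and stops because adding either remaining sentence lowers the score.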

RowitZou / topic-dialog-summ

output issue #29