SALT-NLP / Structure-Aware-BART

Source code for "Structure-Aware Abstractive Conversation Summarization via Discourse and Action Graphs"
MIT License
How to extract the discourse graph? #5

Closed chijames closed 3 years ago

chijames commented 3 years ago

Hi,

Can you elaborate a bit more on the process/code for constructing the discourse graph files?

Thanks.

jiaaoc commented 3 years ago

I used the code in https://github.com/shizhouxing/DialogueDiscourseParsing to train the parser, and then used the trained model directly to predict the discourse relations.

chijames commented 3 years ago

Yes, I noticed that in the README. My question is: how do I use that trained parser to generate the discourse graphs?

chijames commented 3 years ago

I am asking because I have another parser, and I want to test its downstream performance on your model. Thanks.

jiaaoc commented 3 years ago

You need to preprocess the files into the format the parser expects (https://github.com/shizhouxing/DialogueDiscourseParsing) and modify several lines in their main.py (https://github.com/shizhouxing/DialogueDiscourseParsing/blob/master/main.py) to save the predictions.

jiaaoc commented 3 years ago

It should be pretty easy and straightforward.

chijames commented 3 years ago

So which file in the data/ in this repo should I refer to? (To make sure the format is correct)

jiaaoc commented 3 years ago

In that case, once you have predicted the relations between utterances, you can just construct the adjacency matrix from your predictions.

jiaaoc commented 3 years ago

It should be *_relation_adj.pkl.

jiaaoc commented 3 years ago

Please refer to the code here (https://github.com/GT-SALT/Structure-Aware-BART/blob/main/src/utils.py).

chijames commented 3 years ago

Sorry for the late reply. So as far as I understand, we are assuming the max length of a dialogue is 30. For the adjacency matrix, I guess 0 means no link/relation, but how about numbers from 1 to 17? I suppose there are only 16 types of relations? Please correct me if I misunderstand something. Thanks!

jiaaoc commented 3 years ago

You could use edges with different types, or just treat them all as plain, untyped edges.

jiaaoc commented 3 years ago

please refer to this: https://github.com/GT-SALT/Structure-Aware-BART/blob/a227de631c7217f0351adac667142fea3c9d5800/transformers/src/transformers/modeling_bart.py#L1168

jiaaoc commented 3 years ago

So you could have 0 mean no link/relation, and the numbers 1 to 17 refer to different types of relations.

jiaaoc commented 3 years ago

Or you could just use 0/1 without considering the edge types.
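
As a rough sketch of the untyped option (an illustration only, not code from this repo), a typed matrix can be collapsed so that every nonzero entry becomes 1. The helper name `to_binary_adj` is hypothetical, and plain nested lists stand in for the NumPy arrays stored in the pickle files.

```python
def to_binary_adj(adj):
    """Collapse a typed adjacency matrix to 0/1 (hypothetical helper)."""
    # Any nonzero value counts as "there is an edge"; 0 stays "no edge".
    return [[1 if value != 0 else 0 for value in row] for row in adj]


# Toy 3-utterance dialogue with two typed, symmetric edges.
typed = [
    [0, 3, 0],
    [3, 0, 6],
    [0, 6, 0],
]
binary = to_binary_adj(typed)
```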

chijames commented 3 years ago

Yes, but why are there 17 types? In the paper, there are 16 relation types: "where each Elementary Discourse Unit (EDU) is one single utterance and they are linked through 16 different types of relations (Asher et al., 2016)." Thanks.

jiaaoc commented 3 years ago

There should be an "other" category, if I remember correctly.

jiaaoc commented 3 years ago

Oh, I just checked: 0~15 are edge types, 17 means no edge. There should be no 16 in the matrix.

Here is the mapping: {u'Comment': 5, u'Clarification_question': 4, u'Contrast': 10, u'Elaboration': 1, u'Acknowledgement': 9, u'Continuation': 7, u'Alternation': 0, u'Explanation': 11, u'Q-Elab': 3, u'Conditional': 12, u'Result': 6, u'Background': 15, u'Narration': 14, u'Correction': 13, u'Parallel': 8, u'Question-answer_pair': 2}
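
For reading stored ids back as relation names, it can help to invert this mapping. The sketch below copies the dict above; the names `RELATION_TO_ID` and `ID_TO_RELATION` are made up for illustration and do not come from the repo.

```python
# Mapping copied from the comment above (relation name -> parser id).
RELATION_TO_ID = {
    'Comment': 5, 'Clarification_question': 4, 'Contrast': 10,
    'Elaboration': 1, 'Acknowledgement': 9, 'Continuation': 7,
    'Alternation': 0, 'Explanation': 11, 'Q-Elab': 3, 'Conditional': 12,
    'Result': 6, 'Background': 15, 'Narration': 14, 'Correction': 13,
    'Parallel': 8, 'Question-answer_pair': 2,
}

# Inverse lookup (parser id -> relation name).
ID_TO_RELATION = {rel_id: name for name, rel_id in RELATION_TO_ID.items()}

# Sanity check: the 16 relation types use the ids 0..15 exactly once each.
assert sorted(ID_TO_RELATION) == list(range(16))
```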

chijames commented 3 years ago

Thank you so much!

jiaaoc commented 3 years ago

Oh, 16 and 17 might mean self-linked or no edge. I forgot the details, but there are three types of edges: discourse relations, no edge, and self-linked edges.

jiaaoc commented 3 years ago

The specific ids do not matter; you could define your own adjacency matrix.

jiaaoc commented 3 years ago

Here is the code for that part. The pickle files consist of edges (start, end, type).

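import pickle

import numpy as np
import scipy.sparse as sp


def get_adj_list(name):
    # Load the predicted edges; edges[i][0] holds the (start, end, type) rows of dialogue i.
    with open(name + '.pickle', 'rb') as f:
        edges = pickle.load(f)
    adj_list = []
    # The matrix size (max_id) differs between the train split and the others.
    if name == 'train':
        max_id = 46
    else:
        max_id = 30
    for i in range(0, len(edges)):
        temp = np.array(edges[i][0])
        # Store type + 1 at (end, start) so that 0 stays reserved for "no edge".
        adj = sp.coo_matrix((temp[:,2]+1, (temp[:, 1], temp[:, 0])), shape=(max_id, max_id), dtype=np.float32)
        # Symmetrize: each cell keeps the larger of adj and adj.T.
        adj = adj + adj.T.multiply(adj.T > adj) - adj.multiply(adj.T > adj)
        # Add self-linked edges on the diagonal with the value 17.
        adj = adj + sp.eye(adj.shape[0])*17
        adj_list.append(np.array(adj.todense()))
    with open(name + '_relation_adj.pkl', 'wb') as f:
        pickle.dump(adj_list, f)
    return adj_list

jiaaoc commented 3 years ago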

So 0 means no edge, 1~16 refer to the mapping dict (index + 1), and 17 means a self-linked edge.
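
For anyone plugging in a different parser, the same encoding can be sketched without SciPy. This is a simplified illustration, not the repo's code: `build_relation_adj` is a hypothetical helper, it returns nested lists rather than NumPy arrays, and it assumes each dialogue has no duplicate edges and no predicted self-edges.

```python
def build_relation_adj(edges, max_len=30, self_loop_id=17):
    """Build one dialogue's adjacency matrix from (start, end, type) triples.

    Hypothetical helper: 0 = no edge, type + 1 = discourse relation,
    self_loop_id on the diagonal.
    """
    adj = [[0] * max_len for _ in range(max_len)]
    for start, end, rel_type in edges:
        value = rel_type + 1  # shift by one so 0 can mean "no edge"
        # Write both directions so the matrix is symmetric.
        adj[end][start] = max(adj[end][start], value)
        adj[start][end] = max(adj[start][end], value)
    for i in range(max_len):
        adj[i][i] = self_loop_id  # self-linked edge
    return adj


# Toy dialogue: utterance 1 answers utterance 0 (Question-answer_pair, id 2),
# and utterance 2 comments on utterance 1 (Comment, id 5).
adj = build_relation_adj([(0, 1, 2), (1, 2, 5)], max_len=4)
```

To match the *_relation_adj.pkl files, each matrix would still need to be converted to a NumPy array and the per-dialogue list pickled, as in the function above.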

chijames commented 3 years ago

Thanks for sharing the details! I can imagine that the absolute values do not matter as long as there exists a 1-1 mapping. However, I was checking the discourse parser repo today, and it seems to me that it will always predict one of the preceding utterances (?) as the parent of the current utterance. Therefore, I really cannot make sense of why there is a self-linked edge... I might have misunderstood something here; please correct me if I am wrong. Thanks!

jiaaoc commented 3 years ago

The self-linked edge is just the diagonal of the adjacency matrix (this could be omitted). The edges are used to calculate attention weights in the graph attention network.

jiaaoc commented 3 years ago

It is kind of like a residual connection: when a node aggregates information from its neighbours, it also adds its own representation.
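
A toy illustration of that effect, assuming a plain masked-softmax aggregation rather than the repo's actual graph attention layer. The function `aggregate` and the attention logits in `scores` are hypothetical; the point is only that a nonzero diagonal lets a node attend to itself.

```python
import math


def aggregate(features, adj, scores):
    """Average neighbour features with softmax weights, masked by adj (toy sketch)."""
    out = []
    for i, row in enumerate(adj):
        # Only cells with a nonzero entry take part in the attention.
        neighbours = [j for j, value in enumerate(row) if value != 0]
        dim = len(features[i])
        if not neighbours:
            out.append([0.0] * dim)
            continue
        top = max(scores[i][j] for j in neighbours)
        exps = {j: math.exp(scores[i][j] - top) for j in neighbours}
        total = sum(exps.values())
        out.append([
            sum(exps[j] / total * features[j][d] for j in neighbours)
            for d in range(dim)
        ])
    return out


features = [[1.0, 0.0], [0.0, 1.0]]
scores = [[0.0, 0.0], [0.0, 0.0]]  # uniform, made-up attention logits

no_self = [[0, 3], [3, 0]]       # diagonal left at 0
with_self = [[17, 3], [3, 17]]   # diagonal set to 17

print(aggregate(features, no_self, scores))    # [[0.0, 1.0], [1.0, 0.0]]
print(aggregate(features, with_self, scores))  # [[0.5, 0.5], [0.5, 0.5]]
```

Without the diagonal, each node's output contains only its neighbour's features; with it, the node's own features are mixed in.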

chijames commented 3 years ago

Ok, I got you. Basically, I just need to set the diagonal elements to 17.