XuhuiZhou / CDA

Code for our EMNLP 2020 paper "Multilevel Text Alignment with Cross-Document Attention" by Xuhui Zhou, Nikolaos Pappas, and Noah A. Smith

How to obtain document vectors? #4

Open adityavadalkar opened 3 years ago

adityavadalkar commented 3 years ago

Hey, I have two documents and need to obtain their document vectors with cross-document attention, as well as the attention scores over the sentences, as described in your paper. Could you tell me how to do that and which scripts to run? Thanks in advance!

XuhuiZhou commented 3 years ago

Hi! Obtaining the document vectors and the attention scores between sentences should not be hard. I will take the bert_han_sg_g.py file as an example; you can modify any of the other test files we provide in the same way.

In that file, see lines 77 and 78:

output_1 = output_1.permute(1,0,2)
output_2 = output_2[:,-1,:].squeeze()

You can obtain the document vector for the first document with output_1[-1,:,:]. For the second document, simply use output_2.

For the cross-document attention, use output_1[:-1,:,:].
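The slicing in the reply above can be tried on dummy data. This is only a rough sketch: random tensors stand in for the model outputs, and the (batch, positions, hidden) layout, the sizes, and the variable names doc_vector_1, doc_vector_2 and cross_doc_part are assumptions for illustration, not taken from the repository.

```python
import torch

# Hypothetical sizes, chosen only for illustration.
batch_size, num_positions, hidden_size = 2, 5, 8

# Random stand-ins for the model outputs. The (batch, positions, hidden)
# layout is an assumption, not something confirmed by the repository.
torch.manual_seed(0)
output_1 = torch.randn(batch_size, num_positions, hidden_size)
output_2 = torch.randn(batch_size, num_positions, hidden_size)

# The two lines quoted from bert_han_sg_g.py.
output_1 = output_1.permute(1, 0, 2)     # now (positions, batch, hidden)
output_2 = output_2[:, -1, :].squeeze()  # now (batch, hidden)

# Slices described in the reply.
doc_vector_1 = output_1[-1, :, :]     # last position: vector for the first document
doc_vector_2 = output_2               # already the vector for the second document
cross_doc_part = output_1[:-1, :, :]  # all remaining positions

print(doc_vector_1.shape, doc_vector_2.shape, cross_doc_part.shape)
```

Note that squeeze() removes every dimension of size 1, so with a batch size of 1 output_2 would lose its batch dimension and become a one-dimensional tensor.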