dmmiller612 / bert-extractive-summarizer

Easy to use extractive text summarization with BERT
MIT License

what's the content of 'pooled' #55

Closed Gliterament closed 2 years ago

Gliterament commented 4 years ago

The docstring for the function 'extract_embeddings' says that it returns a numpy array. What are the elements of that array? My guess was that 'pooled' refers to a single float: the average over a torch tensor of shape (batch_size, sequence_length, hidden_size). Please help me figure this out; it's very important to me.

dmmiller612 commented 4 years ago

The contents of the array depend on the parameters passed to the function. By default, it retrieves sentence embeddings from the second-to-last hidden layer, which is generally a better representation than the last layer. The average is taken along whichever axis is passed to it. Generally the dimension is set to 1.
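To make the shapes concrete, here is a minimal sketch of averaging along dimension 1. It is not the library's code: it uses plain NumPy, random data in place of real BERT hidden states, and a hypothetical helper named `mean_pool`. It assumes the layer output has the shape given in the question, (batch_size, sequence_length, hidden_size), so axis 1 is the token axis.

```python
import numpy as np

# Stand-in for one hidden layer's output (random data, NOT real BERT output).
# Assumed shape: (batch_size, sequence_length, hidden_size).
rng = np.random.default_rng(0)
batch_size, sequence_length, hidden_size = 3, 5, 8
hidden_states = rng.normal(size=(batch_size, sequence_length, hidden_size))


def mean_pool(hidden, axis=1):
    """Hypothetical helper: average the array along the given axis."""
    return hidden.mean(axis=axis)


# Averaging over axis 1 collapses the token axis and keeps one vector per input.
pooled = mean_pool(hidden_states, axis=1)
print(pooled.shape)  # (3, 8)

# Averaging over every axis is what would produce a single float instead.
scalar = hidden_states.mean()
print(np.ndim(scalar))  # 0
```

Under these assumptions 'pooled' is not a single float. It is an array with one row of hidden_size values per input sentence, and each value is the average of that hidden unit across the tokens of the sentence.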