Want To Know The Data Format

Thelordofdream / Summarization-Systems

An end-to-end summarization system using unsupervised learning based on an attentive network


Want To Know The Data Format #1

Open Scagin opened 6 years ago

Scagin commented 6 years ago

I'm happy to see the code you wrote. It's really a magical attention model. Could you please provide a dataset? I would like to know the data format so that I can work on improving the project. My contact is 406493851@qq.com. By the way, is this an implementation of a particular paper? Thank you!!

Thelordofdream commented 6 years ago

The dataset is the original aclImdb, a published movie review dataset that can be downloaded from the Internet. It's worth mentioning that you should limit the text length because of the length limitation of the LSTM, so you should construct a subset of aclImdb.
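For anyone wondering what "construct a subset" could look like in practice, here is a minimal sketch. It assumes the standard aclImdb layout (`train/pos`, `train/neg`, one review per `.txt` file). The function name `load_short_reviews`, the cutoff `max_tokens=200`, whitespace tokenization, and the 0 = negative / 1 = positive mapping are all assumptions for illustration, not taken from the repository.

```python
import os

def load_short_reviews(root, split="train", max_tokens=200):
    """Return (text, label) pairs with at most max_tokens whitespace tokens.

    Assumes root/split/{neg,pos}/*.txt, the usual aclImdb directory layout.
    Labels are 0 for negative and 1 for positive (assumed convention).
    """
    samples = []
    for label_name, label in (("neg", 0), ("pos", 1)):
        folder = os.path.join(root, split, label_name)
        for name in sorted(os.listdir(folder)):
            if not name.endswith(".txt"):
                continue
            with open(os.path.join(folder, name), encoding="utf-8") as f:
                text = f.read().strip()
            # Keep only reviews short enough for the LSTM.
            if len(text.split()) <= max_tokens:
                samples.append((text, label))
    return samples
```

Calling `load_short_reviews("aclImdb", "train", max_tokens=200)` would then give a length-limited training subset; the right cutoff depends on the sequence length the model is configured for.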

Thelordofdream commented 6 years ago

For the input layer, the input is the embedded text sequence, and the target output is the class (label) of the input text. You then train the model and use it to extract a one-sentence summary from the input text.
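As a rough illustration of that input/target format, the sketch below turns (text, label) pairs into fixed-length token-id sequences plus a label list. The ids would then go through an embedding layer inside the model. The helper names `build_vocab` and `encode`, the reserved ids for padding and unknown tokens, and the lowercase whitespace tokenization are assumptions; the repository may preprocess differently.

```python
from collections import Counter

PAD, UNK = 0, 1  # reserved ids (assumed convention, not from the repo)

def build_vocab(texts, max_size=20000):
    """Map the most frequent tokens to integer ids, after PAD and UNK."""
    counts = Counter(tok for t in texts for tok in t.lower().split())
    vocab = {"<pad>": PAD, "<unk>": UNK}
    for tok, _ in counts.most_common(max_size - len(vocab)):
        vocab[tok] = len(vocab)
    return vocab

def encode(text, vocab, max_len):
    """Convert text to exactly max_len ids: truncate, then right-pad with PAD."""
    ids = [vocab.get(tok, UNK) for tok in text.lower().split()][:max_len]
    return ids + [PAD] * (max_len - len(ids))

# Toy data standing in for aclImdb reviews (1 = positive, 0 = negative).
samples = [("a great movie", 1), ("a dull movie", 0)]
vocab = build_vocab([t for t, _ in samples])
inputs = [encode(t, vocab, max_len=5) for t, _ in samples]  # model input
labels = [y for _, y in samples]                            # training target
```

Each row of `inputs` has the same length, which is what a batched LSTM typically expects, and `labels` holds one class per review.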

Thelordofdream commented 6 years ago

The attention model implemented here is actually the Attentive Reader proposed by DeepMind in 2015. The query here is the movie review text itself, which is also the document, so the model is a modified self-attention version.
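To make the self-attention idea concrete, here is a simplified sketch in plain Python. It is not the Attentive Reader's exact scoring function (which uses a learned tanh layer); it swaps in plain dot-product scoring, and it assumes the query is the mean of the token vectors because the document serves as its own query. The thread does not say how the summary sentence is chosen, so picking the sentence with the highest mean attention weight is also an assumption. All function names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention_weights(hidden):
    """hidden: one vector per token (e.g. LSTM outputs).

    The query is the mean token vector (assumption), scored by dot product.
    """
    dim = len(hidden[0])
    query = [sum(h[i] for h in hidden) / len(hidden) for i in range(dim)]
    scores = [sum(hi * qi for hi, qi in zip(h, query)) for h in hidden]
    return softmax(scores)

def best_sentence(sentences, weights):
    """sentences: non-empty token lists; weights: one weight per token, in order.

    Returns the sentence with the highest mean attention weight.
    """
    best, best_score, pos = None, float("-inf"), 0
    for sent in sentences:
        score = sum(weights[pos:pos + len(sent)]) / len(sent)
        pos += len(sent)
        if score > best_score:
            best, best_score = sent, score
    return " ".join(best)

# Toy hidden states for the four tokens of "it was" + "truly great".
hidden = [[0.1, 0.0], [0.0, 0.1], [2.0, 2.0], [1.5, 2.5]]
sentences = [["it", "was"], ["truly", "great"]]
weights = self_attention_weights(hidden)
summary = best_sentence(sentences, weights)
```

In a trained model the hidden states would come from the LSTM and the weights from the learned attention layer; only the final sentence-selection step would look roughly like `best_sentence`.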