KienPM opened 4 years ago
Of course, you can use other architectures such as HAN to process the news body, and the performance is usually slightly better. But it usually requires more GPU memory or a smaller batch size.
Yeah, I've tried encoding each sentence and then encoding the article body on a 2080 Ti GPU. I could only train with a batch size of 1, and it took 25 s/step, so maybe something went wrong. What does HAN stand for? Could you please share a reference? Thank you very much!
HAN stands for Hierarchical Attention Network, right?
Yeah, HAN means Hierarchical Attention Network (Yang et al., 2016). You can replace the LSTM with a CNN to speed up training.
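For anyone following along, here is a rough sketch of the hierarchical idea: pool word vectors into one vector per sentence, then pool the sentence vectors into one vector for the article body. It is a deliberate simplification and not the code from this repository or from the HAN paper. It skips the LSTM/CNN contextual encoder and the learned projection, and the names `attention_pool`, `encode_document`, `word_query` and `sent_query` are made up for illustration.

```python
import math

def attention_pool(vectors, query):
    """Pool a list of equal-length vectors into one vector.

    Simplified attention: score_i = tanh(v_i) . query, followed by a
    softmax over the scores and a weighted sum of the input vectors.
    """
    scores = [sum(math.tanh(x) * q for x, q in zip(v, query)) for v in vectors]
    top = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights

def encode_document(doc, word_query, sent_query):
    """doc is a list of sentences; each sentence is a list of word vectors."""
    # Word level: one vector per sentence.
    sent_vecs = [attention_pool(sentence, word_query)[0] for sentence in doc]
    # Sentence level: one vector for the whole article body.
    return attention_pool(sent_vecs, sent_query)

# Toy input: 2 sentences with 2-dimensional word vectors (made-up numbers).
doc = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]],
]
doc_vec, sent_weights = encode_document(doc, [0.1, 0.2], [0.3, -0.1])
```

In a real model the query vectors would be learned parameters and each level would run a sequence encoder before pooling. That per-sentence encoder is where most of the extra memory and time goes.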
Yep, thank you! I've read some of your papers, and they are awesome.
In addition, I highly recommend using a shorter sentence length or fewer sentences. Although I believe that using the full news body is beneficial, it takes a large amount of GPU memory and the improvement is usually marginal.
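One possible way to cap the body size before feeding it to a hierarchical encoder is sketched below. This is only an illustration under my own assumptions: the function name `truncate_body`, its parameters, and the `<pad>` token are hypothetical and do not come from this repository.

```python
def truncate_body(sentences, max_sentences, max_words, pad="<pad>"):
    """Clip a tokenized body to a fixed (max_sentences, max_words) shape.

    sentences: list of sentences, each a list of word tokens.
    Extra sentences and extra words are dropped; short ones are padded.
    """
    # Keep only the first max_sentences sentences and max_words words each.
    clipped = [s[:max_words] for s in sentences[:max_sentences]]
    # Pad every kept sentence up to max_words tokens.
    padded = [s + [pad] * (max_words - len(s)) for s in clipped]
    # Pad with all-pad sentences if the body has too few sentences.
    while len(padded) < max_sentences:
        padded.append([pad] * max_words)
    return padded

body = [["a", "b", "c"], ["d"], ["e", "f"]]
small = truncate_body(body, max_sentences=2, max_words=2)
# small == [["a", "b"], ["d", "<pad>"]]
```

Activation memory grows roughly with batch size × sentences × words per sentence, so lowering either limit leaves room for a larger batch.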
Yeah, thanks for your recommendation.
Thank you for sharing this great repository! Could you share why you treat the article body as one long sequence instead of a list of sentences? If I want to encode each sentence and then use the sentence representation vectors to encode the article body, is that possible?