howardyclo / Kaggle-Quora-Question-Pairs

This is our team's solution report, which achieves top 10% (305/3307) in this competition.

Can you share some tips about how to choose neural network structures? #3

Open dlutleixin opened 6 years ago

dlutleixin commented 6 years ago

In the Model section, you show two neural network architectures, e.g., GloVe embeddings with a 1D-CNN and FastText embeddings with a single-layer LSTM. It's hard for me to see how you arrived at these networks. Can you share some tips on how to choose a neural network structure? Were there any particular ideas that led you and your partner to these designs? Thank you very much!

howardyclo commented 6 years ago

Hi @dlutleixin

Finding the most appropriate network structure is still a challenging task. As a rule of thumb, for language understanding tasks we use an LSTM, which is good at capturing global, long-term dependencies; for text classification we use a CNN, which is good at capturing local n-gram information. As for word embeddings such as word2vec, GloVe, or fastText, it is also hard to say which one is best. Among them, I think fastText is the better choice, since it uses character-level (subword) information, which helps it handle out-of-vocabulary and rare words.
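
As a rough illustration of the character-information point, here is a minimal sketch of fastText-style character n-gram extraction. The function names (`char_ngrams`, `shared_ngrams`) are made up for this example and are not part of the fastText library; the real implementation additionally keeps the whole word as its own token and hashes n-grams into a fixed number of buckets, both of which are omitted here.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of `word`, fastText-style.

    Hypothetical helper for illustration only, not the fastText API.
    """
    # The word is wrapped in boundary markers so that prefixes and
    # suffixes are distinguishable from word-internal n-grams.
    token = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        for i in range(len(token) - n + 1):
            grams.append(token[i:i + n])
    return grams


def shared_ngrams(a, b, **kwargs):
    """N-grams that two words have in common (hypothetical helper)."""
    return set(char_ngrams(a, **kwargs)) & set(char_ngrams(b, **kwargs))


# Trigrams of "where", including the boundary markers.
print(char_ngrams("where", min_n=3, max_n=3))
# → ['<wh', 'whe', 'her', 'ere', 're>']

# A word missing from the vocabulary still overlaps with a known word,
# so its vector can be composed from n-gram vectors seen in training.
print(sorted(shared_ngrams("question", "questions", min_n=3, max_n=3)))
```

Because an unseen word such as "questions" shares most of its n-grams with "question", a subword model can still produce a reasonable vector for it, whereas a pure word-level lookup table has nothing to return.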

It is worth noting that novel architectures have recently been proposed, such as QANet for question answering, which combines self-attention with convolution to get the advantages of both CNNs and LSTMs. For word embeddings, there is also a state-of-the-art approach called "ELMo" that captures both context and character information.

In summary, if you don't know which kind of network or embeddings to use, I recommend reading papers to learn more. In a competition like Kaggle, though, many people simply ensemble them all to boost performance.
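
The simplest form of ensembling is averaging the predicted probabilities of several models. The sketch below is only an illustration of that idea, not this repository's actual ensembling code: the helper names, the labels, and the two probability lists (standing in for a CNN and an LSTM) are all made up.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def average_ensemble(predictions, weights=None):
    """Weighted average of per-model probability lists (equal weights by default)."""
    if weights is None:
        weights = [1.0] * len(predictions)
    total_weight = sum(weights)
    n_samples = len(predictions[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, predictions)) / total_weight
        for i in range(n_samples)
    ]


# Made-up labels and model outputs, purely for illustration.
labels = [1, 0, 1, 0]
cnn_probs = [0.9, 0.4, 0.6, 0.1]
lstm_probs = [0.7, 0.2, 0.8, 0.3]

blended = average_ensemble([cnn_probs, lstm_probs])
print("CNN  log loss:", log_loss(labels, cnn_probs))
print("LSTM log loss:", log_loss(labels, lstm_probs))
print("Blend log loss:", log_loss(labels, blended))
```

Since log loss is convex in the predicted probability, the averaged prediction is never worse than the average of the individual models' losses, although it is not guaranteed to beat the single best model. In practice the weights are usually tuned on a held-out validation set.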

Cheers.