geyingli / unif

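A deep learning natural language processing framework based on TensorFlow, designed in the style of Scikit-Learn. Supports more than 40 model classes, covering language modeling, text classification, NER, MRC, knowledge distillation, and other areas.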
Apache License 2.0
114 stars 27 forks

question on wide and deep #12

Open kiminh opened 2 years ago

kiminh commented 2 years ago

Hi, I've noticed that your wide and deep implementation differs from the classical "youtube wide and deep" structure. I have two questions: 1) What is the input to the wide side? 2) What is the purpose of the attention mechanism between the wide side and the deep features?

Thanks a lot.

geyingli commented 2 years ago

I think the core value of the wide and deep structure is the idea of structurally unifying discrete and continuous features, so I didn't intend to follow every detail of the original work. Another reason is that outstanding ideas proposed after the wide and deep model came out, such as the attention mechanism and BERT, could further improve the model's performance.

The answers are:

  1. The input of the wide side can be any discrete feature: a text string, an integer, or even a float if you want.
  2. The attention mechanism has proved to be a successful design. Used properly, it improves performance on NLP tasks (most of the time).
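To make the two answers concrete, here is a minimal NumPy sketch of the general idea. It is not unif's actual code: the names `hash_bucket`, `wide_deep_attention`, and `embedding_table` are hypothetical, and the CRC32 hashing, the dot-product scoring with the deep vector as the query, and the final concatenation are all assumptions made for illustration.

```python
import zlib

import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / np.sum(e)


def hash_bucket(value, num_buckets):
    """Map any discrete value (str, int, float) to a bucket id.

    Assumption: the value is hashed through its string form, so mixed
    types can share a single embedding table.
    """
    return zlib.crc32(str(value).encode("utf-8")) % num_buckets


def wide_deep_attention(wide_values, deep_vector, embedding_table):
    """Attend from the deep vector (query) over embedded wide features."""
    num_buckets, hidden = embedding_table.shape
    ids = [hash_bucket(v, num_buckets) for v in wide_values]
    wide_emb = embedding_table[ids]                    # (n_wide, hidden)
    scores = wide_emb @ deep_vector / np.sqrt(hidden)  # (n_wide,)
    weights = softmax(scores)                          # attention weights, sum to 1
    wide_summary = weights @ wide_emb                  # (hidden,)
    # Assumption: fuse both sides by simple concatenation.
    fused = np.concatenate([deep_vector, wide_summary])
    return fused, weights


rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(1000, 8))  # hypothetical wide-side embeddings
deep_vector = rng.normal(size=8)              # stand-in for e.g. a pooled BERT output

# Wide inputs of mixed types: a string, an integer, and a float.
fused, weights = wide_deep_attention(["male", 27, 3.5], deep_vector, embedding_table)
print(fused.shape, weights.shape)
```

In this sketch the attention weights let the deep representation decide how much each wide feature contributes, rather than summing all wide features with equal weight.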

Hope this reply meets your needs :)