zhaolewen / DrQA-TF

DrQA with Tensorflow

Not completed yet? #1

Open AndreCI opened 7 years ago

AndreCI commented 7 years ago

Hi,

Thanks a lot for porting this to TensorFlow. Your README says that it is not completed; could you tell me what is missing? If you are planning to complete it, could you tell me when?

Thanks again!

zhaolewen commented 7 years ago

Hi,

Thanks :) In my tests, the performance is 55% EM on the test set. What's missing is, at a minimum:

  1. Keeping most of the word embeddings fixed while fine-tuning the top 1000 words (see the sketch after this list).
  2. Attention-weighted matching between question and document (I'm getting NaN errors when I include it).
  3. Dropout on the word embeddings.
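
For point 1, a rough sketch of what I have in mind (TF 1.x style; the sizes, names, and the random stand-in for pretrained vectors are made up, this isn't the repo's code):

```python
import numpy as np
import tensorflow as tf

# Hypothetical sizes and a random stand-in for pretrained (e.g. GloVe) vectors.
vocab_size, embed_dim, num_tuned = 50000, 300, 1000
pretrained = np.random.randn(vocab_size, embed_dim).astype(np.float32)

# Rows 0..num_tuned-1 (assumed sorted by frequency) stay trainable;
# the rest of the matrix is frozen at its pretrained values.
tuned = tf.get_variable("tuned_embed", initializer=pretrained[:num_tuned])
fixed = tf.get_variable("fixed_embed", initializer=pretrained[num_tuned:],
                        trainable=False)
embedding = tf.concat([tuned, fixed], axis=0)

word_ids = tf.placeholder(tf.int32, [None, None])  # [batch, seq_len]
word_vecs = tf.nn.embedding_lookup(embedding, word_ids)
```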

Then there are things to be verified:

  1. The multilayer RNNs in TensorFlow and PyTorch seem to behave differently (see the sketch after this list).
  2. Whether I'm implementing the attention layers correctly.
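
For the first point, my guess at the mismatch (not verified): tf.nn.rnn_cell.MultiRNNCell only returns the top layer's outputs, while DrQA's PyTorch StackedBRNN can concatenate the outputs of all layers. Running each bidirectional layer separately in TF would make the two comparable; a sketch with made-up names:

```python
import tensorflow as tf

def stacked_birnn_concat(inputs, seq_len, hidden_size=128, num_layers=3):
    """Run each BiLSTM layer separately and concatenate every layer's output,
    mirroring the PyTorch StackedBRNN rather than MultiRNNCell, which only
    exposes the top layer."""
    outputs = [inputs]
    for i in range(num_layers):
        with tf.variable_scope("birnn_%d" % i):
            fw = tf.nn.rnn_cell.LSTMCell(hidden_size)
            bw = tf.nn.rnn_cell.LSTMCell(hidden_size)
            (out_fw, out_bw), _ = tf.nn.bidirectional_dynamic_rnn(
                fw, bw, outputs[-1], sequence_length=seq_len, dtype=tf.float32)
            outputs.append(tf.concat([out_fw, out_bw], axis=-1))
    # [batch, seq_len, 2 * hidden_size * num_layers]
    return tf.concat(outputs[1:], axis=-1)
```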

What I won't do:

  1. Provide as many options as the Facebook/@hitvoice version; I'll just try to implement the variant with the best performance.
  2. Summary recording with TensorBoard. I've started using Elasticsearch and Kibana instead; they turn out to be easy and quite nice.

As for performance, I plan to get it to at least 65% EM in the coming month, though it may not rival the original implementation from the paper.

Are you looking at this model as well?

AndreCI commented 7 years ago

Hey,

Thanks for your detailed answer! I'm planning to implement it in TensorFlow myself, and your version is a really good baseline.

I'll submit issues if I see them! Best,

AndreCI commented 7 years ago

Hi again,

Concerning your NaN problem, it seems to come from tf.exp(alpha_flat), which returns inf values. I fixed it by initializing the ReLU weights with stddev=0.1; the activations then stay small enough that exp doesn't overflow to inf:

```python
W = tf.Variable(tf.random_normal(shape=[input_size, input_size], mean=0.0,
                                 stddev=0.1, dtype=tf.float32),
                name='ReLU_weight')  # small stddev keeps exp from overflowing
```
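
A more robust alternative (just a suggestion, not something from this repo): subtract the row-wise max before exponentiating, the standard log-sum-exp trick, so tf.exp never overflows no matter how the weights are initialized:

```python
import tensorflow as tf

# Hypothetical unnormalized attention scores, shape [batch, seq_len].
alpha_flat = tf.placeholder(tf.float32, [None, None])

# Shift by the row-wise max before exponentiating, so tf.exp never
# sees large positive inputs and never returns inf.
shifted = alpha_flat - tf.reduce_max(alpha_flat, axis=-1, keep_dims=True)
attn = tf.exp(shifted) / tf.reduce_sum(tf.exp(shifted), axis=-1, keep_dims=True)

# tf.nn.softmax(alpha_flat) applies the same stabilization internally.
```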

AndreCI commented 7 years ago

I have some questions if you have the time to answer them.

Sorry to bother you with all of these questions! Thanks again for your code, it helped me quite a lot!

zhaolewen commented 7 years ago

Hi, great that you've looked into the NaN issue.

No problem at all! I'm glad it's helping you. I find it extremely helpful to rewrite the PyTorch version in TensorFlow. I'd been looking for an opportunity to understand how attention works, and the articles and diagrams on the internet are still a long way from actually implementing it and understanding the math behind it.

AndreCI commented 7 years ago

Hi,

Thanks for your answer!

I agree that it is really helpful to rewrite things! I started on attention mechanisms with Dynamic Memory Networks, but I still have some trouble convincing myself that the math behind them works. This paper is what I focused on: https://arxiv.org/abs/1506.07285; on p. 8 there is a figure showing the attention mechanism in action.

zhaolewen commented 7 years ago

Hi, I've finally finished some personal stuff and am free to come back to this project.

I really have a feeling that the attention part is not right; it doesn't seem to match what is described here: https://distill.pub/2016/augmented-rnns/. Yeah, I've seen your paper somewhere too. There are so many interesting articles...
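
To make the comparison concrete, here is my reading of the paper's aligned question embedding (a sketch with made-up names, not necessarily this repo's code): project both sides through a shared ReLU layer, take dot-product scores, and softmax over the question positions:

```python
import tensorflow as tf

def aligned_attention(doc_emb, q_emb, hidden_size=300):
    """doc_emb: [batch, doc_len, dim], q_emb: [batch, q_len, dim].
    Returns, for each document token, a weighted average of the
    question word embeddings."""
    proj = tf.layers.Dense(hidden_size, activation=tf.nn.relu,
                           name="align_proj")       # shared by both sides
    doc_proj = proj(doc_emb)                        # [batch, doc_len, hidden]
    q_proj = proj(q_emb)                            # [batch, q_len, hidden]
    scores = tf.matmul(doc_proj, q_proj,
                       transpose_b=True)            # [batch, doc_len, q_len]
    weights = tf.nn.softmax(scores)                 # over the question axis
    return tf.matmul(weights, q_emb)                # [batch, doc_len, dim]
```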

developeratdaguanyuan commented 7 years ago

Hi, recently I've been implementing DrQA in TensorFlow, but my test EM is 56%. Is there any trick involved? By the way, a single epoch takes me 50 minutes in TensorFlow, which is quite slow. Thanks.

zhaolewen commented 7 years ago

Hi, indeed, I can also only achieve about 56% EM. It's really strange; that's why I've marked this project as "in progress". I wanted to reach at least 60% EM.

How are you implementing your version? I am converting the PyTorch version to TensorFlow, and I think I've got most of the technical details right... I'm not clear on why it's only 56% EM...

By the way, yes, it's slow for me as well. But I've also read some comparisons between TensorFlow and PyTorch, and TF doesn't seem to be that slow, so I'm still sticking with it.

developeratdaguanyuan commented 7 years ago

Hi, my version is still vanilla, e.g. no keeping most of the embeddings fixed while fine-tuning the top 1000 words. It is a little weird.

zhaolewen commented 7 years ago

Yeah, two months ago I thought the 56% was because I didn't have the question-document aligned embedding or the fine-tuning of the top 1000 words. Then I fixed the RNN issue and added the alignment embedding, and it's still around 56%.

developeratdaguanyuan commented 6 years ago

I think it is better to use DrQA's way of cleaning the training data. In the data, some answer spans start in the middle of a word instead of at the beginning of a word.
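
For illustration, a hypothetical check (assuming a plain whitespace tokenizer, not DrQA's actual preprocessing) that flags examples whose answer span doesn't line up with token boundaries:

```python
def char_span_to_token_span(text, answer_start, answer_text):
    """Map a character-level answer span to token indices; return None
    if the span starts or ends in the middle of a token."""
    spans, pos = [], 0
    for tok in text.split():
        start = text.index(tok, pos)
        spans.append((start, start + len(tok)))
        pos = start + len(tok)
    answer_end = answer_start + len(answer_text)
    start_tok = next((i for i, (s, _) in enumerate(spans) if s == answer_start), None)
    end_tok = next((i for i, (_, e) in enumerate(spans) if e == answer_end), None)
    if start_tok is None or end_tok is None:
        return None  # mid-word span: drop or re-align this training example
    return start_tok, end_tok

print(char_span_to_token_span("the quick brown fox", 4, "quick"))  # (1, 1)
print(char_span_to_token_span("the quick brown fox", 5, "uick"))   # None
```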