abhishekkrthakur / is_that_a_duplicate_quora_question


Not the final code? #14

Open LanceNorskog opened 4 years ago

LanceNorskog commented 4 years ago

This does not look to be the final version of your code.

The features file is created, but is not used by deepnet.py, for example.

Also, I've got deepnet.py generally running, but the validation loss drifts upward even as accuracy rises. That suggests a model that is not big enough to store its information while the data has high variance, so the model overfits.
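To illustrate how validation loss and accuracy can move in the same direction, here is a minimal sketch with made-up numbers (the labels and predicted probabilities are hypothetical, not output from deepnet.py). Accuracy only counts which side of 0.5 a prediction falls on, while binary cross-entropy also penalizes confident mistakes, so one very confident wrong prediction can raise the loss even when more examples are classified correctly.

```python
import math

def accuracy(labels, probs, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    hits = sum(int(p >= threshold) == y for y, p in zip(labels, probs))
    return hits / len(labels)

def log_loss(labels, probs):
    """Mean binary cross-entropy; assumes every prob is strictly in (0, 1)."""
    total = 0.0
    for y, p in zip(labels, probs):
        total += -math.log(p) if y == 1 else -math.log(1.0 - p)
    return total / len(labels)

# Hypothetical validation labels and predictions at two epochs.
labels = [1, 1, 1, 0, 0]
early_probs = [0.60, 0.45, 0.45, 0.40, 0.40]   # cautious, two mistakes
late_probs = [0.99, 0.90, 0.90, 0.01, 0.999]   # confident, one big mistake

early_acc, early_loss = accuracy(labels, early_probs), log_loss(labels, early_probs)
late_acc, late_loss = accuracy(labels, late_probs), log_loss(labels, late_probs)

print(f"early: acc={early_acc:.2f} loss={early_loss:.3f}")
print(f"late:  acc={late_acc:.2f} loss={late_loss:.3f}")
```

In this toy case the later epoch gets more examples right, yet its mean loss is higher because of the single confident error.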

I just noticed that the data is not shuffled during prep.
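For reference, a minimal sketch of what shuffling during prep could look like, using only the standard library. The function name `shuffle_rows` and the toy rows are hypothetical and are not taken from deepnet.py; the point is only that the two question columns and the labels must be reordered with the same permutation.

```python
import random

def shuffle_rows(question1, question2, labels, seed=42):
    """Shuffle three parallel lists with one shared permutation."""
    if not (len(question1) == len(question2) == len(labels)):
        raise ValueError("all columns must have the same length")
    order = list(range(len(labels)))
    # A seeded generator keeps the shuffle reproducible between runs.
    random.Random(seed).shuffle(order)
    return (
        [question1[i] for i in order],
        [question2[i] for i in order],
        [labels[i] for i in order],
    )

# Toy rows standing in for the question pairs (hypothetical data).
q1 = ["a0", "a1", "a2", "a3"]
q2 = ["b0", "b1", "b2", "b3"]
y = [0, 1, 0, 1]

s1, s2, sy = shuffle_rows(q1, q2, y)
print(list(zip(s1, s2, sy)))
```

Shuffling once before the train/validation split matters most when the split takes the last rows of the file as validation data, since any ordering in the file then ends up in the validation set.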

abhishekkrthakur commented 4 years ago

deepnet.py is the final code that can be used to achieve the accuracy reported at the time of writing the article.

The features file is only a reference for how extra features can be created.

I have only shared the best model. Shuffling didn't bring much value when I tried it 3 years ago, but now I don't remember all the details :)