lanwuwei / SPM_toolkit

Neural network toolkit for sentence pair modeling.

about DecAtt #29

Open BruceLee66 opened 5 years ago

BruceLee66 commented 5 years ago

When I use this model for the WikiQA task, I find the batch list hard to understand. [image] [image] Why should we sort by length? Also, the interval between entries in batch_list is not 32.

lanwuwei commented 5 years ago

DecAtt is very difficult to train. I tried many ways to make it work, including gradient clipping and sorting by length. People have previously used length sorting to speed up training and convergence, since the input lengths within a batch then don't vary much.
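
As a rough illustration of how length sorting can produce batch boundaries that are not spaced 32 apart, here is a minimal sketch. It is not the toolkit's actual code: `build_batch_list` and the rule of closing a batch whenever the sentence length changes are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def build_batch_list(
    lengths: Sequence[int], batch_size: int = 32
) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: sort examples by length and cut batches.

    Returns the sorted example order and a list of boundary positions
    into that order. A batch is closed when it reaches batch_size OR
    when the next example has a different length (an assumed rule),
    so the gap between boundaries can be smaller than batch_size.
    """
    # Indices of the examples, shortest first (stable for equal lengths).
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])

    batch_list = [0]
    for pos in range(1, len(order)):
        same_length = lengths[order[pos]] == lengths[order[pos - 1]]
        batch_full = pos - batch_list[-1] >= batch_size
        if batch_full or not same_length:
            batch_list.append(pos)
    if order:
        batch_list.append(len(order))  # closing boundary of the last batch
    return order, batch_list


# Toy data: six sentences with lengths 5, 3, 5, 3, 3, 7 and batch_size=2.
order, batch_list = build_batch_list([5, 3, 5, 3, 3, 7], batch_size=2)
print(order)       # [1, 3, 4, 0, 2, 5]
print(batch_list)  # [0, 2, 3, 5, 6]
```

In this toy run the gaps between consecutive boundaries are 2, 1, 2 and 1, even though `batch_size` is 2. Under the same assumed rule with `batch_size=32`, boundaries would likewise fall wherever the length changes, not only every 32 examples.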