JianshuZhang / WAP

Watch, Attend and Parse for Handwritten Mathematical Expression Recognition
250 stars 82 forks

Questions about WAP #19

Open bilal2vec opened 5 years ago

bilal2vec commented 5 years ago

Hi,

I was reimplementing WAP and had some questions.

  1. Did you try using a ResNet-based encoder instead of a VGG or DenseNet?
  2. Did you calculate the BLEU score as an additional metric?
  3. What are the train and validation losses of the models after training?
  4. Looking at your data iterator code, it looks like you don't resize images but instead group images of the same size into a batch. Is this correct?
  5. In issue https://github.com/JianshuZhang/WAP/issues/8#issuecomment-392633715, you wrote that the 2013 ground-truth LaTeX labels aren't normalized. How much did the ExpRate decrease when training and validating on unnormalized data?
  6. When training, do you use the ground-truth label of the previous timestep as the input to the model at the current timestep (teacher forcing), or do you instead use the model's prediction at the previous timestep?

Thanks
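The size-grouping asked about in question 4 can be sketched roughly as follows; this is a minimal illustration of the idea, not WAP's actual iterator, and `bucket_by_size` and the `shapes` mapping are hypothetical names:

```python
# Sketch of size-bucketing: instead of resizing, images that share the
# same (height, width) are grouped so a batch needs no padding/rescaling.
from collections import defaultdict

def bucket_by_size(shapes, batch_size):
    """Group sample ids by identical (H, W), then chunk each group into batches.

    shapes: dict mapping sample id -> (height, width) tuple.
    Returns a list of batches, each a list of sample ids.
    """
    buckets = defaultdict(list)
    for sample_id, hw in shapes.items():
        buckets[hw].append(sample_id)
    batches = []
    for hw in sorted(buckets):          # deterministic order for clarity
        ids = sorted(buckets[hw])
        for i in range(0, len(ids), batch_size):
            batches.append(ids[i:i + batch_size])
    return batches
```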

JianshuZhang commented 5 years ago

Sorry for the late reply.

  1. DenseNet works best on the CROHME dataset; if your training set is big enough, maybe ResNet is better.
  2. I didn't try BLEU.
  3. You can see the training log.
  4. Yes, I didn't resize images.
  5. If your training LaTeX and test LaTeX are mismatched, the ExpRate will decrease a lot.
  6. We use the ground-truth label of the previous timestep as the input (teacher forcing).
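The teacher forcing confirmed in answer 6 can be illustrated with a toy decoder loop; `toy_decoder_step` below is a deterministic stand-in for one attention/GRU step, not the model's actual code:

```python
def toy_decoder_step(prev_token, state):
    """Stand-in for one decoder step: a deterministic toy transition."""
    new_state = (state + prev_token) % 7
    prediction = (new_state * 3) % 7
    return prediction, new_state

def decode(ground_truth, teacher_forcing=True):
    """Run the decoder over a sequence.

    With teacher forcing (as during WAP training), the input at step t is
    the ground-truth token from step t-1; without it, the model's own
    previous prediction is fed back, as at inference time.
    """
    state = 0
    prev = 0  # <sos> token
    outputs = []
    for gt in ground_truth:
        pred, state = toy_decoder_step(prev, state)
        outputs.append(pred)
        prev = gt if teacher_forcing else pred
    return outputs
```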
bilal2vec commented 5 years ago

Hi, thanks for the response

I tried training another WAP implementation (https://github.com/menglin0320/wap-on-tensor) on the new CROHME 2019 dataset (https://www.cs.rit.edu/~crohme2019/task.html) but couldn't get the model to generalize to the new out-of-distribution validation set. Did you also experience problems with your model generalizing to the 2013 test set? If so, how were you able to solve them?

Thanks

JianshuZhang commented 5 years ago

Recently, we also took part in the CROHME 2019 competition. Our model generalizes well to the CROHME 2013, 2014, and 2016 test sets, so I didn't encounter your problem.

Zhang-O commented 5 years ago

@JianshuZhang Regarding "DenseNet works best on the CROHME dataset; if your training set is big enough, maybe ResNet is better": how many images would count as "big enough"? Would 20k be OK?

JianshuZhang commented 5 years ago

Yes, the best encoder depends on your task, and it would be easy to swap out the encoder part. For DenseNet on the CROHME dataset, fewer than 40k images will be OK.


Zhang-O commented 5 years ago

Thank you for your reply. If I want to change the backbone to ResNet, the number of images should be more than 40k. Is my understanding right?