Open kundtx opened 1 year ago
G1 Haizhou Liu: Very solid technical innovations! Two small questions: 1) do the result statistics indicate accuracy (percent)? 2) How can the replacement of Bi-LSTMs with transformers "speed up" CRNN training?
G23 Zhang Boyang: Thanks for your insightful questions! 1) Yes, the metric is word accuracy; sorry for the unclear wording. 2) Attention improves training efficiency because all positions in a sequence can be processed in parallel. By contrast, Bi-LSTMs process a word/sequence character by character, which is very time-consuming.
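To illustrate the parallelism point above, here is a minimal NumPy sketch (not the authors' code; all names and shapes are illustrative): the recurrent update must loop over timesteps, while self-attention contextualizes every position in one batch of matrix multiplies.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 5, 4                      # sequence length, feature dimension
x = rng.standard_normal((T, d))  # one feature sequence (e.g. CNN output)

# Recurrent style: each hidden state depends on the previous one,
# so the T steps must execute one after another.
W = rng.standard_normal((d, d)) * 0.1
h = np.zeros(d)
for t in range(T):               # inherently sequential loop
    h = np.tanh(x[t] @ W + h)

# Attention style: every position attends to every other position
# in a single set of matrix operations -- no sequential dependency.
scores = x @ x.T / np.sqrt(d)                                  # (T, T)
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
out = weights @ x                # (T, d) contextualized features, all at once
```

The loop's step count grows with T, while the attention path is a fixed number of matrix products regardless of T, which is what lets GPUs parallelize it.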
G29 Yuyan Wang: Good work! I wonder whether you use DTW distance as the loss function, because it is discrete and its gradient cannot be computed.
G23 Shiji Cao: Thanks for your question; your concern is very valuable. We use $\gamma$-soft-DTW, which replaces the hard minimization with a differentiable soft minimum: $\min{}^{\gamma}\{a_1, a_2, \ldots, a_n\} = -\gamma \log \sum_{i=1}^{n} e^{-a_i/\gamma}$ when $\gamma > 0$, and $\min_{i \leq n} a_i$ when $\gamma = 0$.
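A minimal sketch of that soft minimum in Python (an illustration, not the authors' implementation); the shift by the hard minimum is the standard log-sum-exp trick to avoid overflow:

```python
import math

def soft_min(values, gamma):
    """Smoothed minimum: -gamma * log(sum_i exp(-a_i / gamma)).

    Differentiable for gamma > 0; equals the hard minimum when gamma == 0.
    """
    if gamma == 0:
        return min(values)
    # Subtract the hard minimum before exponentiating (log-sum-exp trick)
    # so the exponents are all <= 0 and cannot overflow.
    m = min(values)
    z = sum(math.exp(-(v - m) / gamma) for v in values)
    return m - gamma * math.log(z)
```

As `gamma` shrinks, `soft_min` approaches the true minimum; larger values of `gamma` give a smoother (and smaller) surrogate, which is what makes the DTW recursion differentiable end to end.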
http://8.129.175.102/lfd2022fall-poster-session/23.html