Closed lcxxxasd closed 1 year ago
Thank you for pointing out the mistake in our paper.
We experimented with both joint training and the pipeline approach. The pipeline approach turned out to be more efficient while remaining competitive, so we used it for all the experiments. The description of this part in the appendix of the paper is mistaken. We will revise it and upload a new version to arXiv.
Hello, the training procedure part of the paper says that the ssg and utt decoders are trained jointly, but according to the code they seem to be trained independently.