Open RujingD opened 1 year ago
Hi, this kind of VQ-based model is not easy to get to converge, and it is sensitive to the dataset/task/representation, the hyper-parameters, and the network architecture (number of transformer layers, heads, ...). You may need to modify these to make it work on a customized dataset.
Hello! We are running on our own data (the vertex data is not 5023*3, but 169-dimensional). The loss in the first stage decreases normally, but the second stage does not converge at all. What is the reason?
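One way to act on the suggestion to adjust the hyper-parameters and architecture is to enumerate a small grid of candidate settings and try them one by one. The sketch below only builds that grid. The key names (`num_layers`, `num_heads`, `learning_rate`, `input_dim`) and the values are assumptions for illustration, not the repository's actual config options; only the 169-dimensional input comes from the question above.

```python
from itertools import product

# Hypothetical search space: key names and values are illustrative
# assumptions, not the repository's real configuration options.
SEARCH_SPACE = {
    "num_layers": [4, 6, 8],        # number of transformer layers
    "num_heads": [4, 8],            # number of attention heads
    "learning_rate": [1e-4, 5e-5],  # optimizer step size
}

# Per-frame feature size from the question (169 instead of 5023 * 3).
INPUT_DIM = 169


def candidate_configs(space, input_dim):
    """Yield one config dict per combination of values in `space`."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        cfg = dict(zip(keys, values))
        cfg["input_dim"] = input_dim
        yield cfg


configs = list(candidate_configs(SEARCH_SPACE, INPUT_DIM))
print(len(configs))  # 3 * 2 * 2 = 12 combinations
```

Each dictionary in `configs` could then be passed to whatever training entry point is in use, which keeps the comparison between runs systematic rather than ad hoc.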