Hello:
I trained a similar diffusion transformer policy on the CALVIN dataset, but the noise-prediction loss on the actions only dropped from about 1 to about 0.2 and then stopped decreasing.
I read in the paper that the loss can be lower than 0.004.
Is the model I trained underfitting?
Have you encountered similar problems?