Closed alex4727 closed 1 year ago
I did not try bipartite inversion with SD so I can't comment on it. We did do plenty of tests with PTI, and it usually worked better than just raw inversion.
What code did you use for the bipartite inversion? I don't think we uploaded our version?
Oh I was confused :)
I didn't see the word "additional" in the appendix, so I thought the raw config meant bipartite.
Anyway, I haven't trained them for 5k iterations yet, but raw inversion seems to be converging faster for me.
Does PTI require more iterations than raw?
So you think PTI ends up with better results in the end?
I would expect PTI to take a bit longer to start converging, yes, since you're training with much lower learning rates. You can try to bump the model_lr to 1e-6 and lower the base_lr to something like 1e-5 if you want things to converge a bit faster.
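To make the two knobs concrete, here is a minimal sketch of how separate learning rates for the embedding and the unfrozen model weights could be expressed as optimizer parameter groups. It assumes `base_lr` applies to the embedding and `model_lr` to the model weights; the helper `build_param_groups` and the placeholder parameter names are hypothetical and not taken from the repository.

```python
# Hypothetical sketch: one learning rate per parameter group for PTI-style tuning.
# model_lr and base_lr are the names used in this thread; everything else is assumed.

def build_param_groups(embedding_params, model_params, base_lr=1e-5, model_lr=1e-6):
    """Return a list of parameter groups, each with its own learning rate.

    The list-of-dicts layout ("params" plus per-group options such as "lr")
    is the format PyTorch optimizers accept for per-parameter options.
    """
    return [
        # Assumption: the learned embedding trains at base_lr.
        {"params": list(embedding_params), "lr": base_lr},
        # Assumption: the unfrozen model weights train at the lower model_lr.
        {"params": list(model_params), "lr": model_lr},
    ]


# Placeholder strings stand in for real tensors so the sketch runs without PyTorch.
groups = build_param_groups(["embedding"], ["unet.w1", "unet.w2"])
for group in groups:
    print(len(group["params"]), "params at lr", group["lr"])
```

With real tensors, a list like `groups` could be passed directly to an optimizer such as `torch.optim.AdamW(groups)`. Whether the repository wires its config values this way is an assumption here, so check the training code before relying on it.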
Thanks!
Thanks for such great work! I'm just curious about the overall performance of these two on SD. I've only done a few tries, but it feels like naive bipartite inversion (not the unfrozen one) does better with SD for me. Any experiences?