yaseryacoob opened this issue 6 months ago
Hey @yaseryacoob, nice observation. Yes, we train on 224x224 and haven't yet explored generalization across input resolutions. We'll make the training code public soon, once we work out a few remaining kinks and clean up the APIs, so that you can train versions better suited to higher resolutions. Thanks for your patience.
Thanks for the update. While I wait, can you please indicate what training time and resources were needed? For example, given a standard dataset like CELEBHQ or FFHQ, with 30-70K images at 1Kx1K, would training even be feasible in your experience?
This is not meant to detract from your work; it is useful as is.
Can you share the code you used when trying the higher resolutions? Thanks.
I am using the DINOv2 upsampling and noticed serious degradation as resolution increases: at low and medium resolutions (up to 784) the results are reasonable, but at 1K or 1200 the output is weak. My guess is that since you trained at the relatively low resolution of 224x224, the model doesn't scale up. Here is a simple example; you can see some kind of wave artifact coming through the upsampling. Any insights or solutions?
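For context on why a 224x224-trained ViT can misbehave at 1K inputs: the patch positional embeddings are learned for a fixed grid, and one common workaround (used by several ViT codebases, not necessarily this repo) is to resample them to the new grid size before inference. This is a minimal numpy sketch of that idea; `interpolate_pos_embed` and the grid sizes are illustrative, not this project's API.

```python
import numpy as np

def interpolate_pos_embed(pos_embed: np.ndarray, old_grid: int, new_grid: int) -> np.ndarray:
    """Bilinearly resample (old_grid*old_grid, dim) patch positional
    embeddings to (new_grid*new_grid, dim) for a larger input resolution."""
    dim = pos_embed.shape[1]
    grid = pos_embed.reshape(old_grid, old_grid, dim)
    # Target sample coordinates expressed in the source grid.
    xs = np.linspace(0.0, old_grid - 1, new_grid)
    x0 = np.floor(xs).astype(int)
    x1 = np.minimum(x0 + 1, old_grid - 1)
    w = (xs - x0)[:, None]                      # fractional weights, shape (new_grid, 1)
    # Separable bilinear: interpolate along rows, then along columns.
    rows = grid[x0] * (1 - w[..., None]) + grid[x1] * w[..., None]      # (new, old, dim)
    out = rows[:, x0] * (1 - w[None]) + rows[:, x1] * w[None]           # (new, new, dim)
    return out.reshape(new_grid * new_grid, dim)

# Example: a 224x224 model with patch size 14 has a 16x16 grid; a 1204x1204
# input would need an 86x86 grid (sizes here are just for illustration).
pe_224 = np.random.randn(16 * 16, 384)
pe_1k = interpolate_pos_embed(pe_224, old_grid=16, new_grid=86)
print(pe_1k.shape)  # (7396, 384)
```

This only fixes the positional-embedding mismatch; it does not make features trained at 224x224 scale-invariant, so some degradation at 1K can remain even with correct resampling.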