nagadomi / nunif

Misc; latest version of waifu2x; 2D video to stereo 3D video conversion
MIT License

Are the three models (conv7, cunet, swin) trained with the same training dataset? #6

Closed kimwao closed 1 year ago

kimwao commented 1 year ago

Thanks for your excellent work! I want to verify whether increasing the number of network parameters improves performance, so I need to determine whether the models were trained on the same dataset. I would be very grateful if you could tell me!

nagadomi commented 1 year ago

About pretrained models,

cunet/ and upconv_7/ are converted to PyTorch from the original waifu2x pretrained models. In other words, they were trained with the original waifu2x code and the dataset available at the time of their release; this was done for compatibility. Only swin_unet/ is trained with this repository and the latest dataset.

I have also tried training upconv_7 and upcunet with this repository and the latest dataset. (These are not included in the release. Also, I have changed the code even in the last few days, so it is not exactly the same.)

Benchmark result (no TTA):

```
python3 -m waifu2x.benchmark --model-dir ${MODEL_DIR} --method scale -i /data/dataset/eval --filter catrom --color y_matlab
```

| arch | trained with nunif | trained with the original waifu2x |
| --- | --- | --- |
| upconv_7 | 37.6632 | 37.0478 |
| upcunet | 39.9458 | 39.458 |
| swin_unet_2x | 40.3235 | N/A |

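For context, the scores above are PSNR measured on the luma channel (`--color y_matlab`). A minimal sketch of how such a score can be computed, assuming MATLAB-style BT.601 luma and float RGB input in [0, 1] (this is an illustration, not the repo's actual benchmark code):

```python
import numpy as np

def rgb_to_y_matlab(img):
    # img: float RGB array in [0, 1]; returns MATLAB-style luma in [16, 235]
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    return 16.0 + 65.481 * r + 128.553 * g + 24.966 * b

def psnr_y(reference, upscaled):
    # Peak signal-to-noise ratio between the luma channels of two images,
    # with a peak value of 255 (8-bit range)
    y_ref = rgb_to_y_matlab(reference)
    y_up = rgb_to_y_matlab(upscaled)
    mse = np.mean((y_ref - y_up) ** 2)
    return 10.0 * np.log10(255.0 ** 2 / mse)
```

In a benchmark like the one above, `reference` would be the original image and `upscaled` the model output on a catrom-downscaled copy; higher PSNR means the output is closer to the reference.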
xurich-xulaco commented 1 year ago

I don't know if someone else has already asked this, or if I just got lost in the documentation, but it doesn't seem easy to find the new dataset or the differences between the original and the updated one. Is there a quick link to verify this?

nagadomi commented 1 year ago

I use a hybrid of a self-crawled dataset and a synthetic dataset (fonts and dot patterns), which I change from time to time on a whim, and I have never described it in detail.

ladichanjp commented 1 year ago

Are upconv_7, upcunet, and swin_unet_2x distributed as .pth files, or can they be ported? Do they work like unlimited:waifu2x (waifu2x.udp)? waifu2x could be the best upscaler for anime, better than Real-ESRGAN anime6B, the animevideo models, or v3. WaifuXL claimed a year ago that "Real-ESRGAN will outperform the models used on [waifu2x]", but I don't think that's true; your upscalers with deep convolutional neural networks have better potential. Upscaling and upscalers are becoming the most important part of AI-generated images, alongside SwinIR, ESRGAN, SCUNet, and HAT, and you can load a .pth to upscale the encode/decode/VAE stage; pkuliyi2015's multidiffusion extension is a good example. You could create an extension for tools like AUTOMATIC1111, or release arch checkpoints with nunif/waifu2x for testing. The yu45020/Waifu2x checkpoints are outdated, antonpaquin/waifu2x-cunet-pytorch is outdated too, and emulated waifu2x .pth files don't work well. I really think this upscaler is the future.

xurich-xulaco commented 1 year ago

@ladichanjp The closest thing to a compendium of anime upscaling is vs-mlrt, which covers everything from the traditional waifu2x cunet and upconv_7 and Real-ESRGAN to the recent waifu2x swin_unet and Real-CUGAN. The models are compiled through ports to ONNX, which allows them to use traditional CUDA acceleration or to take advantage of Vulkan or RT cores. The big disadvantage is that you must know at least a little Python and VapourSynth, which are the basis of the whole operation.
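Such ONNX ports can also be driven directly from Python with onnxruntime. A minimal sketch, where the model filename and the NCHW float32 input/output convention are assumptions (check the vs-mlrt model zoo for the actual names and conventions):

```python
import numpy as np

def to_nchw(img):
    # HWC float image in [0, 1] -> NCHW float32 batch of one
    return np.transpose(img, (2, 0, 1))[np.newaxis].astype(np.float32)

def from_nchw(batch):
    # NCHW batch of one -> HWC image
    return np.transpose(batch[0], (1, 2, 0))

def upscale(model_path, img):
    # Requires onnxruntime; "model_path" would point at one of the
    # vs-mlrt ONNX exports (hypothetical here). Swap in
    # "CUDAExecutionProvider" if a GPU build is installed.
    import onnxruntime as ort
    sess = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    (out,) = sess.run(None, {sess.get_inputs()[0].name: to_nchw(img)})
    return from_nchw(out)
```

The pre/post-processing helpers are model-agnostic; only the tensor layout and value range depend on the specific export.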

ladichanjp commented 1 year ago

> @ladichanjp The closest thing to a compendium of anime upscaling is vs-mlrt, which covers everything from the traditional waifu2x cunet and upconv_7 and Real-ESRGAN to the recent waifu2x swin_unet and Real-CUGAN. The models are compiled through ports to ONNX, which allows them to use traditional CUDA acceleration or to take advantage of Vulkan or RT cores. The big disadvantage is that you must know at least a little Python and VapourSynth, which are the basis of the whole operation.

Thank you so much, this is very interesting. AUTOMATIC1111's ONNX pipeline is not ready; is there a way to port that to a .pth format? VapourSynth looks amazing; it's going to be essential for upscaling txt2video.

xurich-xulaco commented 1 year ago

@ladichanjp I realized I forgot that there is already a GUI front end for the whole vs-mlrt stack, if you are interested: enhancr. About going back to the original .pth files, I think you might have to dig into every project. Regarding waifu2x, nagadomi already offers pretrained models in the .pth format you are looking for. But unless you are already into Python and messing with AI libraries, I think you will be fine with enhancr.

ladichanjp commented 1 year ago

> @ladichanjp I realized I forgot that there is already a GUI front end for the whole vs-mlrt stack, if you are interested: enhancr. About going back to the original .pth files, I think you might have to dig into every project. Regarding waifu2x, nagadomi already offers pretrained models in the .pth format you are looking for. But unless you are already into Python and messing with AI libraries, I think you will be fine with enhancr.

Thank you so much, xurich-xulaco and nagadomi; this will help me with my research.

nagadomi commented 1 year ago

If you are using the Stable Diffusion Web UI, you should be able to run the command-line interface or web UI of this repo. If you want it to work in Colab, see #9 .
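For reference, the repo's tools follow the same `python3 -m` module convention as the benchmark command earlier in the thread. A hedged sketch of typical invocations, with hypothetical file paths; exact module and flag names may differ between versions, so check the repo README:

```shell
# Upscale a single image 2x (paths are placeholders)
python3 -m waifu2x.cli --method scale -i input.png -o output_2x.png

# Denoise and upscale in one pass
python3 -m waifu2x.cli --method noise_scale --noise-level 2 -i input.png -o output.png

# Launch the local web UI
python3 -m waifu2x.web
```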

Also, I have recently been working on a model for photos. It is also designed for photorealistic generated art like Stable Diffusion output, so if it works out, I may develop an extension for SD. However, that will not be right away, so if someone else wants to make one, they can.