lltcggie / waifu2x-caffe

Caffe version of waifu2x
MIT License
8.04k stars 839 forks

about increasing waifu2x-caffe speed #18

Open liunkily opened 8 years ago

liunkily commented 8 years ago

For users who have two or more GPUs working at the same time, I think waifu2x-caffe could work this way: the image is divided into two parts and each GPU processes one part separately. Finally, waifu2x-caffe combines the two processed parts and outputs the result.

lltcggie commented 8 years ago

I will look into it, but multi-GPU support is tied up with various circumstances on my side, so please don't expect too much…

liunkily commented 8 years ago

I see. If the image is simply split into two parts, the edge where the two images are joined will look ugly. Each part therefore needs to be larger than 1/2 of the original image (copy the original image, then crop it to a little more than 1/2; do the same for the other part). Finally, crop the overlapping edge from each processed part and combine them. That would remove the problem above. Last but not least, I'm not good at Japanese, so I may have misunderstood your words.

OK, this is only my idea. I don't understand the specifics of how multi-GPU processing would be implemented. Thank you for your work.
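
The overlap-and-crop idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not how waifu2x-caffe works internally: `fake_upscale` is a hypothetical stand-in for the model (a 3-tap horizontal mean filter followed by nearest-neighbour scaling), chosen because, like a convolutional network, its output at each pixel depends on neighbouring pixels. `upscale_in_two_parts` and its `overlap` parameter are likewise made-up names. In a real multi-GPU setup the two `fake_upscale` calls would run on different devices.

```python
def fake_upscale(img, scale=2):
    """Hypothetical stand-in for the model (NOT waifu2x itself).

    Applies a 3-tap horizontal mean filter with edge clamping, then
    nearest-neighbour scaling. `img` is a list of rows of integers.
    """
    w = len(img[0])
    blurred = [
        [(row[max(x - 1, 0)] + row[x] + row[min(x + 1, w - 1)]) // 3
         for x in range(w)]
        for row in img
    ]
    out = []
    for row in blurred:
        wide = [v for v in row for _ in range(scale)]
        out.extend(list(wide) for _ in range(scale))
    return out


def upscale_in_two_parts(img, overlap=2, scale=2):
    """Split into left/right parts that each extend past the midpoint,
    upscale them separately, crop the overlap and join the results."""
    w = len(img[0])
    mid = w // 2
    left_end = min(mid + overlap, w)      # left part is wider than 1/2
    right_start = max(mid - overlap, 0)   # right part is wider than 1/2
    left = [row[:left_end] for row in img]
    right = [row[right_start:] for row in img]

    # These two calls are independent, so each could go to its own GPU.
    left_up = fake_upscale(left, scale)
    right_up = fake_upscale(right, scale)

    # Keep only the columns each part is responsible for.
    keep_left = mid * scale
    drop_right = (mid - right_start) * scale
    return [l[:keep_left] + r[drop_right:] for l, r in zip(left_up, right_up)]


image = [[x * 10 + y for x in range(6)] for y in range(4)]
whole = fake_upscale(image)
with_overlap = upscale_in_two_parts(image, overlap=2)
without_overlap = upscale_in_two_parts(image, overlap=0)
print(with_overlap == whole)      # seam matches the single-pass result
print(without_overlap == whole)   # seam differs when there is no overlap
```

With this stand-in, one pixel of overlap is already enough because the filter only reads one neighbour on each side. For a real network the overlap would presumably need to cover however much surrounding context the model reads.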

chungexcy commented 8 years ago

cuDNN v5 RC has been released. 3x3 convolution may have some improvements.

leilei- commented 8 years ago

Maybe integrating and updating the older GLSL port would help those who can't use CUDA (e.g. AMD video hardware users)? https://github.com/ueshita/waifu2x-converter-glsl

(which only has a Y model I believe)

chungexcy commented 8 years ago

For people who cannot use CUDA, there is a well-implemented version that supports OpenCL on AMD GPUs, although its models need to be updated from the original waifu2x by nagadomi. https://github.com/tanakamura/waifu2x-converter-cpp

lltcggie commented 8 years ago

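https://twitter.com/ultraistter/status/720659986108420097 As the author of waifu2x also mentioned in this tweet, using Winograd made things slower rather than faster. Since it doesn't bring any speedup, and I don't have enough debugging time to integrate it into waifu2x-caffe, please wait a while for cuDNN v5 RC support.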

MaverickTse commented 7 years ago

Since I see quite a lot of users using waifu2x for video or image sequences, would it be possible to:

cache source and upsized tiles for image N
↓
read image N+1
↓
look for changed tiles in N+1
↓
reuse the cached upsized tiles from N and upsize only the changed tiles in N+1
↓
update cache
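
The steps above could be sketched like this. It is a minimal illustration only, under several assumptions: `TileCache`, `split_tiles` and `nearest_upscale` are hypothetical names, `nearest_upscale` is a stand-in for the real model, tiles are compared for exact equality, and each tile is upscaled without any surrounding context.

```python
def nearest_upscale(tile, scale=2):
    """Hypothetical stand-in for the model: nearest-neighbour scaling."""
    out = []
    for row in tile:
        wide = tuple(v for v in row for _ in range(scale))
        out.extend(wide for _ in range(scale))
    return tuple(out)


def split_tiles(img, tile):
    """Yield ((y, x), tile_pixels) for each tile of a list-of-rows image."""
    h, w = len(img), len(img[0])
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            yield (y, x), tuple(tuple(row[x:x + tile]) for row in img[y:y + tile])


class TileCache:
    """Keeps the source and upsized tile of the previous frame per position."""

    def __init__(self, upscale, tile=4):
        self.upscale = upscale
        self.tile = tile
        self.cache = {}          # (y, x) -> (source_tile, upsized_tile)
        self.upscaled_count = 0  # how many tiles were actually upsized

    def process(self, img):
        result = {}
        for pos, src in split_tiles(img, self.tile):
            cached = self.cache.get(pos)
            if cached is None or cached[0] != src:
                # Tile is new or changed: upsize it and update the cache.
                self.upscaled_count += 1
                cached = (src, self.upscale(src))
                self.cache[pos] = cached
            result[pos] = cached[1]
        return result


frame_n = [[0] * 8 for _ in range(8)]
frame_n1 = [row[:] for row in frame_n]
frame_n1[5][6] = 255             # one pixel changes, inside tile (4, 4)

cache = TileCache(nearest_upscale, tile=4)
cache.process(frame_n)
after_first = cache.upscaled_count
out = cache.process(frame_n1)
after_second = cache.upscaled_count
print(after_first, after_second)
```

Exact equality is a deliberate simplification. Frames from compressed video usually differ slightly everywhere, so a practical version would probably need a difference threshold, and tiles would need some padding so that a change in one tile also refreshes the borders of its neighbours.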

GrayWolfson commented 7 years ago

Would it be possible to use Intel's clDNN library (https://01.org/cldnn) in waifu2x-caffe to accelerate the calculations?