yusuketomoto / chainer-fast-neuralstyle

Chainer implementation of "Perceptual Losses for Real-Time Style Transfer and Super-Resolution".
MIT License
803 stars · 229 forks

real world application idea #45

Open markzhong88 opened 8 years ago

markzhong88 commented 8 years ago

Hi guys, I just started playing with fast neural style recently. It's amazing to get it working on my laptop with a GeForce 940M 2 GB GPU, although it's slow. I am thinking of turning this algorithm into a real-world application. Besides apps like Prisma, it seems we could easily create animated movies in the future with this technology. Any other thoughts about building real-world products?

BTW, have fun with modeling and generating.

rogozin70 commented 8 years ago

I started a server: http://photofuneditor.com/prisma-dreamscope-filters Add your filter models to the forum and I'll include them in the online processing.

markzhong88 commented 8 years ago

Really cool! I wonder how you built the server. Are you using an AWS GPU instance?

rogozin70 commented 8 years ago

GPU: GTX 970, 4 GB.

mendyz commented 8 years ago

I just tested it, and it's very fast! Well done. I'd love some more setup details. I've been considering that GPU card: what tweaks did you make, and what pitfalls did you run into? Or is that seriously an out-of-the-box result?

markzhong88 commented 8 years ago

Yeah, I tried it and it's fast. Very impressive that you only use a GTX 970 4 GB. Did you set up a home server? I'd like to know more details about your backend setup.

6o6o commented 8 years ago

@markz-nyc So what's the application? Transforming videos is nothing new and has been done numerous times before. Here's my recent attempt, which includes optical flow for frame blending: https://youtu.be/h0jH0bJIvcM

rogozin70 commented 8 years ago

I have installed CUDA and cuDNN. Everything was done according to the instructions. @6o6o I noticed new filters in your video. Would you mind sharing them?

6o6o commented 8 years ago

Yea, I can't, sorry. I don't own them, they were trained by other people and are used in an app.

markzhong88 commented 8 years ago

@rogozin70, I wonder how to turn that into a web application. I mean, the user uploads a photo on the web and then gets the stylized photo back. Do you mind sharing more detail about setting up such an application? For example, did you rent a server or use your own machine to do the processing?

rogozin70 commented 8 years ago

I think buying your own home server will be more cost-effective. Prices on Amazon are too high.

nandexsp commented 8 years ago

Hello @markz-nyc, I agree with @rogozin70: a server via AWS is too expensive, and you could buy your own server and do the computation locally.

markzhong88 commented 8 years ago

@nandexsp, yeah, I am currently working on setting up my own server. I have zero server-side knowledge, so I'm researching frameworks like Django and Flask now.
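For anyone researching the same thing, the upload-a-photo, get-a-stylized-photo flow being discussed can be sketched in a few lines of Flask. Everything here is a hypothetical sketch: `stylize` is a placeholder, and a real server would load the trained Chainer generator once at startup and run inference inside it.

```python
# Minimal sketch: POST an image to /stylize, receive the stylized image back.
# `stylize` is a hypothetical placeholder, not the repo's actual code.
import io

from flask import Flask, request, send_file

app = Flask(__name__)


def stylize(image_bytes: bytes) -> bytes:
    # Placeholder: a real implementation would decode the image, run it
    # through the style-transfer network, and re-encode the result.
    # Here it simply echoes the input so the route is testable.
    return image_bytes


@app.route("/stylize", methods=["POST"])
def stylize_endpoint():
    upload = request.files["image"]  # multipart form field named "image"
    result = stylize(upload.read())
    return send_file(io.BytesIO(result), mimetype="image/jpeg")
```

Run it with `flask run` for development, or behind a WSGI server like gunicorn in production; since the work is GPU-bound, one worker per GPU is a sensible starting point.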

mxchinegod commented 8 years ago

@markz-nyc I made exactly what you're talking about in my repository.

https://github.com/DylanAlloy/NeuralStyle-WebApp

I also disagree with what people are saying about AWS. I'm using a compute instance as we speak and it works just fine at $1 per 3 hours. Not a bad price if you're making money with it or if you want to keep using it.

Additionally there are cheaper instances but they will take longer to process an image.

Building your own server is bad advice for a long-term solution as you will have 0 options for load balancing a popular server. The best you can do is buy multiple computers for different tasks like training, hosting, computing, etc.

rogozin70 commented 8 years ago

AWS at $1 per 3 hours works out to $8 per day, or $240 per month. A home server (Core i3 + GTX 970) costs about $800, so roughly 3 months of AWS = 1 server.
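That break-even claim is easy to check with the prices quoted above (assuming the instance runs 24/7):

```python
# Break-even between the quoted AWS price ($1 per 3 hours, running 24/7)
# and a one-time home server purchase (i3 + GTX 970, ~$800).
aws_per_day = 24 / 3 * 1.0          # $8 per day
aws_per_month = aws_per_day * 30    # $240 per month
home_server = 800.0

months_to_break_even = home_server / aws_per_month
print(aws_per_month, round(months_to_break_even, 1))  # 240.0 3.3
```

So the home server pays for itself in just over 3 months of continuous AWS usage, matching the estimate above.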

mxchinegod commented 8 years ago

@rogozin70 Completely unscalable: it runs off your own internet connection, which will bottleneck everything else, and it will eventually wear out your router unless you buy a good one; you might as well buy business internet, etc.

If you're looking to host a website that a lot of people will use to run an algorithm that can take up to 30 GB of RAM, you're going to need more than a home server. You should start with 16 GB for medium-sized cell phone photos (unless you don't mind being limited to one 500x500 photo at a time with a 970). My code in the repository I linked will resize images to 500x500 by default to mitigate this.
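That resize mitigation amounts to capping the longer side while preserving aspect ratio. A framework-agnostic sketch (the repo's actual resizing logic may differ; `fit_within` is an illustrative helper, not from the codebase):

```python
def fit_within(width: int, height: int, max_side: int = 500) -> tuple:
    """Scale (width, height) down so the longer side is at most max_side,
    preserving aspect ratio. Images already small enough are untouched."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return round(width * scale), round(height * scale)


# A typical 12 MP phone photo shrinks to the 500px budget:
print(fit_within(4032, 3024))  # (500, 375)
```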

(screenshot: memory usage while generating an image)

That, times 10 people, on a home server? I don't think so. It takes 2 GB for a 500x500px photo, by the way. 2 GB x 10 = about $130 worth of RAM (and it still won't be good enough later).

A server from Amazon good enough to do one image at a time is about $0.10 per hour, i.e. $2.40 per day, and it can be scaled. It takes about 1 minute to do a 500x500 image on such a server, and it would be easy to scale using AWS. You could sell ad space that makes you money at that rate.
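At those quoted rates, the per-image cost works out to a fraction of a cent:

```python
# Quoted above: ~$0.10/hour instance, ~1 minute per 500x500 image.
hourly = 0.10
minutes_per_image = 1
cost_per_image = hourly / 60 * minutes_per_image

print(f"${cost_per_image:.4f} per image")  # $0.0017 per image
print(f"${hourly * 24:.2f} per day")       # $2.40 per day
```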

markzhong88 commented 8 years ago

Right now it seems to take me 27 hours to train a model. My room temp is too high; it seems I have to move the machine into the fridge, lol.

markzhong88 commented 8 years ago

@DylanAlloy Nice repo, I will take a look when I get a chance. Yeah, AWS is a better choice if you consider it a real product rather than a hobby.

nandexsp commented 8 years ago

You have a nice repo, @DylanAlloy. Are you training with epoch 2, or higher?

mxchinegod commented 8 years ago

@nandexsp @markz-nyc Thanks, yeah, I am using 2 epochs. Right now it's taking 73 GB of RAM to generate a 4K image (still climbing). I'm hoping it doesn't go over 122 GB; that's all I have for now...

nandexsp commented 8 years ago

Great Work man!

markzhong88 commented 8 years ago

Prisma has already made neural style work in offline mode. How can the iPhone get such good results in a few seconds without using a GPU?

mxchinegod commented 8 years ago

Probably because it's based on C code rather than Python.

xpeng commented 8 years ago

@DylanAlloy Wow, great work! You generated such a large image! Does the applied style (or style scale) look similar to what you get on a smaller image, for example 720 pixels?

6o6o commented 8 years ago

@DylanAlloy I think it's good old Python. The key action here is downsampling. The original implementation already does this twice using strided convolutions, so when you process a 512x512 image, the actual transformation is applied to a 128x128 image. I guess Prisma took this a bit further by increasing the stride or adding another pair of convolution/deconvolution layers, which downsample more, saving memory, increasing computation speed and, lastly, enlarging the style features.
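That downsampling arithmetic can be checked with the standard convolution output-size formula. The kernel size 3 and padding 1 below are assumptions matching typical transform nets, not read from Prisma's code:

```python
def conv_out(size: int, kernel: int = 3, stride: int = 2, pad: int = 1) -> int:
    """Output spatial size of a convolution:
    (size + 2*pad - kernel) // stride + 1."""
    return (size + 2 * pad - kernel) // stride + 1


size = 512
for _ in range(2):        # two stride-2 convolutions, as in the original net
    size = conv_out(size)
print(size)               # 128: a 512x512 input is transformed at 128x128

# A third downsampling pair, as speculated for Prisma, would halve it again:
print(conv_out(size))     # 64
```

Each halving of the working resolution cuts the feature-map memory and compute by roughly 4x, which is why an extra downsampling pair is such an effective speed/memory lever.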

markzhong88 commented 8 years ago

@6o6o Would you want to do this as a freelance job, helping me set those things up?

6o6o commented 8 years ago

Thanks, but I'm already busy trying various experiments on anything I can think of. I'm not really an expert in the field, so it's largely trial and error, and most attempts end up futile. Nevertheless, I'm super excited about the technology and will keep you guys informed of any success stories, if any occur :)

rupeshs commented 8 years ago

Check out my NeuralStyler app. Turn your videos/photos/GIFs into art: http://neuralstyler.com/ https://github.com/rupeshs/neuralstyler

nandexsp commented 8 years ago

It's a good app, @rupeshs. You could check out apps like Prisma or Pixify. The latter has video on iOS and more customization options.