ml5js / ml5-library

Friendly machine learning for the web! 🤖
https://ml5js.org

Pix2pix improvements #198

Closed cvalenzuela closed 3 years ago

cvalenzuela commented 6 years ago

Following a conversation with @zaidalyafeai, it might be a good idea to change the current pix2pix version to his own implementation which runs faster: https://zaidalyafeai.github.io/pix2pix/cats.html

zaidalyafeai commented 6 years ago

@cvalenzuela, I trained using https://github.com/affinelayer/pix2pix-tensorflow/blob/master/pix2pix.py. I converted the result to tf.js but ran into some errors, so I created a similar Keras model, ported the weights into it directly, and then converted that model. To make predictions faster, I used fewer filters in each layer.

cvalenzuela commented 6 years ago

Aha! So you are training with just your custom model? And the weights are exported with the regular tensorflowjs_converter?

zaidalyafeai commented 6 years ago

Here, I explained the whole process. The pipeline looks like this:

Train in TensorFlow -> export the model as graph and weights -> create a Keras model -> load the weights into the Keras model -> convert the Keras model to tf.js

yining1023 commented 6 years ago

This is amazing! @zaidalyafeai. A lot faster!

The pix2pix in ml5.js works something like this: it is trained in TensorFlow, then the weights are loaded and the model is created in tf.js.

I guess having a keras model and converting it to tf.js will make things faster?

zaidalyafeai commented 6 years ago

@yining1023, it will be interesting to see if that is the case. In TensorFlow I used ngf = ndf = 32 for faster predictions. If you use the default of 64, the model will be bigger and predictions slower.

yining1023 commented 6 years ago

@zaidalyafeai That makes sense. Yes, I used the default 64. I will try ngf = ndf = 32 and see if the model is smaller and faster. Will keep you posted!

zaidalyafeai commented 6 years ago

@yining1023, BTW, are you using BatchNormalization with training = True? tf.js will calculate the predictions incorrectly in that case. I filed an issue a week ago. It has not been fixed, so I use my own bundle of the code, in which predictions are calculated correctly.
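For context, the two batch normalization modes can give very different outputs for the same input. The sketch below is plain JavaScript, not tf.js or Keras code, and it does not reproduce the reported bug. It only illustrates the distinction: with training = True the layer normalizes with statistics computed from the current input, while in inference mode it uses stored moving averages. The function names and sample values are made up for illustration, and the learned scale and offset are left out (treated as 1 and 0).

```javascript
// Mean of a list of numbers.
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Population variance (divide by N), the form batch normalization uses.
function variance(values) {
  const m = mean(values);
  return mean(values.map((v) => (v - m) ** 2));
}

// Training-style normalization: statistics come from the input itself.
function normalizeWithBatchStats(values, epsilon = 1e-5) {
  const m = mean(values);
  const std = Math.sqrt(variance(values) + epsilon);
  return values.map((v) => (v - m) / std);
}

// Inference-style normalization: statistics are stored moving averages.
function normalizeWithMovingStats(values, movingMean, movingVariance, epsilon = 1e-5) {
  const std = Math.sqrt(movingVariance + epsilon);
  return values.map((v) => (v - movingMean) / std);
}

// Made-up activations for a single channel.
const activations = [2, 4, 6, 8];

// Centered around 0 with unit variance.
console.log(normalizeWithBatchStats(activations));

// With moving averages still at mean 0 and variance 1 (an assumed state),
// the values pass through almost unchanged.
console.log(normalizeWithMovingStats(activations, 0, 1));
```

If a model was trained and exported with batch statistics, a port that silently switches to moving averages would produce different images, which is one way predictions could come out wrong.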

yining1023 commented 6 years ago

@zaidalyafeai Thanks for the heads up! I used tf.batchNormalization in here. I created the model in tf.js instead of using Keras, though.

I'm training a model with ngf = ndf = 32 now.

zaidalyafeai commented 6 years ago

@yining1023, oh I understand. You are using a completely different approach. BTW, we could make the model even faster by using depthwise separable convolutions. They are used in MobileNet for faster predictions.
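To give a feel for why this helps, here is a rough weight count for a single layer. It is only a sketch in plain JavaScript: the helper names are hypothetical, biases are ignored, a depth multiplier of 1 is assumed, and the 4x4 kernel with 256 input and output channels is an example layer shape, not a measured one.

```javascript
// Weights in a standard convolution: every output channel has a
// kernel x kernel filter over every input channel.
function standardConvWeights(kernel, inChannels, outChannels) {
  return kernel * kernel * inChannels * outChannels;
}

// Weights in a depthwise separable convolution (depth multiplier 1):
// one kernel x kernel filter per input channel, followed by a 1x1
// pointwise convolution that mixes the channels.
function separableConvWeights(kernel, inChannels, outChannels) {
  const depthwise = kernel * kernel * inChannels;
  const pointwise = inChannels * outChannels;
  return depthwise + pointwise;
}

const standard = standardConvWeights(4, 256, 256);   // 1048576
const separable = separableConvWeights(4, 256, 256); // 69632

console.log(standard, separable, (standard / separable).toFixed(1));
```

For this example layer the separable version has roughly 15 times fewer weights, which is where the speedup comes from, at the cost of less capacity per layer.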

yining1023 commented 6 years ago

@zaidalyafeai, I retrained the model with --ngf 32 --ndf 32 and got a much smaller model (13.6MB vs. 50MB) that also predicts faster. It took me 01:55:57 vs. 04:16:21!
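The size drop is about what a back-of-the-envelope weight count suggests. The sketch below is plain JavaScript and rests on assumptions: the layer table is a hypothetical 8-level U-Net generator with 4x4 kernels and skip connections, with channel widths expressed as multiples of ngf, and it counts convolution weights only (no biases, no batch normalization parameters, no discriminator). It is not read from the actual checkpoint.

```javascript
// Count convolution weights in an assumed pix2pix-style U-Net generator.
// Each entry is [inputChannels, outputChannels] for one 4x4 conv layer.
function generatorConvWeights(ngf, kernel = 4, imageChannels = 3) {
  const encoder = [
    [imageChannels, ngf],
    [ngf, ngf * 2],
    [ngf * 2, ngf * 4],
    [ngf * 4, ngf * 8],
    [ngf * 8, ngf * 8],
    [ngf * 8, ngf * 8],
    [ngf * 8, ngf * 8],
    [ngf * 8, ngf * 8],
  ];
  // Decoder inputs are doubled (except the innermost layer) because each
  // layer also receives the matching encoder output via a skip connection.
  const decoder = [
    [ngf * 8, ngf * 8],
    [ngf * 16, ngf * 8],
    [ngf * 16, ngf * 8],
    [ngf * 16, ngf * 8],
    [ngf * 16, ngf * 4],
    [ngf * 8, ngf * 2],
    [ngf * 4, ngf],
    [ngf * 2, imageChannels],
  ];
  return [...encoder, ...decoder].reduce(
    (total, [cin, cout]) => total + kernel * kernel * cin * cout,
    0
  );
}

const weights64 = generatorConvWeights(64); // 54404096
const weights32 = generatorConvWeights(32); // 13603328

console.log(weights64, weights32, (weights64 / weights32).toFixed(2));
```

Under these assumptions, halving ngf cuts the weight count by almost exactly four, because nearly every layer scales with ngf squared. That is in the same range as the 13.6MB vs. 50MB file sizes.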

This is the new model: code https://yining1023.github.io/pix2pix_tensorflowjs_lite/
This is the old model: https://yining1023.github.io/pix2pix_tensorflowjs

I also tested the two models with some test images, and the outputs are similar. Left: outputs from the new model. Right: outputs from the old model. [image: oldmodelnewmodel]

This works well. Thank you so much for your help!! @zaidalyafeai. I haven't tried using tf.depthwiseConv2d yet. Will try it later.

@cvalenzuela I will update the model in the ml5-examples and ml5-website soon!

zaidalyafeai commented 6 years ago

@yining1023, that is really cool. I tried tf.depthwiseConv2d but it didn't give me good results. The generated images were a bit blurry. I guess this is because the number of parameters is too small (around 1.3 million!).

cvalenzuela commented 6 years ago

looks great @yining1023. Besides updating the model, maybe we should work on using @zaidalyafeai's implementation. I'll look into it

zaidalyafeai commented 6 years ago

I have an idea, but I'm not sure it would work. We could create a universal model that works on 10 classes, for instance. The user doesn't have to switch between models to try different implementations. That might create a style transfer effect as well.

cvalenzuela commented 6 years ago

that would be nice to have as an experiment! the models should be small enough

zaidalyafeai commented 6 years ago

This implementation looks really cool

https://twitter.com/bgondouin/status/838658995351085056?s=19

yining1023 commented 6 years ago

Good idea! It will be cool to play with several different models on the same page. We should totally add it to the ml5 website.

I think the outputs from the two implementations are similar. Both can predict new images in real time while you are drawing (@zaidalyafeai's cat model feels a little faster to me, though). The model is ~13MB now. We could reduce the parameters even more, but we'll have to balance the prediction speed against the quality of the result.

Both implementations use the same TensorFlow script to train. After training, one creates a Keras model and converts it to tf.js; the other creates the model directly in tf.js.

I think we could try using tf.depthwiseConv2d, or separableConv2d like @nsthorat suggested before.

zaidalyafeai commented 6 years ago

@yining1023, I have read on Twitter that it is best practice to avoid using tf.depthwiseConv2d for the initial layers. Typically, we need to preserve the correlation between the RGB channels in the first layer. Maybe we should use them only in the decoder layers.

zaidalyafeai commented 6 years ago

BTW, datasets with instance segmentation give much better results. For instance, https://zaidalyafeai.github.io/pix2pix/faceSeg.html. This is because we give the network an extra condition (color). There are tools which provide automatic instance segmentation like this.

bomanimc commented 4 years ago

Hi @yining1023 @zaidalyafeai is there any remaining work required for this issue? Curious if this issue is ready to close!

bomanimc commented 3 years ago

Closing this issue for now, but please feel free to reopen if there are other changes we want to discuss here!