@cvalenzuela, I trained on https://github.com/affinelayer/pix2pix-tensorflow/blob/master/pix2pix.py. I converted it to tf.js but faced some errors, so I created a similar Keras model, ported the weights directly, and then converted that model. To make predictions faster, I used a smaller number of filters in each layer.
Aha! So you are training just using your custom model? And the weights export is using the regular tensorflowjs_converter?
I explained the whole process here. The pipeline looks like this:
Train in TensorFlow -> export the model as a graph and weights -> create a Keras model -> load the weights into the Keras model -> convert the Keras model to tf.js
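For reference, a rough sketch of the last two steps, assuming the Keras layers were built to mirror the TensorFlow graph. The builder, checkpoint path, and layer-to-variable mapping below are all placeholders for whatever your graph uses:

```python
import tensorflow as tf
import tensorflowjs as tfjs

# Hypothetical: a Keras model built to mirror the pix2pix generator graph.
model = build_keras_generator(ngf=32)  # placeholder builder

# Read the variables straight out of the TensorFlow checkpoint.
reader = tf.train.load_checkpoint("checkpoint_dir")  # placeholder path

# layer_to_vars: hypothetical {keras_layer_name: [checkpoint_var_names]} map.
for layer_name, var_names in layer_to_vars.items():
    weights = [reader.get_tensor(v) for v in var_names]
    model.get_layer(layer_name).set_weights(weights)

# Convert the weight-loaded Keras model for tf.js.
tfjs.converters.save_keras_model(model, "web_model/")
```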
This is amazing, @zaidalyafeai! A lot faster!
The pix2pix in ml5.js works something like this: train in TensorFlow -> load the weights in tf.js and create the model directly in tf.js.
I guess having a Keras model and converting it to tf.js will make things faster?
@yining1023, it will be interesting if this is the case. In TensorFlow I used ngf = ndf = 32 for faster predictions. If you use the default 64, the model will be bigger and predictions slower.
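For context: in the affinelayer script, ngf/ndf set the width of the first generator/discriminator layer and the deeper layers are multiples of it, so halving them shrinks the whole network. Roughly, if I'm reading the layer_specs right:

```python
# Generator encoder widths scale off ngf in the affinelayer script.
ngf = 32  # instead of the default 64
encoder_filters = [ngf, ngf * 2, ngf * 4] + [ngf * 8] * 5
print(encoder_filters)  # [32, 64, 128, 256, 256, 256, 256, 256]
```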
@zaidalyafeai That makes sense. Yes, I used the default 64. I will try ngf = ndf = 32 and see if the model gets smaller and faster. Will keep you posted!
@zaidalyafeai Thanks for the heads up! I used tf.batchNormalization here. I created the model in tf.js instead of using Keras, though. I'm training a model with ngf = ndf = 32 now.
@yining1023, oh I understand. You are using a completely different approach. BTW, we could make the model even faster by using depthwise separable convolutions. They are used in MobileNet for faster predictions.
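To see why: a quick parameter count for a single 4x4 convolution (just the arithmetic, not any particular model):

```python
# One 4x4 convolution, 128 -> 256 channels.
k, c_in, c_out = 4, 128, 256

standard = k * k * c_in * c_out          # 524,288 parameters
# Depthwise separable = per-channel k x k depthwise conv
# followed by a 1x1 pointwise conv across channels.
separable = k * k * c_in + c_in * c_out  # 2,048 + 32,768 = 34,816

print(standard / separable)  # ~15x fewer parameters
```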
@zaidalyafeai, I retrained the model with --ngf 32 --ndf 32. I got a much smaller model (13.6 MB vs. 50 MB), and it predicts faster. Training took 01:55:57 vs. 04:16:21!
This is the new model: https://yining1023.github.io/pix2pix_tensorflowjs_lite/
This is the old model: https://yining1023.github.io/pix2pix_tensorflowjs
I also tested the two models with some test images; the outputs are similar. (Left: outputs from the new model. Right: outputs from the old model.)
This works well. Thank you so much for your help, @zaidalyafeai!! I haven't tried tf.depthwiseConv2d yet. Will try it later.
@cvalenzuela I will update the model in the ml5-examples and ml5-website soon!
@yining1023, that is really cool. I tried tf.depthwiseConv2d, but it didn't give me good results; the generated images were a bit blurry. I guess this is because the number of parameters is too small (around 1.3 million!).
Looks great, @yining1023. Besides updating the model, maybe we should work on using @zaidalyafeai's implementation. I'll look into it.
I have an idea, but I'm not sure if it would work. We could create a universal model that works on 10 classes, for instance. The user wouldn't have to switch between models to try different classes. That might create a style transfer effect as well.
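One common way to condition a single generator like that (just a sketch of the idea, untested here) is to append a one-hot class map as extra input channels:

```python
import numpy as np

# Hypothetical conditioning: concatenate a one-hot "class map" to the
# input image so one generator can serve several categories.
def add_class_channels(image, class_id, num_classes=10):
    h, w, _ = image.shape
    onehot = np.zeros((h, w, num_classes), dtype=image.dtype)
    onehot[:, :, class_id] = 1.0
    return np.concatenate([image, onehot], axis=-1)

x = add_class_channels(np.zeros((256, 256, 3), np.float32), class_id=4)
print(x.shape)  # (256, 256, 13): the first conv just takes 13 channels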
That would be nice to have as an experiment! The models should be small enough.
This implementation looks really cool: https://twitter.com/bgondouin/status/838658995351085056?s=19
Good idea! It will be cool to play with several different models on the same page. We should totally add it to the ml5 website.
I think the outputs from the two implementations are similar. They can both predict new images while you're drawing in real time. (@zaidalyafeai's cat model feels a little faster to me, though.) The model is ~13 MB now. We could reduce the parameters even more, but we'll have to balance prediction speed against the quality of the results.
Both implementations use the same TensorFlow script to train. After training, one creates a Keras model and converts it to tf.js; the other creates the model directly in tf.js.
I think we could try tf.depthwiseConv2d or tf.separableConv2d, like @nsthorat suggested before.
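On the Keras side that would be close to a drop-in swap. A small sketch comparing parameter counts for one encoder-style layer (the numbers include biases, so they match the arithmetic above plus a 256-unit bias):

```python
from tensorflow import keras

def encoder_block(separable=False):
    Conv = keras.layers.SeparableConv2D if separable else keras.layers.Conv2D
    inp = keras.Input((256, 256, 128))
    out = Conv(256, 4, strides=2, padding="same")(inp)
    return keras.Model(inp, out)

print(encoder_block(False).count_params())  # 524,544 with Conv2D
print(encoder_block(True).count_params())   # 35,072 with SeparableConv2D
```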
@yining1023, I have read on Twitter that it is best practice to avoid using tf.depthwiseConv2d for the initial layers. Typically, we need to preserve the correlation between the RGB channels in the first layer. Maybe we should use them only in the decoder layers.
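Since the pix2pix decoder normally uses transposed convolutions, one untested way to restrict separable convs to the decoder is upsample-then-separable-conv, keeping regular Conv2D in the encoder (especially the first layer, which sees raw RGB):

```python
from tensorflow import keras

# Hypothetical decoder block: nearest-neighbor upsampling followed by a
# separable conv, standing in for Conv2DTranspose. Encoder layers keep
# regular Conv2D so the first layer can mix the RGB channels.
def decoder_block(x, filters):
    x = keras.layers.UpSampling2D(2)(x)
    x = keras.layers.SeparableConv2D(filters, 4, padding="same")(x)
    x = keras.layers.BatchNormalization()(x)
    return keras.layers.ReLU()(x)
```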
BTW, datasets with instance segmentation give much better results. For instance: https://zaidalyafeai.github.io/pix2pix/faceSeg.html. This is because we give the network an extra condition (color). There are tools that provide automatic instance segmentation, like this.
Hi @yining1023 @zaidalyafeai, is there any remaining work required for this issue? Curious if it's ready to close!
Closing this issue for now, but please feel free to reopen if there are other changes we want to discuss here!
Following a conversation with @zaidalyafeai, it might be a good idea to change the current pix2pix version to his own implementation, which runs faster: https://zaidalyafeai.github.io/pix2pix/cats.html