google / TensorNetwork

A library for easy and efficient manipulation of tensor networks.

Question: How do I use TensorNetworkLayer to replace Dense Layers on Trained Model w/o re-train? #911

Open shun-lin opened 3 years ago

shun-lin commented 3 years ago

Hi!

I have a saved_model with a few fully-connected layers (fc_layers). Is it possible to transform the weights already learned in those fc_layers into weights for tn_layers, or do I have to retrain the tn_layers while freezing the other layers? Thanks!

Sincerely, Shun Lin

mganahl commented 3 years ago

Hi @shun-lin sorry for the late reply! Let me rope in @jacksonwb on this!

jacksonwb commented 3 years ago

Hi @shun-lin, you can extract the weights from an fc_layer in your saved_model, perform a decomposition, and apply the properly decomposed weights to a tn_layer with the standard Keras set_weights method. From there you can perform inference or fine-tuning. However, we do not currently have an implementation of decomposition methods from a dense-layer kernel to each tn_layer kernel, so you would have to do it manually. But that would be a nice addition!
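
A minimal sketch of that manual route, assuming a square 1024x1024 Dense kernel decomposed into two MPO-like cores via a truncated SVD. The model name, layer names, reshape factors, and bond dimension are all placeholders, and the weight layout the final set_weights call expects depends on the particular tn_layer implementation:

```python
import numpy as np

# Hypothetical names: `trained_model` is the loaded Keras model, "fc_1" the
# Dense layer being replaced, and `tn_layer` the two-core replacement layer.
dense_layer = trained_model.get_layer("fc_1")
kernel, bias = dense_layer.get_weights()  # kernel shape: (1024, 1024)

# Split each Dense index into two smaller ones:
# (1024, 1024) -> (32, 32, 32, 32), giving indices (i1, i2, o1, o2).
w = kernel.reshape(32, 32, 32, 32)

# Regroup as (i1, o1, i2, o2) and matricize so one SVD splits the two cores.
w = w.transpose(0, 2, 1, 3).reshape(32 * 32, 32 * 32)

# Truncated SVD: keep only the `bond_dim` largest singular values.
u, s, vh = np.linalg.svd(w, full_matrices=False)
bond_dim = 64
core_1 = (u[:, :bond_dim] * s[:bond_dim]).reshape(32, 32, bond_dim)  # (i1, o1, D)
core_2 = vh[:bond_dim, :].reshape(bond_dim, 32, 32)                  # (D, i2, o2)

# Apply with the standard Keras API; the list of arrays and their shapes
# must match whatever tn_layer.get_weights() returns.
tn_layer.set_weights([core_1, core_2, bias])
```

Contracting core_1 and core_2 over the bond index reproduces the original kernel exactly when bond_dim equals the full rank (1024 here) and approximates it for smaller values, which is where the compression comes from.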

Doing the decomposition manually can be relatively straightforward or somewhat complex depending on the layer shape you are decomposing to. I would probably recommend you freeze the other layers and train the tn_layer from scratch (see the sketch below), unless you are particularly interested in approximating layers specifically trained as fully-connected.
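
For completeness, here is a hedged sketch of that freeze-and-retrain route using the standard Keras trainable flag; the layer name, optimizer, loss, and training data are placeholders:

```python
# Assuming `model` is the loaded Keras model and "tn_replacement" is the
# name given to the newly inserted tn_layer.
for layer in model.layers:
    layer.trainable = layer.name == "tn_replacement"  # train only the tn_layer

# Recompile after changing `trainable` flags so they take effect.
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(train_ds, epochs=5)  # `train_ds` is your training dataset
```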

shun-lin commented 3 years ago

Hi @jacksonwb,

Thanks for the quick response! I have a few follow-up questions:

1) Where can I find some references on applying the decomposed weights? Do you think setting decomposed weights on the tn_layer converted from the original fc_layer can converge faster during fine-tuning than freezing the other layers and training from scratch?

I think I'm interested in the answer to this through the lens of AutoML, to try to cut down computation costs / model-search time, and some of those techniques use transfer learning :)

jacksonwb commented 3 years ago

If the fc_layer in question is really large, decomposing it into a tn_layer may very well be faster than training that layer from scratch, particularly for more complex tn structures where the backprop gets more expensive. Decomposing the weights can be done using the TensorNetwork library, but the details of which bond dimension to use, and how to efficiently do the decomposition for a particular topology, would have to be manually implemented. Once you have np or tf arrays representing the decomposed weights, you can apply them with the standard Keras API: https://keras.io/api/layers/base_layer/#set_weights-method
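
As a rough guide for picking the bond dimension (my own heuristic, not something the library dictates), one can inspect the singular-value spectrum of the matricized kernel and keep enough values to capture a target fraction of its squared Frobenius norm; `w` here is the matricized kernel from the decomposition sketch above:

```python
import numpy as np

# Cumulative fraction of the squared Frobenius norm captured by the
# leading singular values of the matricized kernel `w`.
_, s, _ = np.linalg.svd(w, full_matrices=False)
cum = np.cumsum(s**2) / np.sum(s**2)

# Smallest bond dimension that retains 99% of the squared norm.
bond_dim = int(np.searchsorted(cum, 0.99)) + 1
print(f"bond dimension for 99% of the norm: {bond_dim}")
```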