Bihaqo / t3f

Tensor Train decomposition on TensorFlow
https://t3f.readthedocs.io/en/latest/index.html
MIT License
219 stars, 55 forks

How to use TTLayer using native tensorflow? #122

Open cw-plus opened 6 years ago

cw-plus commented 6 years ago

Hello, I use TensorFlow, not Keras. Unfortunately, when I replace a fully connected layer with TTDense, I get this error:

```
Traceback (most recent call last):
  File "mnist-tflayers.py", line 140, in <module>
    launch_train_with_config(config, SimpleTrainer())
  File "/home/wc/tfpy27/local/lib/python2.7/site-packages/tensorpack/train/interface.py", line 91, in launch_train_with_config
    extra_callbacks=config.extra_callbacks)
  File "/home/wc/tfpy27/local/lib/python2.7/site-packages/tensorpack/train/base.py", line 331, in train_with_defaults
    steps_per_epoch, starting_epoch, max_epoch)
  File "/home/wc/tfpy27/local/lib/python2.7/site-packages/tensorpack/train/base.py", line 301, in train
    self.setup_callbacks(callbacks, monitors)
  File "/home/wc/tfpy27/local/lib/python2.7/site-packages/tensorpack/utils/argtools.py", line 182, in wrapper
    return func(*args, **kwargs)
  File "/home/wc/tfpy27/local/lib/python2.7/site-packages/tensorpack/train/base.py", line 211, in setup_callbacks
    self._callbacks.setup_graph(weakref.proxy(self))
  File "/home/wc/tfpy27/local/lib/python2.7/site-packages/tensorpack/callbacks/base.py", line 52, in setup_graph
    self._setup_graph()
  File "/home/wc/tfpy27/local/lib/python2.7/site-packages/tensorpack/callbacks/group.py", line 66, in _setup_graph
    cb.setup_graph(self.trainer)
  File "/home/wc/tfpy27/local/lib/python2.7/site-packages/tensorpack/callbacks/base.py", line 52, in setup_graph
    self._setup_graph()
  File "/home/wc/tfpy27/local/lib/python2.7/site-packages/tensorpack/callbacks/inference_runner.py", line 142, in _setup_graph
    self._input_source, self.trainer.tower_func)
  File "/home/wc/tfpy27/local/lib/python2.7/site-packages/tensorpack/graph_builder/predict.py", line 49, in build
    return tower_fn(*inputs)
  File "/home/wc/tfpy27/local/lib/python2.7/site-packages/tensorpack/tfutils/tower.py", line 207, in __call__
    output = self._tower_fn(*args)
  File "/home/wc/tfpy27/local/lib/python2.7/site-packages/tensorpack/graph_builder/model_desc.py", line 234, in _build_graph_get_cost
    ret = self.build_graph(*inputs)
  File "mnist-tflayers.py", line 54, in build_graph
    logits = TTDense(row_dims=[512, 1, 1, 1], column_dims=[1, 2, 5, 1], tt_rank=16)(l)
  File "/home/wc/tfpy27/local/lib/python2.7/site-packages/keras/engine/topology.py", line 592, in __call__
    self.build(input_shapes[0])
  File "/home/wc/ResNet/ok/utils/tt_dense.py", line 60, in build
    self.W = t3f.get_variable(name, initializer=initializer)
  File "/home/wc/tfpy27/local/lib/python2.7/site-packages/t3f/variables.py", line 73, in get_variable
    'set reuse=None in VarScope?' % name)
ValueError: Variable tt_dense_matrix_1 does not exist, or was not created with t3f.get_tt_variable(). Did you mean to set reuse=None in VarScope?
```

What should I do to correct the code? My implementation is here: https://github.com/ChaoWangHS/TTlayer/blob/master/TTLayer/mnist-tflayers.py. Thanks a lot.

Bihaqo commented 6 years ago

Hi, can you please provide the code snippet that reproduces the problem?

cw-plus commented 6 years ago

Here is the link. Thanks.

Bihaqo commented 6 years ago

Seems to work for me: https://nbviewer.jupyter.org/urls/dl.dropbox.com/s/9dnotx52gxnmn3p/reproduce.ipynb

I don't have the tensorpack library, though, so I excluded it.

A small piece of advice: you won't get any compression unless you split the large 512 dimension, e.g. row_dims=[4, 4, 8, 4], column_dims=[1, 2, 5, 1].
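To make the compression point concrete, here is a quick back-of-the-envelope parameter count in plain Python. The formula is the standard TT-matrix storage cost, the sum over cores of r_{k-1} * m_k * n_k * r_k; the rank of 4 used below is my own illustrative choice, not a value from the thread:

```python
def tt_matrix_params(row_dims, col_dims, tt_rank):
    """Parameter count of a TT-matrix whose k-th core has shape
    r_{k-1} x m_k x n_k x r_k, with all internal TT-ranks equal to tt_rank."""
    d = len(row_dims)
    ranks = [1] + [tt_rank] * (d - 1) + [1]  # boundary ranks are always 1
    return sum(ranks[k] * row_dims[k] * col_dims[k] * ranks[k + 1]
               for k in range(d))

full = 512 * 10                                              # 5120 weights in the plain dense layer
unsplit = tt_matrix_params([512, 1, 1, 1], [1, 2, 5, 1], 4)  # 2164: the huge 512 mode dominates
split = tt_matrix_params([4, 4, 8, 4], [1, 2, 5, 1], 4)      # 800: splitting 512 spreads it across cores
print(full, unsplit, split)
```

Note that at tt_rank=16 (the value in the traceback) even the split version needs 12416 parameters, more than the full 512 x 10 matrix has, which previews the question raised later in the thread: TT compression only pays off when the matrix is large enough relative to the rank.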

Also, note that the TT-layer depends on the order of inputs/outputs. This may not be a problem inside the network, but if the TT-layer is the last one, it can cause problems because of the order of the labels.

cw-plus commented 6 years ago

Thanks a lot. I reimplemented it without the tensorpack framework, and it works. As you noted, when I use TTDense as the last layer, the accuracy is low, just 0.1. I want to compress the last layer, because it has so many parameters. Do you have any ideas?

Bihaqo commented 6 years ago

Do you want to compress a 512 x 10 matrix, or will the matrix be bigger in practice?

cw-plus commented 6 years ago

In practice the matrix of the last layer will be bigger. The weight matrices of the previous layers can be compressed using weight quantization.

Bihaqo commented 6 years ago

Actually, it is a long-standing hope of mine to apply a TT-layer to the embedding matrix of an NLP model. I have never tried it, but my intuition is that with a random ordering of the words in the dictionary it should not work.


wisdom0530 commented 6 years ago

Hi! We want to compress the last layer using a TT-layer; what order of inputs/outputs would be appropriate? Specifically, we want to compress the last layer of ResNet-50, whose weight matrix is 2048*1000. Will the TT-layer work?

wisdom0530 commented 6 years ago

By the way, in your remark about the TT-layer causing problems as the last layer "because of the order of labels": what does the word 'order' mean there? Does it mean 'order of magnitude', or the 'sequence' of the labels? Thanks.

ShHsLin commented 6 years ago

@XiaonanHuang It might be better to add empty classes and make the final layer 2048*1024; then one could simply use [2, 2, ..., 2, 2] x [2, 2, ..., 2, 1] for the matrix. If you want to stick with 2048 and 1000, then probably something like [8, 8, 8, 4] x [2, 5, 5, 20].

The order of the labels matters in the sense that correlation/mutual information is carried through the intermediate bonds, and the truncation error can be larger if you put two correlated labels at the two ends of the "train".
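For the 2048 x 1000 case, here is a quick sanity check of the factorizations and the resulting parameter count, in plain Python. The tt_rank of 16 is taken from the original traceback, and [2, 5, 5, 20] is one column factorization whose product is actually 1000:

```python
from functools import reduce
from operator import mul

def tt_matrix_params(row_dims, col_dims, tt_rank):
    """Parameter count of a TT-matrix whose k-th core has shape
    r_{k-1} x m_k x n_k x r_k, with all internal TT-ranks equal to tt_rank."""
    d = len(row_dims)
    ranks = [1] + [tt_rank] * (d - 1) + [1]  # boundary ranks are always 1
    return sum(ranks[k] * row_dims[k] * col_dims[k] * ranks[k + 1]
               for k in range(d))

row_dims, col_dims = [8, 8, 8, 4], [2, 5, 5, 20]
# The factor products must match the actual matrix shape.
assert reduce(mul, row_dims) == 2048 and reduce(mul, col_dims) == 1000

full = 2048 * 1000                             # 2,048,000 weights uncompressed
tt = tt_matrix_params(row_dims, col_dims, 16)  # 22,016 weights in TT format
print(full, tt, full / tt)                     # roughly 93x compression
```

Unlike the 512 x 10 case discussed earlier, here the matrix is large enough that TT-rank 16 yields a substantial saving.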