CRosero opened this issue 4 years ago
My workaround is to quantize both models separately and then combine them into a normal Keras model:
q_base_model = quantize_model(base_model)
q_head_model = quantize_model(head_model)
inputs = Input(...)
h = q_base_model(inputs)
outputs = q_head_model(h)
full_model = Model(inputs, outputs)
full_model.compile(...)
full_model.fit(...)
I'm not sure if this is the correct approach, but it works for me.
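For completeness, a self-contained sketch of this workaround (the MNIST-style shapes and hyperparameters are placeholders, not from the original post):

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

quantize_model = tfmot.quantization.keras.quantize_model

# Feature extractor and classification head as separate Keras models.
base_model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(12, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),
])
head_model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(base_model.output_shape[-1],)),
    tf.keras.layers.Dense(10, activation='softmax'),
])

# Quantize each model separately, then wire them together with the functional API.
q_base_model = quantize_model(base_model)
q_head_model = quantize_model(head_model)

inputs = tf.keras.Input(shape=(28, 28, 1))
outputs = q_head_model(q_base_model(inputs))
full_model = tf.keras.Model(inputs, outputs)

full_model.compile(optimizer='adam',
                   loss='sparse_categorical_crossentropy',
                   metrics=['accuracy'])
# full_model.fit(x_train, y_train, epochs=1)  # then train as usual
```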
@alanchiao @nutsiepully Could you take a look? Thanks!
Hi @CRosero,
We haven't added support for quantizing Keras models within models yet. This is possible, and something we intend to do in the future.
In the meantime, @kmkolasinski is right. That's the approach you would have to use when nesting models: just quantize all the models you are interested in.
Thanks @kmkolasinski!
Thanks! @nutsiepully @kmkolasinski
Does quantizing the models separately and then combining them not produce a fully quantized model?
base_model = keras.Sequential([
    keras.layers.InputLayer(input_shape=(28, 28)),
    keras.layers.Reshape(target_shape=(28, 28, 1)),
    keras.layers.Conv2D(filters=12, kernel_size=(3, 3), activation=tf.nn.relu),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Flatten(),
])

head_model = keras.Sequential([
    keras.layers.InputLayer(input_shape=(None, 2028)),
    keras.layers.Dense(10, activation=tf.nn.softmax)
])

quantize_model = tfmot.quantization.keras.quantize_model
q_base_model = quantize_model(base_model)
q_head_model = quantize_model(head_model)

q_full_model = keras.Sequential([
    q_base_model,
    q_head_model
])

q_full_model.compile(...)
q_full_model.fit(...)

converter = tf.lite.TFLiteConverter.from_keras_model(q_full_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
quantized_tflite_model = converter.convert()
When I tried to convert it, I got this error message:
ValueError("Unsupported tf.dtype {0}".format(tf_dtype))
Is q_full_model not fully quantized?
Hi @Kyle719,
I tried reproducing this, but I didn't see any errors. It converted just fine.
Please make sure you use tf-nightly. This should explain how the conversion is done.
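As a quick sanity check (a sketch, assuming quantized_tflite_model holds the bytes returned by converter.convert()), you can load the result into the TFLite interpreter and inspect the tensor dtypes to see what was actually quantized:

```python
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_content=quantized_tflite_model)
interpreter.allocate_tensors()

# With QAT + Optimize.DEFAULT, weights and internal activations should show up
# as int8, while the model's input and output usually remain float32.
for detail in interpreter.get_tensor_details():
    print(detail['name'], detail['dtype'])

print('input dtype:', interpreter.get_input_details()[0]['dtype'])
print('output dtype:', interpreter.get_output_details()[0]['dtype'])
```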
Hey @nutsiepully, thanks for the insight. Would you mind keeping this issue up to date with any changes in status/priority/roadmap, etc. regarding this capability going forward? Thanks!
Will update it once we add support for it.
@kmkolasinski Thanks for your suggestion. I am trying it out but unfortunately not getting it to work. My code looks similar to that of @Kyle719, but I'm already getting a ValueError on q_head_model = quantize_model(head_model), saying "`model` must contain at least one layer which have been annotated with `quantize_annotate*`. There are no layers to quantize."
If you go to the saved versions of the Colab, this is labeled as "Initial attempt". Even after adding what is suggested in the error (version "quantize_annotate change"), the error doesn't go away and still stays there.
@nutsiepully and the others, do you happen to have any suggestions for a solution? (FYI I made the link so you can try the code and corresponding solutions out on the colab directly, hope that makes it easier)
@CRosero - I fixed the code in your colab. Your Sequential model was not constructed correctly - it was missing parentheses, so it did not actually have any layers. That's why it was failing.
Also, after quantize_annotate..., you just have to use quantize_apply, not quantize_model again. (Though it still works here.)
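For anyone landing here later, a minimal sketch of that annotate-then-apply flow (the layer choice is arbitrary, just for illustration):

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Annotate the layers you want quantized...
annotated_model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(2028,)),
    tfmot.quantization.keras.quantize_annotate_layer(
        tf.keras.layers.Dense(10, activation='softmax')),
])

# ...then call quantize_apply once; there is no need to call quantize_model again.
q_model = tfmot.quantization.keras.quantize_apply(annotated_model)
```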
I understand the complexity of using a new API, but it's generally not feasible for me to debug user code.
Thank you very much @nutsiepully for your patient help! Didn't notice that at all... made the corresponding changes and now it's working :)
Thanks @nutsiepully! 'Transfer learning + QAT' is working well with the code below (I used VGG19 because it does not have batch normalization layers, which are not yet supported for QAT).
I have one more question now! How can I follow the steps introduced on the TensorFlow page? https://www.tensorflow.org/model_optimization/guide/quantization/training_example
Is it possible to fine-tune a model with QAT by quantizing models recursively?
import os
import tempfile
import tensorflow as tf
import tensorflow_model_optimization as tfmot

quantize_model = tfmot.quantization.keras.quantize_model

base_model = tf.keras.applications.VGG19(input_shape=IMG_SHAPE, include_top=False, weights='imagenet')

head_model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(5, 5, 512)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1)
])

q_base_model = quantize_model(base_model)
q_head_model = quantize_model(head_model)

original_inputs = tf.keras.Input(IMG_SHAPE)
output1 = q_base_model(original_inputs)
output2 = q_head_model(output1)

q_aware_model = tf.keras.Model(inputs=original_inputs, outputs=output2)

base_learning_rate = 0.0001
q_aware_model.compile(optimizer=tf.keras.optimizers.RMSprop(lr=base_learning_rate),
                      loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
                      metrics=['accuracy'])

initial_epochs = 1
validation_steps = 20

history = q_aware_model.fit(train_batches, epochs=initial_epochs)

converter = tf.lite.TFLiteConverter.from_keras_model(q_aware_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
quantized_tflite_model = converter.convert()

_, quant_file = tempfile.mkstemp('.tflite')
with open(quant_file, 'wb') as f:
    f.write(quantized_tflite_model)
print("Quantized model in Mb:", os.path.getsize(quant_file) / float(2**20))
Hi @Xhark, can you comment on whether nested models are supported now?
We don't support fully recursive quantization, but now you can quantize a model that contains a sub-model.
e.g.)
q_base_model = quantize_model(base_model)

original_inputs = tf.keras.Input(IMG_SHAPE)
x = q_base_model(original_inputs)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
output = tf.keras.layers.Dense(1)(x)

model = tf.keras.Model(inputs=original_inputs, outputs=output)
q_aware_model = quantize_model(model)
--
This example was not supported before, but it works now.
Thanks @Xhark.
Seems to me the last line q_aware_model = quantize_model(model) is not needed, since q_base_model is already quantized, right?
q_base_model is already quantized, but the last line is needed to quantize the layers outside of q_base_model (the GlobalAveragePooling2D and Dense).
This works for me, maybe it is useful for someone!
def create_quantization_model(model):
    layers = []
    for i in range(len(model.layers)):
        if isinstance(model.layers[i], tf.keras.models.Model):
            # Nested sub-model: annotate its layers via clone_model, then quantize it.
            quant_sub_model = tf.keras.models.clone_model(model.layers[i], clone_function=apply_quantization)
            layers.append(tfmot.quantization.keras.quantize_apply(quant_sub_model))
        else:
            layers.append(apply_quantization(model.layers[i]))
    quant_model = tf.keras.models.Sequential(layers)
    return quant_model

def apply_quantization(layer):
    if isinstance(layer, tf.keras.layers.Dense):
        return tfmot.quantization.keras.quantize_annotate_layer(layer)
    return layer
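A possible usage sketch for the helper above (optimizer and loss are placeholders; note that the top-level layers returned by apply_quantization are only annotated, so depending on your TFMOT version you may still need an extra quantize_apply pass on the assembled model):

```python
# Hypothetical usage: `model` is assumed to be a Sequential/functional model
# that mixes plain layers with nested Keras sub-models.
quant_model = create_quantization_model(model)

quant_model.compile(optimizer='adam',
                    loss='sparse_categorical_crossentropy',
                    metrics=['accuracy'])
quant_model.summary()  # the quantized sub-models show up as nested model layers
```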
Any tips on quantizing the Pix2Pix generator? I've used this official tutorial as a guide, and have attempted the following to no avail:
def downsample(filters, size, apply_batchnorm=True):
initializer = tf.random_normal_initializer(0., 0.02)
result = tf.keras.Sequential()
result.add(
tfmot.quantization.keras.quantize_annotate_layer(
tf.keras.layers.Conv2D(filters, size, strides=2, padding='same',
kernel_initializer=initializer, use_bias=False)
)
)
if apply_batchnorm:
result.add(
tfmot.quantization.keras.quantize_annotate_layer(
tf.keras.layers.BatchNormalization()
)
)
result.add(
tfmot.quantization.keras.quantize_annotate_layer(
tf.keras.layers.LeakyReLU()
)
)
return result
def upsample(filters, size, apply_dropout=False):
initializer = tf.random_normal_initializer(0., 0.02)
result = tf.keras.Sequential()
result.add(
tfmot.quantization.keras.quantize_annotate_layer(
tf.keras.layers.Conv2DTranspose(filters, size, strides=2,
padding='same',
kernel_initializer=initializer,
use_bias=False)
)
)
result.add(
tfmot.quantization.keras.quantize_annotate_layer(
tf.keras.layers.BatchNormalization()
)
)
if apply_dropout:
result.add(
tfmot.quantization.keras.quantize_annotate_layer(
tf.keras.layers.Dropout(0.5)
)
)
result.add(
tfmot.quantization.keras.quantize_annotate_layer(
tf.keras.layers.ReLU()
)
)
return result
def Generator():
inputs = tf.keras.layers.Input(shape=[512, 512, 3]) # Old: 256
down_stack = [
downsample(128, 4, apply_batchnorm=False), # (batch_size, 128, 128, 64)
downsample(256, 4), # (batch_size, 64, 64, 128)
downsample(512, 4), # (batch_size, 32, 32, 256)
downsample(1024, 4), # (batch_size, 16, 16, 512)
downsample(1024, 4), # (batch_size, 8, 8, 512)
downsample(1024, 4), # (batch_size, 4, 4, 512)
downsample(1024, 4), # (batch_size, 2, 2, 512)
downsample(1024, 4), # (batch_size, 1, 1, 512)
]
up_stack = [
upsample(1024, 4, apply_dropout=True), # (batch_size, 2, 2, 1024)
upsample(1024, 4, apply_dropout=True), # (batch_size, 4, 4, 1024)
upsample(1024, 4, apply_dropout=True), # (batch_size, 8, 8, 1024)
upsample(1024, 4), # (batch_size, 16, 16, 1024)
upsample(512, 4), # (batch_size, 32, 32, 512)
upsample(256, 4), # (batch_size, 64, 64, 256)
upsample(128, 4), # (batch_size, 128, 128, 128)
]
initializer = tf.random_normal_initializer(0., 0.02)
last = tf.keras.layers.Conv2DTranspose(OUTPUT_CHANNELS, 4,
strides=2,
padding='same',
kernel_initializer=initializer,
activation='tanh') # (batch_size, 256, 256, 3)
x = inputs
# Downsampling through the model
skips = []
for down in down_stack:
x = down(x)
skips.append(x)
skips = reversed(skips[:-1])
# Upsampling and establishing the skip connections
for up, skip in zip(up_stack, skips):
x = up(x)
x = tf.keras.layers.Concatenate()([x, skip])
x = last(x)
# Model
model = tf.keras.Model(inputs=inputs, outputs=x)
# Quantize
q_model = tfmot.quantization.keras.quantize_apply(model)
return q_model
This current setup gives me the error "ValueError: `model` must contain at least one layer which have been annotated with `quantize_annotate*`. There are no layers to quantize." Then when I quantize_apply on the Sequential models in the up/downsample functions, the error changes to "ValueError: `model` must be a built model. been built yet. Please call `model.build(input_shape)` before quantizing your model" (which makes sense). Is it possible to quantize with this model structure? Thanks in advance!
Can you try creating your model without any quantization first? Then call q_model = tf.keras.models.clone_model(model, clone_function=apply_quantization), where apply_quantization should annotate every layer you want to quantize with tfmot.quantization.keras.quantize_annotate_layer.
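Putting that suggestion together with the usual clone-and-annotate recipe, a sketch might look like this (layer types chosen for illustration; Generator() is assumed to build the plain, un-annotated model from the earlier comment):

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

def apply_quantization(layer):
    # Annotate only the layer types you want the quantizer to handle.
    if isinstance(layer, (tf.keras.layers.Conv2D,
                          tf.keras.layers.Conv2DTranspose,
                          tf.keras.layers.Dense)):
        return tfmot.quantization.keras.quantize_annotate_layer(layer)
    return layer

# Build the generator without any quantize_annotate_layer / quantize_apply calls...
model = Generator()

# ...then clone it while annotating, and finish with quantize_apply
# (the usual final step in this recipe).
annotated_model = tf.keras.models.clone_model(model, clone_function=apply_quantization)
q_model = tfmot.quantization.keras.quantize_apply(annotated_model)
```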
Thanks for the quick response! That doesn't throw an error, but it doesn't look like it quantizes the layers created within the upsample and downsample functions. Is there any way to also get those layers?
Model: "model_2"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_3 (InputLayer) [(None, 512, 512, 3 0 []
)]
sequential_34 (Sequential) (None, 256, 256, 12 6144 ['input_3[0][0]']
8)
sequential_35 (Sequential) (None, 128, 128, 25 525312 ['sequential_34[1][0]']
6)
sequential_36 (Sequential) (None, 64, 64, 512) 2099200 ['sequential_35[1][0]']
sequential_37 (Sequential) (None, 32, 32, 1024 8392704 ['sequential_36[1][0]']
)
sequential_38 (Sequential) (None, 16, 16, 1024 16781312 ['sequential_37[1][0]']
)
sequential_39 (Sequential) (None, 8, 8, 1024) 16781312 ['sequential_38[1][0]']
sequential_40 (Sequential) (None, 4, 4, 1024) 16781312 ['sequential_39[1][0]']
sequential_41 (Sequential) (None, 2, 2, 1024) 16781312 ['sequential_40[1][0]']
sequential_42 (Sequential) (None, 4, 4, 1024) 16781312 ['sequential_41[1][0]']
concatenate_14 (Concatenate) (None, 4, 4, 2048) 0 ['sequential_42[1][0]',
'sequential_40[1][0]']
sequential_43 (Sequential) (None, 8, 8, 1024) 33558528 ['concatenate_14[1][0]']
concatenate_15 (Concatenate) (None, 8, 8, 2048) 0 ['sequential_43[1][0]',
'sequential_39[1][0]']
sequential_44 (Sequential) (None, 16, 16, 1024 33558528 ['concatenate_15[1][0]']
)
concatenate_16 (Concatenate) (None, 16, 16, 2048 0 ['sequential_44[1][0]',
) 'sequential_38[1][0]']
sequential_45 (Sequential) (None, 32, 32, 1024 33558528 ['concatenate_16[1][0]']
)
concatenate_17 (Concatenate) (None, 32, 32, 2048 0 ['sequential_45[1][0]',
) 'sequential_37[1][0]']
sequential_46 (Sequential) (None, 64, 64, 512) 16779264 ['concatenate_17[1][0]']
concatenate_18 (Concatenate) (None, 64, 64, 1024 0 ['sequential_46[1][0]',
) 'sequential_36[1][0]']
sequential_47 (Sequential) (None, 128, 128, 25 4195328 ['concatenate_18[1][0]']
6)
concatenate_19 (Concatenate) (None, 128, 128, 51 0 ['sequential_47[1][0]',
2) 'sequential_35[1][0]']
sequential_48 (Sequential) (None, 256, 256, 12 1049088 ['concatenate_19[1][0]']
8)
concatenate_20 (Concatenate) (None, 256, 256, 25 0 ['sequential_48[1][0]',
6) 'sequential_34[1][0]']
quantize_annotate_28 (Quantize (None, 512, 512, 3) 12291 ['concatenate_20[1][0]']
Annotate)
==================================================================================================
Total params: 217,641,475
Trainable params: 217,619,715
Non-trainable params: 21,760
__________________________________________________________________________________________________
I think quantization does not really go recursively for models that contain other models (in your case the main model contains other Sequential models). Did you try passing your model to the create_quantization_model(model) function mentioned here? I think the solution would be to iterate over the model's layers and, if you encounter a Sequential sub-model, iterate over its layers too in order to annotate them, as in the sketch below.
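An untested sketch of that recursive descent, reusing clone_model so nested Sequential blocks get annotated as well (layer types are illustrative; `model` is assumed to be the plain Pix2Pix generator built without annotations):

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

def annotate_recursively(layer):
    # Descend into nested models (the up/downsample blocks are Sequential models).
    if isinstance(layer, tf.keras.Model):
        return tf.keras.models.clone_model(layer, clone_function=annotate_recursively)
    # Annotate the layer types you want quantized.
    if isinstance(layer, (tf.keras.layers.Conv2D,
                          tf.keras.layers.Conv2DTranspose,
                          tf.keras.layers.Dense)):
        return tfmot.quantization.keras.quantize_annotate_layer(layer)
    return layer

annotated_model = tf.keras.models.clone_model(model, clone_function=annotate_recursively)
# quantize_apply would be the next step; for nested models it may need to be
# applied per sub-model, as in create_quantization_model above.
```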
I did manage to get that working for me with some additional layers in the apply_quantization function (I'm still learning here!). But I receive the following error, I think due to the Concatenate layers between the sub-models:
ValueError: A merge layer should be called on a list of inputs. Received: inputs=Tensor("Placeholder:0", shape=(None, 4, 4, 1024), dtype=float32) (not a list of tensors)
I've also tested changing quant_model = tf.keras.Sequential(layers) to quant_model = tf.keras.Model(layers) in create_quantization_model and it runs without issue. However, when I then call the new quantized model like q_model(inputs=inputs) and attempt to view its summary, I receive this error:
Unimplemented `tf.keras.Model.call()`: if you intend to create a `Model` with the Functional API, please provide `inputs` and `outputs` arguments. Otherwise, subclass `Model` with an overridden `call()` method.
Thanks again for your help.
Great. Check this to implement call(): https://keras.io/guides/customizing_what_happens_in_fit/ There is a GAN example at the end of the page that would be useful.
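A bare-bones sketch of what an overridden call() could look like for this U-Net-style generator (attribute names are hypothetical; the custom train_step from that guide is omitted):

```python
import tensorflow as tf

class QuantizedGenerator(tf.keras.Model):
    def __init__(self, down_stack, up_stack, last, **kwargs):
        super().__init__(**kwargs)
        self.down_stack = down_stack   # list of (quantized) downsample blocks
        self.up_stack = up_stack       # list of (quantized) upsample blocks
        self.last = last               # final Conv2DTranspose layer
        self.concats = [tf.keras.layers.Concatenate() for _ in up_stack]

    def call(self, x, training=False):
        # Downsampling, keeping the intermediate activations for skip connections.
        skips = []
        for down in self.down_stack:
            x = down(x, training=training)
            skips.append(x)
        skips = reversed(skips[:-1])
        # Upsampling with skip connections.
        for up, concat, skip in zip(self.up_stack, self.concats, skips):
            x = up(x, training=training)
            x = concat([x, skip])
        return self.last(x)
```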
Wonderful! Thanks again!
Hi @frytoli, what changes did you make in the apply_quantization function to apply quantization to the submodules (conv layers) of the upsample and downsample blocks too? I am facing a similar issue in my problem.
Describe the bug
I'm doing transfer learning and would like to (at the end) quantize my model. The problem is that when I try to use the quantize_model() function (which is used successfully in numerous tutorials and videos), I get an error. How am I supposed to do quantization for transfer learning (using an already previously built model as a feature extractor)?
System information
TensorFlow installed from (source or binary): pip
TensorFlow version: tf-nightly 2.2.0
TensorFlow Model Optimization version: 0.3.0
Python version: 3.7.7
Describe the expected behavior
I expect the model to be successfully quantized, with no error messages.

Describe the current behavior
I get the error: "ValueError: Quantizing a tf.keras Model inside another tf.keras Model is not supported."

Code to reproduce the issue
Can be found here