sony / model_optimization

Model Compression Toolkit (MCT) is an open source project for neural network model optimization under efficient, constrained hardware. This project provides researchers, developers, and engineers advanced quantization and compression tools for deploying state-of-the-art neural networks.
https://sony.github.io/model_optimization/
Apache License 2.0

convert error in yolo8n SKU-110K #1056

Closed YoshikiKato0220 closed 3 months ago

YoshikiKato0220 commented 4 months ago

Issue Type

Bug

Source

source

MCT Version

2.0.0

OS Platform and Distribution

Ubuntu 18.04

Python version

3.10.14

Describe the issue

Hello.
I tried to convert a yolov8n SKU-110K model in MCT with TensorBoard.
When I called mct.ptq.keras_post_training_quantization, I got the following error:
Exception: The model cannot be quantized to meet the specified target resource utilization activation with the value 1638400.

Could you tell me how to address this error?

Expected behaviour

I expect the conversion to succeed.

Code to reproduce the issue

docker image: ultralytics/ultralytics:latest

git clone https://github.com/sony/model_optimization.git local_mct
cd local_mct
git checkout refs/tags/v2.0.0
pip install -r requirements.txt

[base code]
https://github.com/sony/model_optimization/blob/v2.0.0/tutorials/notebooks/keras/ptq/example_keras_yolov8n.ipynb

[add] 
mct.set_log_folder('./loggerv2')

[modify]
I think tensorboard_writer.py doesn't support tf.image.combined_non_max_suppression, so I changed:
model = Model(model.input, outputs, name='yolov8n')
↓
model = Model(model.input, model.output, name='yolov8n')

I used a SKU-110K trained model (epochs = 300):
https://docs.ultralytics.com/datasets/detect/sku-110k/#dataset-yaml

I used this pt file and loaded it as in
https://github.com/sony/model_optimization/blob/v1.11.0/tutorials/notebooks/example_keras_yolov8n.ipynb

I changed yolov8n.yaml's nc from 80 to 1.

Log output

CRITICAL:Model Compression Toolkit:The model cannot be quantized to meet the specified target resource utilization activation with the value 1638400.
Traceback (most recent call last):
  File "/usr/src/ultralytics/myModelConv_product_02_kato_v2_board.py", line 187, in <module>
    quant_model, _ = mct.ptq.keras_post_training_quantization(model,
  File "/usr/src/ultralytics/../local_mct_v2/model_compression_toolkit/ptq/keras/quantization_facade.py", line 134, in keras_post_training_quantization
    tg, bit_widths_config, _ = core_runner(in_model=in_model,
  File "/usr/src/ultralytics/../local_mct_v2/model_compression_toolkit/core/runner.py", line 119, in core_runner
    bit_widths_config = search_bit_width(tg,
  File "/usr/src/ultralytics/../local_mct_v2/model_compression_toolkit/core/common/mixed_precision/mixed_precision_search_facade.py", line 126, in search_bit_width
    result_bit_cfg = search_method_fn(search_manager,
  File "/usr/src/ultralytics/../local_mct_v2/model_compression_toolkit/core/common/mixed_precision/search_methods/linear_programming.py", line 64, in mp_integer_programming_search
    lp_problem = _formalize_problem(layer_to_indicator_vars_mapping,
  File "/usr/src/ultralytics/../local_mct_v2/model_compression_toolkit/core/common/mixed_precision/search_methods/linear_programming.py", line 174, in _formalize_problem
    _add_set_of_ru_constraints(search_manager=search_manager,
  File "/usr/src/ultralytics/../local_mct_v2/model_compression_toolkit/core/common/mixed_precision/search_methods/linear_programming.py", line 231, in _add_set_of_ru_constraints
    Logger.critical(
  File "/usr/src/ultralytics/../local_mct_v2/model_compression_toolkit/logger.py", line 117, in critical
    raise Exception(msg)
Exception: The model cannot be quantized to meet the specified target resource utilization activation with the value 1638400.
YoshikiKato0220 commented 4 months ago

This error doesn't occur in MCT v1.11.0. But in MCT v1.11.0, I have a #1055 error.

YoshikiKato0220 commented 4 months ago

linear_programming.py

ipdb> aggr_ru
[8400.0, 33600.0, 8400.0, 33600.0, 8400.0, 8400.0, 8400.0, 8400.0, 204800.0, 819200.0, 8400.0, 8400.0, 33600.0, 25600.0, 6400.0, 1600.0, 25600.0, 204800.0, 819200.0, 1600.0, 6400.0, 25600.0, 102400.0, 204800.0, 409600.0, 204800.0, 819200.0, 25600.0, 204800.0, 819200.0, 153600.0, 51200.0, 51200.0, 102400.0, 153600.0, 51200.0, 102400.0, 307200.0, 102400.0, 204800.0, 204800.0, 307200.0, 102400.0, 204800.0, 614400.0, 204800.0, 409600.0, 409600.0, 1228800.0, 819200.0, 409600.0, 307200.0, 102400.0, 204800.0, 204800.0, 614400.0, 409600.0, 204800.0, 204800.0, 51200.0, 102400.0, 153600.0, 51200.0, 51200.0, 51200.0, 102400.0, 102400.0, 409600.0, 102400.0, 102400.0, 102400.0, 102400.0, 102400.0, 204800.0, 204800.0, 204800.0, 819200.0, 204800.0, 204800.0, 204800.0, 204800.0, 204800.0, 409600.0, 409600.0, 819200.0, 819200.0, 1228800.0, 409600.0, 409600.0, 819200.0, 819200.0, 1638400.0, 3276800.0, 1228800.0, 102400.0, 409600.0, 102400.0, 409600.0, 102400.0, 204800.0, 102400.0, 102400.0, 204800.0, 409600.0, 409600.0, 819200.0, 1638400.0, 400.0, 1600.0, 6400.0, 25600.0, 6400.0, 1600.0, 25600.0, 102400.0, 409600.0, 25600.0, 102400.0, 409600.0, 25600.0, 102400.0, 409600.0, 25600.0, 102400.0, 409600.0]
ipdb> p v
3276800.0
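As an illustrative sketch (not MCT's actual solver code), the infeasibility is visible directly from these numbers: the activation target must be at least as large as the biggest single activation tensor, and here it is not. A minimal Python check, using a subset of the values above:

```python
# Illustrative sketch only -- not MCT internals. The mixed-precision LP
# search can only satisfy an activation-memory target that is no smaller
# than the largest single activation tensor. A subset of aggr_ru above
# (units as reported by MCT):
aggr_ru_subset = [8400.0, 819200.0, 1228800.0, 1638400.0, 3276800.0]

target = 1_638_400             # the target named in the error message
largest = max(aggr_ru_subset)  # 3276800.0, the value printed by `p v`

# target >= largest is False, so the constraint set is infeasible and
# MCT raises "The model cannot be quantized to meet the specified
# target resource utilization activation with the value 1638400."
feasible = target >= largest
```

Any target below 3276800.0 produces the same error, regardless of how many bits the solver assigns elsewhere.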

ofirgo commented 3 months ago

Hi @YoshikiKato0220 ,

According to the error message, it seems that you are trying to run mixed-precision quantization that quantizes the model's activations to a specific target memory size. The tutorial you are running is not set up for this type of quantization, so I need to know whether you made any changes to the way the code calls MCT in order to figure out what the problem could be.

If you did try to run activation mixed precision, the problem may be that you provided MCT with a memory restriction that is too low, so that the maximal activation tensor cannot be reduced to the specified target. If that is the case, I suggest trying a looser restriction on the activation memory size.

Let us know if this helps and if you need any other assistance with this issue.

YoshikiKato0220 commented 3 months ago

Hi @ofirgo , thanks for your comment. I only changed the code for SKU-110K (instead of COCO).

Then I changed the code in accordance with your advice:

resource_utilization = mct.core.ResourceUtilization(resource_utilization_data.weights_memory * 0.75)
↓
resource_utilization = mct.core.ResourceUtilization(resource_utilization_data.weights_memory * 0.75, 3276800)

It worked. Thank you.

Could you tell me the effect of increasing activation_memory? Does the quantized model get a lower compression ratio or lower accuracy?

ofirgo commented 3 months ago

I'm glad to hear this solves the issue for you. Still, I'm not sure what caused the original problem, since you say you didn't change anything from the original tutorial code except the dataset. Can you explain where the activation-memory restriction of 1638400 came from in the original code that had the issue? If you didn't provide it, the expected behavior is that the activation memory size is unrestricted, and you should not have seen the error.

What your modified resource_utilization call does is restrict the memory of the maximal activation tensor during model inference to 3276800. Since this is the size of the actual maximal activation tensor (according to your previous message), the modification didn't affect the accuracy or the memory of the quantized model.

I'm keeping the issue open for now. I would appreciate any more information about how you originally called MCT (the call that caused the issue).

Thank you for raising this issue and for your help.

YoshikiKato0220 commented 3 months ago

Hi @ofirgo , I'm sorry to have troubled you. My code was wrong.

I had passed resource_utilization_data, not resource_utilization, as the third argument of keras_post_training_quantization.
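For future readers, here is a minimal sketch of the mix-up. The classes below are simplified stand-ins, not the real model_compression_toolkit types, and the keyword names and defaults are assumptions for illustration only:

```python
# Simplified stand-ins for illustration -- NOT the real MCT classes.
# keras_post_training_quantization's third argument expects a
# ResourceUtilization *target*, not the ResourceUtilizationData
# *measurement* returned for the float model.

class ResourceUtilizationData:
    """Stand-in for the measured utilization of the float model."""
    def __init__(self, weights_memory, activation_memory):
        self.weights_memory = weights_memory
        self.activation_memory = activation_memory

class ResourceUtilization:
    """Stand-in for the user-specified utilization target.
    Unset fields default to 'unrestricted'."""
    def __init__(self, weights_memory=float("inf"),
                 activation_memory=float("inf")):
        self.weights_memory = weights_memory
        self.activation_memory = activation_memory

# Measured data; activation_memory here plays the role of the 1638400
# value that surfaced in the error message.
ru_data = ResourceUtilizationData(weights_memory=12_000_000,
                                  activation_memory=1_638_400)

# Wrong: passing ru_data itself makes its measured activation_memory act
# as a hard cap, which can be infeasible -- the original bug in this issue.
# Right: build an explicit target from the measurement, leaving the
# activation memory unrestricted unless you really want to cap it:
target = ResourceUtilization(weights_memory=ru_data.weights_memory * 0.75)
```

With an explicit target, the activation cap stays at its unrestricted default, which matches the maintainer's note that an unmodified tutorial run should never hit this error.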