[Good First Issue][NNCF]: Dump actual_subset_size to ov.Model

openvinotoolkit / nncf

Neural Network Compression Framework for enhanced OpenVINO™ inference

Apache License 2.0

[Good First Issue][NNCF]: Dump actual_subset_size to ov.Model #2562

Open l-bat opened 4 months ago

l-bat commented 4 months ago

Context

After applying quantization to the ov.Model in Neural Network Compression Framework (NNCF), the quantization parameters, including subset_size, are dumped to the meta section of the OpenVINO IR. subset_size represents the size of the dataset used for calibration. https://github.com/openvinotoolkit/nncf/blob/09960b9b71a58f3277d0964531f8ee08365d7c72/nncf/openvino/quantization/quantize_model.py#L102

However, inconsistencies arise when the dataset size is less than the provided or default 'subset_size'. To address this confusion, it is proposed to also dump the actual_subset_size, which denotes the number of data samples used to calculate activation statistics. This addition will improve clarity and accuracy in managing quantization parameters and assist in reproducing quantization results.

What needs to be done?

Dump actual_subset_size parameter to ov.Model meta section.
Add tests

Example Pull Requests

No response

Resources

Contribution guide - start here!

Contact points

@l-bat

Ticket

No response

AiGaf1 commented 4 months ago

.take

github-actions[bot] commented 4 months ago

Thank you for looking into this issue! Please let us know if you have any questions or require any help.

andrey-churkin commented 4 months ago

@AiGaf1 Hi, are you still working on this task? Do you need any help? Please inform us if you do not plan to continue working on this task. Thanks!

RitikaxShakya commented 3 months ago

Hello! Is there any update on this issue? If not, I would like to work on it.

p-wysocki commented 3 months ago

@l-bat could you please reassign the issue to @RitikaxShakya? I lack the permissions for the NNCF repository.

RitikaxShakya commented 3 months ago

.take

github-actions[bot] commented 3 months ago

Thanks for being interested in this issue. It looks like this ticket is already assigned to a contributor. Please communicate with the assigned contributor to confirm the status of the issue.

p-wysocki commented 2 months ago

Hello @RitikaxShakya, are you still working on that issue? Do you need any help?

awayzjj commented 3 weeks ago

.take

github-actions[bot] commented 3 weeks ago

Thank you for looking into this issue! Please let us know if you have any questions or require any help.

awayzjj commented 3 weeks ago

Hi @l-bat, I created a PR after testing locally, and the XML output is as expected.

I ran pytest in tests/openvino, and the original tests did not break. But I have two questions:

Which file should I edit to add my unit test?
How should I implement the unit test? Should I check the output XML to verify whether the actual_subset_size property exists?

Thank you very much!

l-bat commented 3 weeks ago

Hi @awayzjj! Thanks for your contribution!

calibration_dataset.get_length() returns the size of the dataset that was provided to the nncf.quantize method. However, actual_subset_size should show the number of data samples that were actually used to calculate the activation statistics. If calibration_dataset.get_length() >= subset_size, actual_subset_size is equal to subset_size; otherwise, actual_subset_size must be equal to calibration_dataset.get_length(). Note that get_length() cannot be used if __len__() is not implemented. Please take a look at https://github.com/openvinotoolkit/nncf/blob/e8ea2521663de807d654ae4f375d20c904755061/nncf/common/tensor_statistics/aggregator.py#L50-L53. You can implement a get_actual_subset_size() function to handle both cases.
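The rule above can be sketched in plain Python. This is only an illustration under stated assumptions, not the NNCF implementation: the function below is a hypothetical standalone helper that takes any iterable instead of an nncf.Dataset, and the counting fallback for sources without __len__() is one possible choice that the thread does not prescribe.

```python
from itertools import islice
from typing import Any, Iterable


def get_actual_subset_size(data_source: Iterable[Any], subset_size: int) -> int:
    """Hypothetical helper: number of samples that would be used for statistics.

    Assumption: at most `subset_size` samples are ever consumed, so the result
    is min(dataset length, subset_size).
    """
    if subset_size < 0:
        raise ValueError("subset_size must be non-negative")
    try:
        # Sized sources (lists, tuples, ...) report their length directly.
        length = len(data_source)  # type: ignore[arg-type]
    except TypeError:
        # No __len__(): count samples, but stop after `subset_size` of them.
        # Caveat: this consumes a one-shot iterator such as a generator.
        return sum(1 for _ in islice(iter(data_source), subset_size))
    return min(length, subset_size)


small = get_actual_subset_size(list(range(100)), subset_size=300)
large = get_actual_subset_size(list(range(1000)), subset_size=300)
unsized = get_actual_subset_size((i for i in range(5)), subset_size=300)
print(small, large, unsized)
```

With a 100-sample dataset and subset_size=300, the helper reports 100, which is the value that would make the dumped parameters match what was actually used for calibration.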

l-bat commented 3 weeks ago

> I ran the pytest in tests/openvino, and the original tests did not break. But I have 2 questions
>
> which file I should edit to add my unit test.
>
> how to implement the unit test, should I check the output XML to verify whether the actual_subset_size property exists?

You can add the test to https://github.com/openvinotoolkit/nncf/blob/e8ea2521663de807d654ae4f375d20c904755061/tests/openvino/native/quantization/test_quantization_pipeline.py
You can use this test as an example: https://github.com/openvinotoolkit/nncf/blob/e8ea2521663de807d654ae4f375d20c904755061/tests/openvino/native/quantization/test_quantization_pipeline.py#L178-L199
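If checking the serialized XML turns out to be the chosen approach, a minimal sketch using only the Python standard library might look like the following. The XML layout here is an assumption (nested rt_info/nncf/quantization elements with a value attribute), the sample document and its values are made up for illustration, and read_quantization_meta is a hypothetical helper. The linked test may well verify the parameters differently.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Made-up IR fragment; the element layout is an assumption, not taken from the thread.
SAMPLE_IR = """<net name="model" version="11">
    <rt_info>
        <nncf>
            <quantization>
                <subset_size value="300" />
                <actual_subset_size value="100" />
            </quantization>
        </nncf>
    </rt_info>
</net>"""


def read_quantization_meta(ir_xml: str, key: str) -> Optional[str]:
    """Hypothetical helper: return the 'value' attribute of a quantization key, or None."""
    root = ET.fromstring(ir_xml)
    node = root.find(f"./rt_info/nncf/quantization/{key}")
    return None if node is None else node.get("value")


def test_actual_subset_size_is_dumped() -> None:
    # The property must exist and must not exceed the requested subset_size.
    actual = read_quantization_meta(SAMPLE_IR, "actual_subset_size")
    requested = read_quantization_meta(SAMPLE_IR, "subset_size")
    assert actual is not None and requested is not None
    assert int(actual) <= int(requested)


test_actual_subset_size_is_dumped()
```

In a real test the XML would come from the model serialized after quantization rather than from a hard-coded string.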