Regression tests for model quantization

marian-nmt / marian-regression-tests

Regression tests for the marian-dev repository


Regression tests for model quantization #69

Closed afaji closed 3 years ago

afaji commented 3 years ago

Regression tests for model quantization

snukky commented 3 years ago

I made minor fixes and cleanups. I had to update the expected output for the test with optimization, but it still fails with "Tensor has more than 256 unique values" on two different machines with different GPUs, including the machine that Jenkins uses. I updated the output because the scores are consistent on both machines. @afaji Could you take a look? Please run make clean before testing to make sure we are working with the same vocabs.

snukky commented 3 years ago

The current status is that quantization with --quantization-steps produces different costs on zisa vs. gna + internal machines. This is independent of compilation options (checked), but all 3 machines have different GPUs (different generations; gna has the oldest). Moreover, the 8-bit quantized model correctly has 256 unique values on zisa, but on the other machines it has more, which needs to be investigated.
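For anyone reproducing the unique-value check outside the test suite, here is a minimal sketch in Python with NumPy. It is not the check used by the regression tests; the helper names (count_unique_values, tensors_exceeding) and the toy tensors are made up for illustration. It relies only on the fact that a tensor quantized to b bits can hold at most 2^b distinct values.

```python
import numpy as np


def count_unique_values(tensors):
    """Return {tensor name: number of distinct values} for a dict of arrays."""
    return {name: int(np.unique(arr).size) for name, arr in tensors.items()}


def tensors_exceeding(tensors, bits=8):
    """Return the tensors holding more distinct values than `bits` bits allow."""
    limit = 2 ** bits
    counts = count_unique_values(tensors)
    return {name: n for name, n in counts.items() if n > limit}


# Toy stand-in for a model (hypothetical tensor names, random data).
rng = np.random.default_rng(0)
levels = np.linspace(-1.0, 1.0, 256).astype(np.float32)
toy_model = {
    "quantized_W": rng.choice(levels, size=(64, 64)),
    "unquantized_W": rng.standard_normal((64, 64)).astype(np.float32),
}

offenders = tensors_exceeding(toy_model, bits=8)
for name, n in offenders.items():
    print(f"{name}: {n} unique values (limit 256)")
```

Assuming the model is stored as an .npz archive, the same helpers could be applied to dict(np.load(path)), where path is whatever model file the test produces. Running this per tensor on each machine would show which tensors differ.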

afaji commented 3 years ago

Update: only the decoder's embedding has more than 256 unique values, and there is no issue if tied-embeddings-all is used. Will investigate further...

Edit: Apparently I need to allocate a larger tensor.

afaji commented 3 years ago

Added 2 more tests:

4-bit log-based quantization
a test to make sure that the resulting model is in quantized format.
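To illustrate what a 4-bit log-based scheme can look like, here is a generic sketch in Python with NumPy. It is an assumption-laden illustration, not Marian's implementation: it assumes one sign bit plus power-of-two magnitudes scaled by the largest absolute weight, rounds in the log domain, and maps exact zeros to the smallest positive level. The function name log_quantize is hypothetical.

```python
import numpy as np


def log_quantize(w, bits=4):
    """Quantize `w` to sign * scale * 2^k with k in {-(2^(bits-1) - 1), ..., 0}.

    Assumed scheme: one bit for the sign, the remaining bits select a
    power-of-two magnitude relative to the largest absolute value.
    """
    w = np.asarray(w, dtype=np.float32)
    scale = np.abs(w).max()
    if scale == 0:
        return w.copy()  # all-zero tensor: nothing to quantize
    n_magnitudes = 2 ** (bits - 1)
    # Guard against log2(0); zeros end up at the smallest magnitude.
    ratio = np.maximum(np.abs(w) / scale, np.finfo(np.float32).tiny)
    exponent = np.clip(np.round(np.log2(ratio)), -(n_magnitudes - 1), 0)
    # Treat zero as positive so the output has at most 2^bits distinct values.
    sign = np.where(w < 0, -1.0, 1.0).astype(np.float32)
    return (sign * scale * np.exp2(exponent)).astype(np.float32)


rng = np.random.default_rng(1)
weights = rng.standard_normal(10_000).astype(np.float32)
quantized = log_quantize(weights, bits=4)
print(np.unique(quantized).size)  # at most 16 for bits=4
```

Under these assumptions, a "resulting model is in quantized format" test reduces to asserting that each quantized tensor has no more than 2^bits distinct values.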

The working branch: https://github.com/afaji/Marian/tree/quant-alloc-fix