Open Faken93 opened 4 years ago
Removing bug label as this code is not integrated yet.
@Faken93, could you please post a description of your model (mainly dimensions and configuration size)? This issue happens when the number of rows in one of the model's matrices is not evenly divisible by the register stride we are using. We are working on a fix in the intgemm backend, but it is not ready yet. Until then, if possible, retrain your model.
To clarify, intgemm currently expects parameter matrices to be a multiple of 64 (inner dimension) x 8 (outputs). Retraining alone will not help; configuring the model dimensions to a multiple of that will. But as stated, I'm doing a rewrite that will support this.
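As a rough way to check a model against the 64 x 8 constraint described above, here is a minimal Python sketch. It is not part of Marian or intgemm: the function name find_misaligned and the example parameter names are hypothetical, and it assumes each matrix shape is given as (inner dimension, outputs). It also checks every 2-D parameter, although intgemm may only quantise some of them, so it can report matrices that never reach the failing assertion.

```python
def find_misaligned(shapes, inner_multiple=64, output_multiple=8):
    """Return (name, shape) pairs for 2-D matrices that break the size constraint.

    `shapes` maps a parameter name to its shape tuple, assumed to be
    (inner dimension, outputs). Non-2-D entries are skipped.
    """
    misaligned = []
    for name, shape in shapes.items():
        if len(shape) != 2:
            continue  # vectors and scalars are not checked
        inner, outputs = shape
        if inner % inner_multiple != 0 or outputs % output_multiple != 0:
            misaligned.append((name, tuple(shape)))
    return misaligned


# Hypothetical shapes for illustration only; real names and sizes will differ.
example = {
    "encoder_W": (512, 2048),   # 512 is a multiple of 64, 2048 of 8
    "decoder_W": (500, 2048),   # 500 is not a multiple of 64
    "some_vector": (2048,),     # skipped: not a 2-D matrix
}
print(find_misaligned(example))
```

To run this on a real model, the shapes could be read with NumPy, for example by loading base.npz with numpy.load and building the dictionary from data.files and data[name].shape. Whether the rows of a stored matrix correspond to the inner dimension is an assumption that should be checked against the model before relying on the result.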
Hi, now that PR #595 is merged, could you please provide documentation on this? Thanks!
@santhoshtr, apologies for the two-month-late response. We have two quantisation schemes, intgemm and fbgemm, each in 8-bit and 16-bit variants. fbgemm is limited to AVX2 and AVX512, whereas intgemm supports older hardware as well. The results vary from hardware to hardware.
You can see how to use it in the benchmark section: https://github.com/marian-nmt/marian-benchmarks/tree/master/benchmarks/translation_wngt20
You can also use an alternative version of intgemm that is used in the Bergamot branch, which is faster: https://github.com/browsermt/students/tree/master/train-student#5-optional-8bit-quantization
Please let me know if there is anything unclear.
Bug description
When I run the marian-server command, it fails with an assertion error. The message is as follows:
marian-server: /home/work/marian-dev/src/3rd_party/intgemm/avx512_gemm.h:307: static void intgemm::AVX512_8bit::PrepareBQuantizedTransposed(const int8_t, int8_t, intgemm::Index, intgemm::Index): Assertion `rows % kColStride == 0' failed.
How can I resolve this problem? @XapaJIaMnu @kpu @ykim362 @emjotde Thanks!
Background
I first convert the model with the marian-conv command:
marian-conv -f base.npz -t int8.bin -g intgemm8
Context