What is the principle behind getting 8 bytes per frame of audio?

I have looked at the inputs and outputs of the three tflite models:

- soundstream_encoder.tflite: tensor: float32[1,320] => tensor: float32[1,1,64]
- quantizer.tflite (encode): tensor: float32[1,1,64] => tensor: int32[46,1,1]
- quantizer.tflite (decode): tensor: int32[46,1,1] => tensor: float32[1,1,64]
- lyragan.tflite: tensor: float32[1,1,64] => tensor: float32[1,320]

I have also read the code in residual_vector_quantizer.cc, but I don't understand how the int32[46,1,1] tensor is converted into 8 bytes of encoded data per frame. Does anyone know the principle behind it?
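For reference, here is a minimal Python sketch of one way 46 int32 codes could end up as 8 bytes. It rests on assumptions that are not confirmed against residual_vector_quantizer.cc: that each code is a 4-bit codebook index (46 × 4 = 184 bits = 23 bytes per frame at most), and that an 8-byte frame (64 bits) keeps only the first 64 / 4 = 16 codes. The names `pack_codes` and `unpack_codes`, the MSB-first bit order, and the sample values are hypothetical.

```python
def pack_codes(codes, bits_per_code, num_bits):
    """Pack the first num_bits // bits_per_code codes into bytes, MSB first."""
    if num_bits % 8 != 0 or num_bits % bits_per_code != 0:
        raise ValueError("num_bits must be a multiple of 8 and of bits_per_code")
    num_used = num_bits // bits_per_code
    if len(codes) < num_used:
        raise ValueError("not enough codes for the requested number of bits")
    value = 0
    for code in codes[:num_used]:
        if not 0 <= code < (1 << bits_per_code):
            raise ValueError("code does not fit in bits_per_code bits")
        # Append this code's bits to the right of the bits collected so far.
        value = (value << bits_per_code) | code
    return value.to_bytes(num_bits // 8, "big")


def unpack_codes(data, bits_per_code):
    """Inverse of pack_codes: recover the list of codes that were kept."""
    num_bits = len(data) * 8
    num_codes = num_bits // bits_per_code
    value = int.from_bytes(data, "big")
    mask = (1 << bits_per_code) - 1
    return [
        (value >> (bits_per_code * (num_codes - 1 - i))) & mask
        for i in range(num_codes)
    ]


# Hypothetical quantizer output: 46 codes, each assumed to lie in [0, 16).
codes = [i % 16 for i in range(46)]
packed = pack_codes(codes, bits_per_code=4, num_bits=64)
restored = unpack_codes(packed, bits_per_code=4)
```

Under these assumptions `packed` is 8 bytes long and `restored` equals the first 16 codes. The remaining 30 codes are dropped, which a residual quantizer can tolerate because each later stage only refines the residual left by the earlier ones.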

google/lyra, issue #154