I have reviewed the inputs and outputs of the three tflite models, which are as follows:
soundstream_encoder.tflite:
tensor: float32[1,320] => tensor: float32[1,1,64]
quantizer.tflite encode:
tensor: float32[1,1,64] => tensor: int32[46,1,1]
quantizer.tflite decode:
tensor: int32[46,1,1] => tensor: float32[1,1,64]
lyragan.tflite:
tensor: float32[1,1,64] => tensor: float32[1,320]
I have also read the code in residual_vector_quantizer.cc, but I don't understand how the int32[46,1,1] tensor is converted into the 8-byte encoded data. Does anyone know the principle behind it?
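
For reference, here is a minimal sketch of one way the packing could work. It is a guess from the arithmetic, not taken from the actual code. It assumes 16 kHz audio, so a 320-sample frame is 20 ms (50 frames per second), and 8 bytes per frame is 64 bits, i.e. 3200 bps. It also assumes each of the 46 indices is a 4-bit codebook index (0 to 15), so that only the first 64 / 4 = 16 indices are kept and the remaining 30 are dropped at this bitrate. The names `BITS_PER_QUANTIZER`, `pack_indices` and `unpack_indices` are hypothetical.

```python
# Assumption: every quantizer layer emits a 4-bit index (16-entry codebook).
BITS_PER_QUANTIZER = 4


def pack_indices(indices, num_bits):
    """Pack the first num_bits / 4 indices, most significant first, into bytes."""
    if num_bits % BITS_PER_QUANTIZER != 0:
        raise ValueError("num_bits must be a multiple of BITS_PER_QUANTIZER")
    num_quantizers = num_bits // BITS_PER_QUANTIZER
    if num_quantizers > len(indices):
        raise ValueError("not enough indices for the requested bit budget")

    packed = 0
    for idx in indices[:num_quantizers]:
        if not 0 <= idx < (1 << BITS_PER_QUANTIZER):
            raise ValueError(f"index {idx} does not fit in 4 bits")
        # Shift earlier indices left and append this one in the low 4 bits.
        packed = (packed << BITS_PER_QUANTIZER) | idx
    return packed.to_bytes((num_bits + 7) // 8, "big")


def unpack_indices(data, num_bits):
    """Inverse of pack_indices: recover the indices that were kept."""
    num_quantizers = num_bits // BITS_PER_QUANTIZER
    packed = int.from_bytes(data, "big")
    mask = (1 << BITS_PER_QUANTIZER) - 1
    return [
        (packed >> ((num_quantizers - 1 - i) * BITS_PER_QUANTIZER)) & mask
        for i in range(num_quantizers)
    ]


if __name__ == "__main__":
    # Stand-in for the flattened int32[46,1,1] tensor.
    indices = [i % 16 for i in range(46)]
    frame = pack_indices(indices, num_bits=64)
    print(len(frame), frame.hex())
    print(unpack_indices(frame, num_bits=64))
```

Under the same assumptions, using all 46 indices would give 46 * 4 = 184 bits, or 23 bytes per frame. On the decode side, the dropped indices would have to be filled in somehow (for example with a placeholder value) before the int32[46,1,1] tensor is passed to the quantizer decode model; how that is actually handled is something the real code would need to confirm.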