It seems that the model in the tflite file differs from the model described in the 'SoundStream' paper, although both are basically 'conv encoder - residual VQ - conv decoder'.
Is there a description of the model structure for the released version of Lyra 1.3.2?