google / lyra

A Very Low-Bitrate Codec for Speech Compression
Apache License 2.0

encoded file size, audio quality and evaluation #59

Closed shoham5 closed 3 years ago

shoham5 commented 3 years ago

Hi,

  1. I tried to encode the "Original" file from the Google AI Blog, both at its original 16 kHz sample rate and after resampling it to 8 kHz. The source file was 163.5 kB at 16 kHz and 81.8 kB at 8 kHz, but after encoding with Lyra both files were 1.9 kB. Is this just a coincidence, or does Lyra not care about the source sample rate?

  2. When I decoded at 16 kHz, my Lyra output had a different size from the decoded file on the Google AI Blog, and of course the quality was different too. I would appreciate any explanation of that.

  3. How can I evaluate Lyra's quality? I found that PESQ and POLQA do not give a meaningful score because of the changes in alignment and phase.

aluebs commented 3 years ago

Hi shoham5, Please see my answers below.

  1. You are right, Lyra does not care about the source sample rate, because it resamples the input to 16kHz regardless.
  2. Yes, the quality discrepancy with the blog post samples is because the model used there was a newer generation. We are working on releasing a higher quality model here. For more details, please check out issue #25.
  3. Agreed that objective metrics like POLQA and PESQ are not well-suited to evaluate quality of generative models. We have been evaluating our models using listening tests like MOS or MUSHRA.
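
As a rough sanity check on point 1, the file sizes reported above can be turned into a duration and an effective bitrate. This is only a back-of-the-envelope sketch: it assumes the wav files are 16-bit mono PCM, ignores the small wav header, and treats 1 kB as 1000 bytes. None of that is stated in the thread, and the helper names are made up for illustration.

```python
# Back-of-the-envelope check using the sizes quoted in this thread.
# Assumptions (not confirmed in the thread): 16-bit mono PCM,
# wav header ignored, 1 kB = 1000 bytes.

def pcm_duration_seconds(size_bytes, sample_rate_hz, bytes_per_sample=2, channels=1):
    """Duration of raw PCM audio of the given size."""
    return size_bytes / (sample_rate_hz * bytes_per_sample * channels)

def bitrate_bps(size_bytes, duration_s):
    """Average bitrate in bits per second for a file of the given size."""
    return size_bytes * 8 / duration_s

# Source files: 163.5 kB at 16 kHz and 81.8 kB at 8 kHz.
dur_16k = pcm_duration_seconds(163_500, 16_000)
dur_8k = pcm_duration_seconds(81_800, 8_000)

# Encoded file: 1.9 kB in both cases.
encoded_bps = bitrate_bps(1_900, dur_16k)

print(f"duration from 16 kHz file: {dur_16k:.2f} s")
print(f"duration from 8 kHz file:  {dur_8k:.2f} s")
print(f"encoded bitrate:           {encoded_bps:.0f} bit/s")
```

Under those assumptions both source files come out at roughly 5.1 seconds of audio, and 1.9 kB over that duration is about 3 kbit/s. That is what you would expect if the encoded size depends only on the duration of the clip and not on the source sample rate.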

Hope that answers your questions. Please let me know if you have any follow-ups.

shoham5 commented 3 years ago

Thank you for the detailed answers.

Actually, I have a few follow-up questions:

  1. I tried encoding an mp4 file like the one in the Google blog, but Lyra can't handle this file type: "ABORTED: Failed to read from wav at path". Is there any plan to support this in the next generation?
  2. Is there an expected release date?

aluebs commented 3 years ago

  1. Currently our file tools only support reading and writing wav files. We do not plan to support other formats, to avoid bloating our implementation. You can always convert the mp4 to wav first using ffmpeg or any other tool.
  2. We don't have a set release date.
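
For the mp4 case, the conversion step suggested above could look something like the sketch below, which wraps the `ffmpeg` command line in Python. The function names and file paths are hypothetical, and the choice of 16-bit mono PCM at 16 kHz is an assumption about what is convenient to feed to Lyra, not something specified in this thread. It also assumes `ffmpeg` is installed and on the `PATH`.

```python
import subprocess

def build_ffmpeg_command(src, dst, sample_rate_hz=16_000):
    """Build an ffmpeg command that extracts the audio track to a wav file.

    Hypothetical helper for illustration. Output format (16-bit mono PCM)
    is an assumption, not a documented Lyra requirement from this thread.
    """
    return [
        "ffmpeg",
        "-y",                        # overwrite the output file if it exists
        "-i", str(src),              # input file (e.g. an mp4)
        "-vn",                       # drop any video stream
        "-ac", "1",                  # downmix to mono
        "-ar", str(sample_rate_hz),  # resample to the target rate
        "-c:a", "pcm_s16le",         # 16-bit little-endian PCM
        str(dst),
    ]

def convert_to_wav(src, dst, sample_rate_hz=16_000):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(src, dst, sample_rate_hz), check=True)

# Example usage (paths are placeholders):
# convert_to_wav("sample.mp4", "sample.wav")
```

The resulting wav file can then be passed to the Lyra file tools in place of the mp4.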