X-LANCE / SLAM-LLM

Speech, Language, Audio, Music Processing with Large Language Model
MIT License

Query on Metrics Reported in VSR Sub-Project Test Phase #95

Closed RookieJunChen closed 5 months ago

RookieJunChen commented 5 months ago

Excellent work! May I ask which metric is reported in the test phase of the VSR sub-project? Is it WER?

ddlBoJack commented 5 months ago

Yes, the reported metric is WER.
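
For readers unfamiliar with the metric: WER (word error rate) is conventionally the word-level edit distance between the hypothesis and the reference, divided by the number of reference words. Below is a minimal sketch, assuming whitespace tokenization and no text normalization. The function name `wer` is illustrative and is not taken from the SLAM-LLM codebase, whose evaluation script may differ in details such as casing or punctuation handling.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference length.

    Illustrative helper (an assumption, not the project's own scoring code).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match (cost 0) or substitution (cost 1)
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
    print(wer("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions are counted against the reference length, WER can exceed 100% when the hypothesis is much longer than the reference. Lower values are better.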

hanif-rt commented 3 months ago

Hi all, did you run any other experiments for VSR?

It seems quite promising that simple linear-projector fine-tuning plus adapter instruction tuning does so well (29.4% WER, compared to the 27.8% baseline for AV-HuBERT Large).

Have you tried any other approaches?