homebrewltd / ichigo

Llama3.1 learns to Listen

ci: Set up baseline evals using cascaded system #38

Open 0xSage opened 1 month ago

0xSage commented 1 month ago

Problem

Systematically evaluate the performance of our multimodal model by comparing it against a baseline. The baseline is a cascaded system of WhisperSpeech TTS + Llama3.1.
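A rough sketch of how such a comparison could be structured, not the project's actual eval code. It assumes each system can be wrapped as a callable from an input (for example an audio file path) to a text answer. `make_cascade`, `evaluate`, and the stub stages are hypothetical names for illustration, and exact-match accuracy stands in for whatever metric is chosen.

```python
import time
from typing import Callable, Sequence

# Hypothetical interface: a system maps one input (e.g. an audio path) to a text answer.
System = Callable[[str], str]


def make_cascade(*stages: Callable) -> Callable:
    """Chain stages so that each stage's output is the next stage's input."""
    def run(x):
        for stage in stages:
            x = stage(x)
        return x
    return run


def evaluate(system: System, inputs: Sequence[str], references: Sequence[str]) -> dict:
    """Return exact-match accuracy and mean per-item latency in seconds."""
    if not inputs or len(inputs) != len(references):
        raise ValueError("inputs and references must be non-empty and equal in length")
    correct = 0
    total_time = 0.0
    for item, ref in zip(inputs, references):
        start = time.perf_counter()
        output = system(item)
        total_time += time.perf_counter() - start
        # Case-insensitive exact match; a real eval would likely use a softer metric.
        correct += int(output.strip().lower() == ref.strip().lower())
    n = len(inputs)
    return {"accuracy": correct / n, "mean_latency_s": total_time / n}


# Stub stages standing in for the real models (assumptions, not real APIs).
def fake_speech_stage(path: str) -> str:
    return {"q1.wav": "what is 2 plus 2"}.get(path, "")


def fake_llm_stage(text: str) -> str:
    return "4" if "2 plus 2" in text else "unknown"


baseline = make_cascade(fake_speech_stage, fake_llm_stage)
report = evaluate(baseline, ["q1.wav", "q2.wav"], ["4", "paris"])
print(report["accuracy"])
```

The multimodal model would be passed to `evaluate` as a single callable on the same inputs and references, so both reports are directly comparable.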

Suggestions

tikikun commented 1 month ago

Maybe Bach can have a look? Otherwise, for latency we have the results in #40.