janhq/ichigo
Local realtime voice AI
Apache License 2.0
planning: Ichigo Encoder
#92
Open
dan-homebrew
opened
1 month ago
dan-homebrew
commented
1 month ago
Goal
Capture paralinguistic information such as emotion, which requires an acoustic encoder
We currently use a semantic encoder (WhisperVQ), so acoustic information is lost during tokenization
Options to explore: parallel token streams, WavLM
We need to define a target
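A toy sketch of the problem described above, not the WhisperVQ or Ichigo implementation. It assumes each audio frame can be reduced to two hypothetical scalar features (a "content" value and a pitch value) and uses made-up one-dimensional codebooks. Real encoders quantize high-dimensional vectors. The point is only that a quantizer driven by content alone assigns identical tokens to a calm and an excited rendition of the same words, while a second, parallel token stream keeps them apart.

```python
from typing import List, Sequence, Tuple

# Hypothetical 1-D codebooks, chosen only for illustration.
SEMANTIC_CODEBOOK = [0.0, 1.0, 2.0]        # stands in for "what was said"
ACOUSTIC_CODEBOOK = [100.0, 200.0, 300.0]  # stands in for pitch in Hz

Frame = Tuple[float, float]  # (content feature, pitch feature)


def nearest(codebook: Sequence[float], value: float) -> int:
    """Return the index of the codebook entry closest to value."""
    return min(range(len(codebook)), key=lambda i: abs(codebook[i] - value))


def semantic_tokens(frames: List[Frame]) -> List[int]:
    """Quantize only the content feature; the pitch feature is discarded."""
    return [nearest(SEMANTIC_CODEBOOK, content) for content, _ in frames]


def parallel_tokens(frames: List[Frame]) -> List[Tuple[int, int]]:
    """Emit one (semantic, acoustic) token pair per frame."""
    return [
        (nearest(SEMANTIC_CODEBOOK, content), nearest(ACOUSTIC_CODEBOOK, pitch))
        for content, pitch in frames
    ]


# Same words, spoken calmly (low pitch) and excitedly (high pitch).
calm: List[Frame] = [(0.1, 110.0), (1.9, 120.0)]
excited: List[Frame] = [(0.1, 290.0), (1.9, 310.0)]

# The semantic stream cannot tell the two renditions apart.
print(semantic_tokens(calm) == semantic_tokens(excited))
# The parallel streams can.
print(parallel_tokens(calm) == parallel_tokens(excited))
```

With these made-up values the first comparison is true and the second is false, which is the gap an acoustic encoder or parallel token stream would need to close.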
hahuyhoang411
commented
1 month ago
Good papers for this topic:
https://arxiv.org/abs/2402.05755
https://arxiv.org/abs/2402.12226
Goal