toverainc / willow

Open source, local, and self-hosted Amazon Echo/Google Home competitive Voice Assistant alternative
https://heywillow.io/
Apache License 2.0

Design question #369

Closed sjkoelle closed 8 months ago

sjkoelle commented 8 months ago

We are trying to understand the design choice to rely on streaming HTTP and on-device AEC rather than adapting a framework like WebRTC, e.g. using libpeer. What was the reason for this approach, and how did you evaluate the options? Apologies if this is too general a question.

kristiankielhofner commented 8 months ago

WebRTC is completely overblown for Willow.

The WIS instance is always a known reachable address using more-or-less fixed parameters based on device support. There is no need for candidate evaluation (ICE), negotiation, port multiplexing, DTLS, STUN, TURN, etc. The distinct separation between signaling and media is also completely unnecessary and introduces additional complexity. With HTTP we also gain things like auth and universal compatibility with all kinds of middleware components (from nginx to CDNs), without having to introduce even more complexity from the way these things are managed with WebRTC.

HTTP (or similar) for transport of signaling is already a requirement. WebRTC is a ton of code, network issues, and on-device consumption of resources that only results in a worse experience.
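To make the "known address, fixed parameters" point concrete, here is a minimal sketch of streaming audio over plain HTTP with a chunked request body. It is an illustration only, not Willow's or WIS's actual code: the host, port, path, the `Content-Type` header value, and the 16 kHz / 16-bit mono / 20 ms framing are all assumptions chosen for the example.

```python
import http.client
from typing import Iterable, Iterator, Tuple

# Assumed audio parameters for illustration (not taken from Willow).
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2  # 16-bit mono PCM
FRAME_MS = 20


def pcm_frames(pcm: bytes, frame_ms: int = FRAME_MS) -> Iterator[bytes]:
    """Split a PCM buffer into fixed-duration frames (the last may be short)."""
    frame_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * frame_ms // 1000
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]


def stream_audio(host: str, port: int, path: str,
                 frames: Iterable[bytes]) -> Tuple[int, bytes]:
    """POST frames to a known server address as they become available.

    Because the body is an iterable and no Content-Length is given,
    http.client sends it with Transfer-Encoding: chunked. There is no
    candidate gathering or session negotiation: one TCP connection to a
    fixed endpoint, and the response comes back on the same request.
    """
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request(
            "POST",
            path,  # hypothetical endpoint path
            body=frames,
            headers={"Content-Type": "application/octet-stream"},
        )
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

A caller would pass a generator that yields frames from the microphone as they are captured, for example `stream_audio("wis.local", 8080, "/hypothetical/stt", pcm_frames(buffer))`, where the hostname and path are placeholders. Anything that speaks HTTP (a reverse proxy, an auth layer) can sit in front of the server without the client changing.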

That said, WIS itself does support WebRTC for use with browsers (where it makes sense and works well).

sjkoelle commented 8 months ago

Thanks - that makes sense.