SEPIA-Framework / sepia-docs

Documentation and Wiki for SEPIA. Please post your questions and bug-reports here in the issues section! Thank you :-)
https://sepia-framework.github.io/
expand offline capabilities #189

Open andsofine opened 2 years ago

andsofine commented 2 years ago

Right now there are no open-source voice assistants that work without additional steps. An offline voice assistant would be very handy, I think, and not only for me.

fquirin commented 2 years ago

Most features of SEPIA already work offline, like all the NLU, STT, TTS, smart-home etc. Offline STT can't compete with Google, Apple, Microsoft etc. yet, but it works pretty well for specific domains. I'm also currently working on more features for the SEPIA STT server to better adapt to multiple domains on the fly.

Do you have a specific feature in mind that doesn't work offline yet?

Btw when I say offline I mean on-device or inside your private network, since you can run all SEPIA servers either on one device or on multiple connected devices.

andsofine commented 2 years ago

I mean without any network at all

fquirin commented 2 years ago

If you have a Raspberry Pi 4 with 4GB you can actually install SEPIA-Home, SEPIA-Client and SEPIA-STT Server next to each other removing all external connections, but you will be limited in certain points assuming you have no external network at all:

  • You can't load news articles
  • You can't check Wikipedia for answers
  • You don't get any weather data
  • You can't stream radio or Youtube music
  • You can't control smart-home devices unless they are directly connected to the RPi
  • You can't use maps or location search
  • You can't communicate with other SEPIA-Clients (remote actions, chat, etc.)

So it depends a little bit on what your primary use case is. In theory you could play music files stored on your device to replace streaming services. If you have a very powerful device you could even install complete Wikipedia and OpenStreetMap instances, but for weather and news you need to get real-time data from somewhere outside, and for smart-home I think you should have at least access to your own network or it won't be much fun 😅.

andsofine commented 2 years ago

If you have a Raspberry Pi 4 with 4GB you can actually install SEPIA-Home, SEPIA-Client and SEPIA-STT Server next to each other removing all external connections, but you will be limited in certain points assuming you have no external network at all:

  • You can't load news articles
  • You can't check Wikipedia for answers
  • You don't get any weather data
  • You can't stream radio or Youtube music
  • You can't control smart-home devices unless they are directly connected to the RPi
  • You can't use maps or location search
  • You can't communicate with other SEPIA-Clients (remote actions, chat, etc.)

So it depends a little bit on what your primary use case is. In theory you could play music files stored on your device to replace streaming services. If you have a very powerful device you could even install complete Wikipedia and OpenStreetMap instances, but for weather and news you need to get real-time data from somewhere outside, and for smart-home I think you should have at least access to your own network or it won't be much fun 😅.

My fault for not mentioning that we are talking about the mobile version of the client. I meant that the voice assistant should work without the Internet (and without a home server). This does not mean that it cannot have access to the Internet; it's just that when there is no Internet access, it will still be able to execute commands.

fquirin commented 2 years ago

Oh I see ^^. That is a pretty complicated task since the client has been developed from the ground up to offload tasks to different SEPIA services, but with the right device it is possible to install SEPIA server AND client on one mobile phone. There are some ongoing experiments with the Pinephone Pro for example, which is running a full Linux. In theory this could be done on Android as well, since Android is nothing but Linux and Android apps are even based on Java, same as the SEPIA server. Traditionally this has been complicated though, because Android was stuck on Java 7 for a very long time and doesn't really give users the option to access the system like a "normal" Linux.

andsofine commented 2 years ago

It seems to me that keeping your own server is not the best solution for an Android phone, or for a mobile phone in general. So I meant developing the application without using the server. But now I realize that there is no point in overloading an application that was designed to be used together with the server. Can I then ask you to think about creating a separate application (like Saiy assistant, which is abandoned)? Or is that asking too much?

fquirin commented 2 years ago

It is certainly possible since the client itself already has something like an "offline" mode that was implemented for testing and debugging. It currently fakes the server reply with hard-coded answers and a very very basic NLU, but it could be extended to create "real" actions. The problem is that everything the server handles needs to be replaced with completely new code for the client, at least the NLU and services part. That is a lot of code unfortunately :-/. Ultimately this would also limit a lot of features that are only possible with the server like user chats, remote-actions between clients, certain smart-home features and more.
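To make the idea of a client-side "offline" mode with hard-coded answers and a very basic NLU more concrete, here is a minimal Python sketch. It is not SEPIA's actual code: the rule table, the `fake_server_reply` function and the shape of the returned dictionary are all hypothetical and only illustrate the pattern of faking a server reply locally.

```python
import re

# Hypothetical rule table: (pattern, command name, hard-coded answer).
# None of these names come from the SEPIA code base.
OFFLINE_RULES = [
    (re.compile(r"\b(hello|hi|hey)\b", re.I), "chat", "Hello! I'm in offline mode."),
    (re.compile(r"\bhow are you\b", re.I), "chat", "I'm fine, thanks."),
]


def fake_server_reply(text):
    """Return a dict that imitates a (made-up) server response."""
    for pattern, command, answer in OFFLINE_RULES:
        if pattern.search(text):
            return {"command": command, "answer": answer, "offline": True}
    # Nothing matched: answer with a fixed fallback instead of failing.
    return {
        "command": "no_result",
        "answer": "Sorry, I can't do that offline.",
        "offline": True,
    }
```

Extending this to "real" actions would mean replacing the fixed answers with handlers, which is exactly the part that duplicates server code.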

To be honest I don't think an offline assistant has many real-life applications. Most features usually only make sense when you can connect to something, and if you can, you usually have access to your server as well. If you are completely offline and you want to do more than "small-talk" with your assistant you would need an enormous database of knowledge and music right on your mobile device ... which is kind of equivalent to running the server on-device 🤔 Well, that's my personal opinion at least.

It seems to me that keeping your own server is not the best solution for an android phone

It feels strange when you think about it, true, but in reality Android itself is running dozens of servers internally to offer all the features we use all day. So in reality it's not that uncommon. After all, the difference between a server and an app is just the communication layer ^^.

andsofine commented 2 years ago

I'm saying that it could work without the Internet. I didn't mean that it doesn't have Internet access (I would use it with the Internet, otherwise it's much less useful). What should work offline: STT, TTS, forwarding commands to basic applications (set an alarm, set a timer, set Karen's birthday on the fifteenth of August, etc.), small (or not small) things that can be integrated into the assistant app, and maybe I forgot something else.

You tell the assistant to play some music video on YouTube and it will use its offline code to tell the YouTube app to do that search. The YouTube app just shows a "no connection" error, but the task itself was completed. The YouTube app here is just an example of how it might work; for this one I wouldn't actually use this approach. I don't know how everything is arranged on YouTube, but I suppose that in order to immediately open the requested music video (or something), some kind of integration is needed: you need to send a message to some server that, at your request, gives you the video link (or something like that). I have nothing against that. It would be foolish to restrict such things, because they are needed for convenience, and convenience is what an assistant is for.

And as for the server, it still seems to me that it would not be good for autonomy at least (not to mention the rest).

fquirin commented 2 years ago

STT, TTS, forwarding commands to basic applications (set an alarm, set a timer, set Karen's birthday on the fifteenth of August, etc.), small (or not small) things that can be integrated into the assistant app, and maybe I forgot something else

It would be a bit tricky to manage (maybe via a manual switch), but one could put the app in some kind of "serverless" mode where it can still run some very basic voice commands in the form of Android intents or URL-scheme calls 🤔. The scope would be drastically limited, but for users that actually know what's going on (most will probably be confused) it could be useful.
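A rough sketch of what such a "serverless" mapping from phrases to Android intents could look like, written in Python for brevity. The action and extra strings are the values of the Android `AlarmClock` constants; everything else (the `to_intent` function, the regular expressions, the descriptor dictionary) is hypothetical, and actually firing the intent would need a native bridge that is not shown here.

```python
import re

# String values of Android's AlarmClock intent constants.
ACTION_SET_ALARM = "android.intent.action.SET_ALARM"
ACTION_SET_TIMER = "android.intent.action.SET_TIMER"
EXTRA_HOUR = "android.intent.extra.alarm.HOUR"
EXTRA_MINUTES = "android.intent.extra.alarm.MINUTES"
EXTRA_LENGTH = "android.intent.extra.alarm.LENGTH"  # timer length in seconds

UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}


def to_intent(text):
    """Map a phrase to an intent descriptor (hypothetical format) or None."""
    text = text.lower()
    m = re.search(r"set (?:a |the )?timer for (\d+) (second|minute|hour)s?", text)
    if m:
        seconds = int(m.group(1)) * UNIT_SECONDS[m.group(2)]
        return {"action": ACTION_SET_TIMER, "extras": {EXTRA_LENGTH: seconds}}
    m = re.search(r"set (?:an |the )?alarm for (\d{1,2}):(\d{2})", text)
    if m:
        hour, minute = int(m.group(1)), int(m.group(2))
        if hour < 24 and minute < 60:
            return {
                "action": ACTION_SET_ALARM,
                "extras": {EXTRA_HOUR: hour, EXTRA_MINUTES: minute},
            }
    # Unknown phrase: the caller decides whether to ask the server instead.
    return None
```

Returning `None` for anything unrecognized keeps the drastically limited scope explicit instead of guessing at an action.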

You tell the assistant to play some music video on YouTube and it will use its offline code to tell the YouTube app to do that search. The YouTube app just shows a "no connection" error, but the task itself was completed. ... but I suppose that in order to immediately open the requested music video (or something), some kind of integration is needed

I think the basic YouTube URL-scheme actually works if the app calls the URL via the external browser instead of trying to embed the video, something like https://www.youtube.com/results?search_query=jimi+hendrix (it just won't start autoplay, I guess).
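Building such a search URL from a spoken query needs no server at all. A minimal Python sketch, where the function name `youtube_search_url` is made up for illustration and only the URL pattern comes from the example above:

```python
from urllib.parse import urlencode


def youtube_search_url(query):
    """Build a YouTube search URL; urlencode turns spaces into '+'."""
    return "https://www.youtube.com/results?" + urlencode({"search_query": query})
```

For example, `youtube_search_url("jimi hendrix")` gives the URL shown above. Handing the result to the external browser is left to the platform.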

And as for the server, it still seems to me that it would not be good for autonomy at least (not to mention the rest)

There are drawbacks, no doubt, but overall I think it's still the better approach, unless of course you manage to build a system that does both things at once, running fully offline when it has to and syncing everything with the server when it can. That would be the dream, but it is a lot more work and requires some drastic changes in architecture :-/.
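The "both things at once" idea can be sketched as a thin wrapper that tries the server first, falls back to a local handler, and remembers what it handled offline. This is only an illustration under strong assumptions: the `HybridAssistant` class, its callbacks and the interpretation of "sync" as replaying queued requests are hypothetical and say nothing about how SEPIA would actually do it.

```python
class HybridAssistant:
    """Try the server first; fall back to a local handler when it is unreachable."""

    def __init__(self, server_call, offline_call):
        self.server_call = server_call    # callable(text), may raise ConnectionError
        self.offline_call = offline_call  # callable(text), must work without network
        self.pending = []                 # requests handled offline, not yet synced

    def handle(self, text):
        try:
            return self.server_call(text)
        except ConnectionError:
            self.pending.append(text)
            return self.offline_call(text)

    def sync(self):
        """Replay queued requests in order; return how many were synced."""
        synced = 0
        while self.pending:
            try:
                self.server_call(self.pending[0])
            except ConnectionError:
                break  # still offline, keep the rest for later
            self.pending.pop(0)
            synced += 1
        return synced
```

Even in this toy form the drawback is visible: every command needs a second, local implementation behind `offline_call`.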

andsofine commented 2 years ago

I'm sure it's just the opposite. Server functions need to be configured, so what I'm suggesting is a more user-friendly mode: fully automated, without user intervention (like Google Assistant but better; Google Assistant may work offline too, but I'm not sure).

Yes, it may be possible to build such functionality into the existing application, but this functionality will greatly increase the size of the application. So I say that it might be worth making a separate application, so that people using SEPIA to connect to the server do not suffer due to the increase in the size of the application.

As I said earlier, I think building on the work of "Saiy assistant" is a good idea and a good start.

fquirin commented 2 years ago

but this functionality will greatly increase the size of the application. So I say that it might be worth making a separate application, so that people using SEPIA to connect to the server do not suffer due to the increase in the size of the application

I thought about ways to include custom offline STT and TTS, maybe as optional downloads. As you've mentioned, it greatly increases size: at least 60 MB per language for the smallest default models. Another thing is that it won't run on every device and it eats a lot of RAM and battery, but it's certainly possible. I've not really followed up on this idea lately, since Android has supported offline STT for a while now (at least on some devices), and for open-source STT I'm currently implementing a new version of the SEPIA STT server that relies on swapping language models on the fly, which means a lot more MB, and the models probably have to be updated now and then.

To sum this up: I'd love to offer more offline features in the client but it is currently not something I can focus on. I will keep it in mind though and think about ways to build plugins that could work on server and client at the same time.