googlesamples / assistant-sdk-python

Samples and bindings for the Google Assistant API
http://developer.google.com/assistant/sdk
Apache License 2.0
916 stars, 321 forks

google-assistant-library deprecation and alternatives #356

Closed: blacklight closed this issue 4 years ago

blacklight commented 5 years ago

This issue is a follow-up of https://github.com/googlesamples/assistant-sdk-python/issues/355.

The google-assistant-library was reportedly deprecated on the 28th of June, which means that it might not work on newer ARM platforms such as the new Raspberry Pi 4.

With quite some hammering I have managed to adapt the pushtotalk.py sample to my purposes, but I've got a couple of questions:

shivasiddharth commented 5 years ago

It is not the board that is the issue here. My guess is that the glibc the Raspberry Pi Foundation has packaged with Buster is causing the issue. Reverting to an older version of glibc could potentially solve it. It needs some more work to pinpoint the exact cause, though.

blacklight commented 5 years ago

That might well be the case indeed. RPi3:

$ /usr/lib/libc.so.6 --version
GNU C Library (GNU libc) stable release version 2.28.

RPi4:

$ /lib/arm-linux-gnueabihf/libc.so.6 --version
GNU C Library (Debian GLIBC 2.28-10+rpi1) stable release version 2.28.

If that's the case then rebuilding libassistant against the new glibc could fix the issue, but given the library's deprecated state I'm not sure if Google will be keen to release a new build.

Downgrading glibc could indeed solve the issue but it's definitely not a long-term viable option.
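
To compare boards without hunting for the path of libc.so.6 on each distro, the same information can be read from Python. This is a minimal sketch that only uses the standard library (`os.confstr` and `platform.libc_ver`); the helper name `glibc_version` is made up for illustration, and it says nothing about which glibc libassistant was actually built against.

```python
import os
import platform


def glibc_version():
    """Return the glibc version of the running system as a string, or None."""
    try:
        # On glibc systems this returns a string such as "glibc 2.28".
        value = os.confstr("CS_GNU_LIBC_VERSION")
    except (AttributeError, ValueError, OSError):
        # Not available on this platform or C library.
        value = None
    if value:
        return value.split()[-1]
    # Fallback: let Python inspect its own executable.
    lib, version = platform.libc_ver()
    return version if lib == "glibc" and version else None


if __name__ == "__main__":
    print("machine:", platform.machine())
    print("glibc:", glibc_version())
```

Running it on both the RPi3 and the RPi4 gives two values that can be compared directly.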

jeronimojoe commented 5 years ago

Thanks for your extensive input, BlackLight. Fwiw, my original issue #351 was on a 3B+. The library code there is broken: it was working before and broke around the time the deprecation notice went up, so perhaps I inferred cause and effect from a correlation? Anyway, I have now retooled it with pushtotalk.py as a starting point, which is quite a hack, but I am reluctant to commit further resources in case this too is suddenly deprecated, as you point out. I am new to working with Google APIs; is this the norm?

shivasiddharth commented 5 years ago

@jeronimojoe, if it's just the 3B+, then use Raspbian Stretch. You can have the hotword running. This came out of the blue. Since Google also has its AIY project, which relies heavily on the assistant library, God knows what's happening: either they will release an improved version or scrap the AIY thing altogether. But this is not new. Amazon deprecated their avs-sample-app but replaced it with an SDK version.

jeronimojoe commented 5 years ago

Thanks for your input, but I am running Stretch! However, it is a bit of a bastardised project; I never had the voice kit, so I was hacking it together with google.assistant.library, aiy.assistant.voice and aiy.assistant.auth_helpers (from the November 2018 AIY release here). I was then extending the app to respond to my own commands if certain hotwords were recognised, e.g. 'play' would play music or a video. Initially it worked well; then occasional seg faults started. I noticed that commands which might already be used by Assistant in other instantiations tended to cause the seg faults, notably 'play' and 'stop'. As I was experimenting with other commands ('run', 'cease', 'desist'(!)), the seg faults started to occur on every connection to Assistant. This was around the date of the deprecation.

I went right back to basics, reinstalled Stretch, installed the basic packages again, just basic Assistant Library plus the AIY bits mentioned above. Seg faults every time. Figured it must be the deprecation.

Anyway, this is all pretty much FYI; I have abandoned the deprecated Library, and am doing it all with Service + Snowboy now. I was having trouble sharing sound resources between the two, but have it sorted now (in a not particularly elegant fashion, so if anyone has any tips here, I would love to hear them!)

Many thanks!

shivasiddharth commented 5 years ago

@jeronimojoe Here is a sample of snowboy integration for the pushtotalk https://github.com/shivasiddharth/GassistPi/blob/master/src/pushbutton.py hope it helps.

PS: Library is running with stretch and 3B+

blacklight commented 5 years ago

@jeronimojoe there's no elegant solution to embed the extremely hackish pushtotalk.py script into a project. I've gone through that script extensively: it's not a library, it's not an API, it's not a quickstart example, it feels like it's been put together in one single Friday by a developer eager to commit it and then walk to the company's drinks. It's something light-years away from the level of quality we are entitled to expect from companies like Google when it comes to providing clear documentation, clear examples and a smooth quickstart.

I have modified the sample here to also support custom callbacks (on_conversation_start, on_conversation_end, on_speech_recognized, on_response and on_volume_changed), otherwise it's pretty much useless - it doesn't keep track of the state and it just prints things out in the over-bloated assist() method with no support for hooks. Sample interaction:

with SampleAssistant(language_code=language,
                     device_model_id=self.device_model_id,
                     device_id=self.device_id,
                     conversation_stream=self.conversation_stream,
                     display=None,
                     channel=self.grpc_channel,
                     deadline_sec=self.grpc_deadline,
                     device_handler=self.on_device_action(),
                     on_conversation_start=self.on_conversation_start(),
                     on_conversation_end=self.on_conversation_end(),
                     on_volume_changed=self.on_volume_changed(),
                     on_response=self.on_response(),
                     on_speech_recognized=self.on_speech_recognized()) as self.assistant:
    continue_conversation = True

    while continue_conversation:
        continue_conversation = self.assistant.assist()
        ...

However, it still requires the client code to initialize the audio_helpers and device_helpers, but at least it clearly separates the assistant logic from the audio device logic. A full working example is here. It still feels far from perfect to me, but I'm not sure that I want to invest more time in polishing the API unless the gentlemen at Google are keen to share with us developers, who build things on top of their products, what their plan is for the future of the assistant.
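
The hook idea described above can be sketched without any of the gRPC or audio plumbing. This is an illustrative sketch only: `HookedAssistant` and `make_fake_backend` are hypothetical names, and the backend is a scripted stand-in that returns (recognized phrase, reply, follow-up expected) tuples instead of talking to the Assistant service. It is not the modified SampleAssistant itself.

```python
class HookedAssistant:
    """Hypothetical wrapper that tracks conversation state and fires optional hooks."""

    def __init__(self, backend, on_conversation_start=None,
                 on_speech_recognized=None, on_response=None,
                 on_conversation_end=None):
        self._backend = backend          # callable returning (phrase, reply, follow_up)
        self._in_conversation = False    # the state the plain sample does not keep
        self._hooks = {
            "on_conversation_start": on_conversation_start,
            "on_speech_recognized": on_speech_recognized,
            "on_response": on_response,
            "on_conversation_end": on_conversation_end,
        }

    def _fire(self, name, *args):
        hook = self._hooks.get(name)
        if hook is not None:
            hook(*args)

    def assist(self):
        """Run one turn; return True if a follow-up turn is expected."""
        if not self._in_conversation:
            self._in_conversation = True
            self._fire("on_conversation_start")
        phrase, reply, follow_up = self._backend()
        self._fire("on_speech_recognized", phrase)
        self._fire("on_response", reply)
        if not follow_up:
            self._in_conversation = False
            self._fire("on_conversation_end")
        return follow_up


def make_fake_backend(turns):
    """Return a callable that replays scripted turns (stand-in for the real service)."""
    iterator = iter(turns)
    return lambda: next(iterator)


def demo():
    events = []
    assistant = HookedAssistant(
        backend=make_fake_backend([
            ("what time is it", "It is noon", True),
            ("thanks", "You're welcome", False),
        ]),
        on_conversation_start=lambda: events.append("start"),
        on_speech_recognized=lambda text: events.append(("heard", text)),
        on_response=lambda text: events.append(("said", text)),
        on_conversation_end=lambda: events.append("end"),
    )
    while assistant.assist():
        pass
    return events


if __name__ == "__main__":
    print(demo())
```

The client loop stays the same as in the snippet above (`while assistant.assist()`), while everything that was previously printed inside assist() goes through the hooks.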

p.s. I've also got a bunch of RPi3 and 3B+ around that still run code based on the assistant library and it still works fine. I expect the library to break on devices with different glibc versions if libassistant_*.so wasn't compiled against them, but it shouldn't segfault on the RPi3, unless there's something else broken. However, even if the assistant library works, I probably wouldn't rely on a deprecated library in my own project.

proppy commented 5 years ago

@BlackLight Thanks for the detailed feedback. I tried to capture the different parts of it in #358, #359 and #357; let me know if I missed anything.

blacklight commented 5 years ago

@proppy thanks for tackling this :+1:

How about hotword detection, the support for timers, alarms and news and the other features that were part of the library? Will they be migrated to the gRPC service or will they be dropped?

proppy commented 5 years ago

@BlackLight Let's discuss hotwording in #357

For timers, alarms, news and other feature requests, that functionality would first need to be exposed in the Google Assistant Service before corresponding bindings and higher-level events could be added to the Python SDK. I filed #360 so developers have a way to escalate those to the product team in charge of maintaining the service.

BenjaminFaal commented 5 years ago

@BlackLight i made this: https://github.com/BenjaminFaal/google-assistant-library, it works on my RPi4

blacklight commented 5 years ago

@BenjaminFaal I see that you're copying the libassistant_embedder.so from the library release package (I assume it's the armv7 version also for RPi4). That's more or less what I've got already in place but it's segfaulting for me. Are you using Raspbian or another distro? Could you please post the output of ldd libassistant_embedder.so to see if it's linked against different libraries on your system? (It's just out of curiosity to understand the real reason of the segfault, I won't be using the library again in my project given its deprecated status).

@proppy +1 for the generator approach in the Python assistant API, but I would also consider support for custom event callbacks/hooks. The pushtotalk example already applies a similar pattern with the device_handler option (even though I haven't yet managed to send device events, nor have I found sufficient documentation on how it's supposed to be used). Wouldn't it be a good idea to keep the same pattern for generic assistant events as well? Ideally, I see two main usage patterns for the assistant: a synchronous one, where a generator approach works best:

with Assistant(**opts) as assistant:
    for event in assistant:
        print(event)

and an asynchronous one, where the handlers might be executed in other threads, and in this kind of approach the callback pattern can really help keeping the code clean:

with Assistant(on_conversation_start=on_conversation_start, ..., **opts) as assistant:
    # Do other stuff
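
Both usage patterns can sit on top of the same event queue. The sketch below is purely illustrative: `EventSource` is a hypothetical stand-in that replays scripted (name, payload) events, where a real implementation would open the audio stream and the gRPC channel in `__enter__`. One instance is meant to be used in one style only, either iterated or given callbacks.

```python
import queue
import threading


class EventSource:
    """Hypothetical assistant stand-in that emits scripted (name, payload) events."""

    _DONE = object()  # sentinel marking the end of the event stream

    def __init__(self, scripted_events, **callbacks):
        self._scripted = list(scripted_events)
        self._callbacks = callbacks  # e.g. on_response=some_function
        self._queue = queue.Queue()
        self._thread = None

    def __enter__(self):
        for event in self._scripted:
            self._queue.put(event)
        self._queue.put(self._DONE)
        if self._callbacks:
            # Asynchronous style: dispatch events to callbacks on a worker thread.
            self._thread = threading.Thread(target=self._dispatch, daemon=True)
            self._thread.start()
        return self

    def __exit__(self, exc_type, exc, tb):
        if self._thread is not None:
            self._thread.join()  # wait until every event has been dispatched
        return False

    def __iter__(self):
        # Synchronous style: the caller pulls events one by one.
        while True:
            event = self._queue.get()
            if event is self._DONE:
                return
            yield event

    def _dispatch(self):
        for name, payload in self:
            handler = self._callbacks.get("on_" + name)
            if handler is not None:
                handler(payload)


SCRIPT = [("conversation_start", None), ("response", "Hello"), ("conversation_end", None)]


def demo_generator():
    with EventSource(SCRIPT) as assistant:
        return [name for name, _ in assistant]


def demo_callbacks():
    responses = []
    with EventSource(SCRIPT, on_response=responses.append):
        pass  # the main thread is free to do other work here
    return responses


if __name__ == "__main__":
    print(demo_generator())
    print(demo_callbacks())
```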

BenjaminFaal commented 5 years ago

@BlackLight for RPi4 it's indeed the armv7 .so file. I got it to work on the https://downloads.raspberrypi.org/raspbian/images/raspbian-2019-06-24 Raspbian Buster image with no modifications other than installing Node.js. I am not at home right now, so I don't have access to the RPi4 to give you the output of ldd libassistant_embedder.so. I also don't understand why it segfaults when used with Python; I just made the Node.js version as an experiment to see if it would work on the RPi4, and indeed it did.

I just noticed that there is a newer version of Raspbian Buster; maybe that will fix the segfault: https://downloads.raspberrypi.org/raspbian/images/raspbian-2019-07-12

proppy commented 5 years ago

@BlackLight device_handler is related to device actions, those are documented here for smarthome trait (OnOff, StartStop) and there for custom device action (i.e: your own query pattern).

The pushtotalk sample itself implements both, see: https://github.com/googlesamples/assistant-sdk-python/blob/master/google-assistant-sdk/googlesamples/assistant/grpc/pushtotalk.py#L421 and: https://github.com/googlesamples/assistant-sdk-python/blob/master/google-assistant-sdk/googlesamples/assistant/grpc/pushtotalk.py#L428 (with the associated action package defining the custom grammar for those actions).
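
The general shape of such a handler is a mapping from command names to Python functions. The sketch below is a self-contained illustration under stated assumptions: `CommandRegistry` is a hypothetical stand-in (not the SDK's device_helpers code), the request dictionary is a simplified shape rather than the real device action payload, and `com.example.commands.BlinkLight` is only an illustrative custom command name.

```python
class CommandRegistry:
    """Hypothetical stand-in for a device request handler."""

    def __init__(self):
        self._handlers = {}

    def command(self, intent):
        """Decorator that registers a function as the handler for `intent`."""
        def decorator(func):
            self._handlers[intent] = func
            return func
        return decorator

    def dispatch(self, request):
        """Call the registered handler for each execution in a simplified request."""
        results = []
        for execution in request.get("executions", []):
            handler = self._handlers.get(execution["command"])
            if handler is None:
                continue  # no handler registered for this command
            results.append(handler(**execution.get("params", {})))
        return results


device_handler = CommandRegistry()


@device_handler.command("action.devices.commands.OnOff")
def onoff(on):
    # Built-in smart home trait: switch something on or off.
    return "light on" if on else "light off"


@device_handler.command("com.example.commands.BlinkLight")
def blink(speed, number):
    # Custom device action with its own parameters.
    return f"blink {number} times at {speed} speed"


if __name__ == "__main__":
    request = {"executions": [
        {"command": "action.devices.commands.OnOff", "params": {"on": True}},
        {"command": "com.example.commands.BlinkLight",
         "params": {"speed": "slow", "number": 3}},
    ]}
    print(device_handler.dispatch(request))
```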

Sorry those were not more visible in the repo, I filed #361 to discuss how to address this.

Let's discuss the event API in #358. I personally prefer the generator-based approach, but I updated the issue title and description to be more generic so that we can discuss alternative designs in the comments.

shivasiddharth commented 5 years ago

Older versions of the assistant seem to work even with Buster: library version 0.1.1 and samples version 0.4.4.

BishalLamsal123 commented 5 years ago

@shivasiddharth I have been following you and your YouTube videos for the last 5 months. I am trying to make a robot for my school project. I made a Google Assistant using your GassistPi, but it is not working right now. I have tried rewriting Raspbian OS more than 20 times, but it is still not working. Please help me, I have to submit my project in a few months. I am from a middle-class family. I have sold my cycle and other gadgets to make this project, and now it is showing an error. Please provide some solution. Please upload instructions to your blog or website on how to install the Google Assistant Service, disable the push-to-talk method and enable the Snowboy headless method. I am only 13 years old, so I don't know how to run the Snowboy push-to-talk sample either. Please help me.

blacklight commented 5 years ago

@BishalLamsal123 please take a look at my Reddit post where I've published a proof of concept on how to get Snowboy+pushtotalk to work together.

Note however that this is a hackish workaround while the Assistant guys figure out how to fix the mess they have created with the sudden deprecation of the Assistant library and the consequent negligence about uploading a version of the library that also works with Raspbian's latest libc.

@shivasiddharth @proppy is there any timeline for a new pushtotalk API/sample? If you're open to PRs then I could prepare one, as I've already adapted the script to my purposes and written a new wrapper, as long as we agree on the interface (decorator vs. inheritance vs. event callbacks I guess?).

proppy commented 5 years ago

@BlackLight sorry for the late reply (I was OOO). I'd like us to fix #359 first, to resolve the ambiguity about the current state of the google-assistant-library as you pointed out in your initial comment https://github.com/googlesamples/assistant-sdk-python/issues/356#issue-467311065

blacklight commented 5 years ago

@proppy sure, that has indeed higher priority - you can't really deprecate something unless you clearly communicate its deprecation :)

Let me know when you'd like to start tackling the new programming interface for the assistant, I'd be interested in bringing my 2 cents in.

acidfreako commented 4 years ago

Hi guys, I am on a Raspberry Pi 4 and followed the rabbit hole down to this post. What is my next best option that doesn't involve installing a new Raspbian or a hacky push-to-talk solution?

blacklight commented 4 years ago

Hi @acidfreako, I've managed a while ago to hammer the gRPC examples + Snowboy for hotword detection and make it work in my project, but since I see many users lost after Google's sudden deprecation of the assistant library I've decided to put together a repository that wraps the pushtotalk.py functionalities into something more usable and configurable and implements hotword detection on top of it through Snowboy (the gRPC service won't come with hotword support unfortunately).

If you use my repo, assistant interaction should be as simple as this:

import sys

from assistant import Assistant

assistant = Assistant()

while True:
    print('Press ENTER to start a conversation, Ctrl+C to terminate')

    try:
        sys.stdin.readline()
    except KeyboardInterrupt:
        print('\nExiting the assistant')
        break

    interactions = assistant.start_conversation()
    print(interactions)
    assistant.stop_conversation()

And hotword support + assistant would look like this:

from assistant import Assistant
from assistant.hotword import HotwordService

assistant = Assistant()
service = HotwordService(model='/path/to/your/model/file', assistant=assistant)
service.start()
service.join()

Feel free to use the code and extend it, but please don't blame me too much for missing functionalities or bugs on Google's side :) this is only intended to be a workaround while we wait for the guys at Google to get their shit together and provide a new interface after deprecating the library overnight.

acidfreako commented 4 years ago

Thanks @BlackLight, if I really need it I will use your workaround. I will continue with the other parts of my personal project.

proppy commented 4 years ago

@acidfreako @BlackLight I published an updated version of google-assistant-library to TestPyPI that adds a deprecation notice to clear up the situation, see https://github.com/googlesamples/assistant-sdk-python/issues/359#issuecomment-530801818

I have only tested it on x86_64 for now. Once it is tested on Raspberry Pi 0/3/4 I will publish it to the official PyPI repo; that should clear up the situation on the library deprecation and enable further updates to be made to the gRPC wrapper.

acidfreako commented 4 years ago

@proppy thanks for the reply. Does that mean gRPC will get an update for hotword detection?

proppy commented 4 years ago

@acidfreako The current scope of the gRPC bindings is to expose the Google Assistant Service surface to Python, so we're unlikely to add google-assistant-library-like hotwording to the google-assistant-grpc package.

But it would be nice to document how to interface with 3rd-party hotwording in the form of samples. I filed https://github.com/googlesamples/assistant-sdk-python/issues/357 for tflite micro, but it would also be good to link to (or have official samples for) other popular OSS hotwording packages (like the Snowboy one shared by @BlackLight).

acla6525 commented 4 years ago

Just found this, thanks so much @BlackLight!

blacklight commented 4 years ago

@proppy my workaround with Snowboy does its job relatively well, even though a voice model trained with around 300-500 samples will obviously never be as good as the native OK Google hotword voice model.

As power users/developers I feel however that we're entitled to a few answers. Many of us have packed the assistant library into other open source projects, and before investing time and resources in implementing workarounds or picking alternatives we'd like Google to provide us with a few official answers:

  1. Is this situation temporary or is it the new normal? In other words, is Google keen to provide a new open interface for hotword detection, timers, alarms etc., or is the deprecation also functionality-wise something irreversible and Google has decided to leave us stranded for good?

  2. If a new interface is on its way then what's the expected ETA? (so we know whether it's worth waiting for it or implementing workarounds in our projects)

  3. If no new interface is on its way then when is Google going to provide a decent updated example to replace the hackish pushtotalk.py script, and how should the owners of the few hundred projects that rely on the library handle the migration?

proppy commented 4 years ago

@BlackLight Thanks for the follow-up; an attempt at replying to your questions is below:

Let me know if this steers the discussion in the right direction.

blacklight commented 4 years ago

@proppy thanks for the extensive answer.

  1. The idea of integrating with micro_speech, Snowboy and similar services is amazing, but the problem remains the same: both provide the bare-bones infrastructure for training voice models, but a model trained on a few hundred samples can't compete with the OK Google hotword model, which has been trained (I believe :) ) on millions of samples. I've noticed it myself already: even the most accurate Snowboy models have way more false positives and false negatives than the OK Google model. So I would rephrase my question as: is Google keen to provide us with their own OK Google voice model in a format compatible with any of the frameworks discussed above, or will hotword detection be completely delegated to external parties, and the training of the model to the user? In the latter case the loss in terms of accuracy will certainly be noticeable.

  2. Device actions and traits can indeed replace timers and alarms (and if done properly they would provide a unique entry point to manage actions), but we probably need a better interface and better documentation for them. Also, timers and alarms were previously managed natively by the library, while now they have to be explicitly managed by the developer: I'm wondering if this is really needed.

  3. I see the value of providing a low-level interface, but in my opinion it shouldn't be so low-level that it exposes all the gRPC, audio device and API endpoint internals to the user and ends up with a 450-line quickstart example. Low-level interface shouldn't mean badly designed interface :)

proppy commented 4 years ago

  1. Maybe Speech Commands would provide a better starting point for the hotwording model; it's an open dataset that was created from crowdsourced recordings of 100k+ samples on https://aiyprojects.withgoogle.com/open_speech_recording.
  2. Volume is another example of inconsistency: it could also be handled with a Device Action, but it's currently managed using a dedicated field in the gRPC API https://github.com/googleapis/googleapis/blob/master/google/assistant/embedded/v1alpha2/embedded_assistant.proto#L257. If those were all exposed using Device Actions it would be easier to handle them consistently in an upper-level layer (like the one discussed in https://github.com/googlesamples/assistant-sdk-python/issues/358).
  3. Part of the complexity comes from the fact that the API itself is pretty complicated. It provides a single generic bidirectional streaming method (rpc Assist(stream AssistRequest) returns (stream AssistResponse)) with nested union-like message types (depending on whether you're handling text, audio or events). It might be worth exploring simplifying the API by providing simpler dedicated methods for handling text and events separately from the audio streaming endpoint, so that every language (not just Python) could benefit from the reduced complexity.
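
To make the third point concrete, here is a toy model of the "one streaming method, union-like messages" shape in plain Python. Everything in it is a simplified stand-in: `Request`, `Response`, `fake_assist` and the field names are hypothetical and are not the generated protobuf classes or the real service. It only illustrates why client code ends up as a request generator plus a chain of "which field is set?" checks.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional, Tuple


@dataclass
class Request:
    """Stand-in for a union-like request: exactly one field is set."""
    config: Optional[dict] = None
    audio_in: Optional[bytes] = None


@dataclass
class Response:
    """Stand-in for a union-like response: any of the fields may be set."""
    transcript: Optional[str] = None
    audio_out: Optional[bytes] = None
    volume_percentage: Optional[int] = None


def make_requests(config: dict, audio_chunks: Iterable[bytes]) -> Iterator[Request]:
    """Yield the configuration first, then one request per audio chunk."""
    yield Request(config=config)
    for chunk in audio_chunks:
        yield Request(audio_in=chunk)


def fake_assist(requests: Iterable[Request]) -> Iterator[Response]:
    """Toy server: counts the audio bytes, then streams back mixed responses."""
    received = 0
    for request in requests:
        if request.audio_in is not None:
            received += len(request.audio_in)
    yield Response(transcript=f"heard {received} bytes")
    yield Response(audio_out=b"\x00" * 4)
    yield Response(volume_percentage=50)


def handle(responses: Iterable[Response]) -> List[Tuple[str, object]]:
    """Branch on whichever field each response carries."""
    log = []
    for response in responses:
        if response.transcript is not None:
            log.append(("transcript", response.transcript))
        if response.audio_out is not None:
            log.append(("audio", len(response.audio_out)))
        if response.volume_percentage is not None:
            log.append(("volume", response.volume_percentage))
    return log


if __name__ == "__main__":
    requests = make_requests({"language": "en-US"}, [b"ab", b"cde"])
    print(handle(fake_assist(requests)))
```

Dedicated methods for text and events, as suggested above, would let most of the branching in `handle` disappear from client code.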

blacklight commented 4 years ago

@proppy is there any progress/ETA with replacing the features of the discontinued library? The new Mozilla DeepSpeech engine has quite some potential to cover many of the features orphaned by the deprecation of the assistant library, and it'd allow me to have better hotword detection (through a Tensorflow Lite model trained with thousands of samples) than a sh**ty Snowboy model trained with just 300 samples.

However, it'd also require a lot of work from me to write a new implementation of the assistant class, and after working over the past year on integrating first with the Assistant Library, then with Push-to-talk, then Snowboy, then with three different versions of the Alexa SDK, I'd rather not waste any more of my time: if I have to write one more assistant integration, then that'd better be the ultimate one.

Please let me know if anything is happening or is planned to happen on Google's side. If not, it'll take me a lot of work but I'll just move away from Google altogether for building my DIY assistant.

And, if nothing is planned, it'd be great to at least have a build of the assistant library for RPi4 and the libc version installed there - deprecation with no alternatives and no plan is never a good thing.

proppy commented 4 years ago

> And, if nothing is planned, it'd be great to at least have a build of the assistant library for RPi4 and the libc version installed there - deprecation with no alternatives and no plan is never a good thing.

@BlackLight Not sure if you've seen https://github.com/googlesamples/assistant-sdk-python/releases/tag/0.6.0. It introduces a new build of https://pypi.org/project/google-assistant-library/1.1.0/ which adds the deprecation notice (#359) and also fixes the segfault (#355) that was happening in the ctypes bindings. I'd be curious to know if it fixes your issue on RPi4.

blacklight commented 4 years ago

Thanks! With so many issues open around it's hard to keep track of all the progress :)

I've briefly tested the import of the library on a RPi4 over SSH and there's no segfault now (I haven't yet tested the hotword detection, but I'm assuming that if the ctypes were the only issue it should work). That should definitely fix things for now - while I'll still wait for a proper replacement for the library ;)

shivasiddharth commented 4 years ago

I can confirm that hotword detection works with the latest Buster release. It seems it was an issue with Python 3.7 and ctypes. Even without #355, the library works with Python 3.6/Python 3.5 on Buster.

shivasiddharth commented 4 years ago

@BlackLight, I would suggest you take a look at https://github.com/Picovoice for hotwording as an alternative to Snowboy.

tmigone commented 4 years ago

@BlackLight thanks for putting up the good fight for us devs 😆 What happened here is really unbelievable coming from google.

I noticed on your platypush project you are also using picovoice solutions for hotword detection. How does it compare to snowboy and google's deprecated lib? Just learnt that snowboy will shut down EOY, so we seem to be running out of good alternatives...

blacklight commented 4 years ago

@tmigone sad to hear that Snowboy is shutting down, that project has really been an interesting sparkle in the world of voice assistants...

I've recently written an article comparing several integrations for voice assistants in platypush (Google library legacy vs. push-to-talk vs. AVS vs. Snowboy vs. Picovoice vs. Mozilla DeepSpeech). I've been really impressed by the performance of Picovoice: both hotword detection and speech detection are almost on par with Google (except for Google's smarter context awareness). I'm much less impressed by their choice to keep most of the code closed-source, as well as by the cumbersome commercial process they've put in place (you need to apply for a commercial license to get most of their libraries for non-x86 architectures, and that involves the old-fashioned fill-the-form-and-we'll-reach-back approach). Unless they change their approach I've got more faith in DeepSpeech: since it relies on a pure TensorFlow model it's also way more flexible, but it's still way too heavy for real-time detection in all of my tests.

Still, this could have been an amazing chance for Google to release a solid assistant SDK. It's quite a shame that they're letting other solutions pick up the baton.

proppy commented 4 years ago

@tmigone another hotwording alternative worth looking at is: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/micro/examples/micro_speech It has the advantage of running on microcontrollers.

I believe the original issue reported by @BlackLight is now addressed:

Feel free to reopen this issue (or file new ones) if you have more concerns.