Picovoice / eagle

On-device speaker recognition engine powered by deep learning
Apache License 2.0

C and Python: Demos and binding v0.1 #1

Closed mrrostam closed 1 year ago

mrrostam commented 1 year ago

The GitHub Actions workflows currently run only on Ubuntu and use only the Linux library file. Not all library files are available yet, and my pull request has to be merged before all artifacts can be obtained.

PS-1: @laves @ksyeo1010 @ErisMik @dominikbuenger: GitHub doesn't let me assign all of you as reviewers at the same time because the repo is private, but I believe you can still leave your comments. If that's not the case, let me know and we can find another way to do this.

PS-2: It turns out private repos don't have access to the organization secrets. For the action I'm using a plain AccessKey string for now, which will be replaced once Eagle is made public.

PS-3: I'm still working on the zoo-dev side and anticipate further modifications to the library and parameter files, but feel free to give Eagle a try and share your ideas, opinions, and experience with it.

PS-4: @bejager Hey, I completely forgot that you're also part of the Picovoice organization on GitHub. My bad! I would love to hear your thoughts and comments too.

mrrostam commented 1 year ago

I built and ran the demos and found a few more minor things. Now I'm done for real. Looks really good :)

:) All good. In fact, I really appreciate all the comments and subtle changes you guys suggested. They have greatly improved this PR. Thank you!

bejager commented 1 year ago

Just played around with the demo and it looks great! I intentionally did not read the README to get a "first-time experience" and everything was quite clear to me.

One super minor thing I found slightly confusing was in the enrollment part. When we only use one of the enroll_1.wav files we get this message:

Enrolled audio file resources/audio_samples/enroll_1.wav [Enrollment percentage: 97.00% - Enrollment feedback: Good audio]
[ERROR] Cannot export speaker profile before enrollment is complete

It makes sense, of course, but maybe we could make it a bit clearer somehow, e.g. by letting the user know what they can do about the failed enrollment. This could be as simple as mentioning that they can add another audio file to enroll_audio_paths, or adding another printout after the Enrolled audio file... printout as a sort of summary of the enrollment. Not sure how relevant you think this is.

mrrostam commented 1 year ago

Good idea! Added to both the C and Python demos.
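
For anyone reading along, here is a minimal sketch of the kind of summary message suggested above. It is not the code that was added to the demos: `enrollment_summary` is a hypothetical helper, and the wording of the hint is an assumption. It only illustrates telling the user what to do when enrollment ends below 100%.

```python
from typing import Sequence


def enrollment_summary(percentage: float, audio_paths: Sequence[str]) -> str:
    """Build a closing message for the enrollment step (hypothetical helper)."""
    if percentage >= 100.0:
        return "Enrollment complete. The speaker profile can now be exported."
    # Enrollment stopped short: say so and point at the fix.
    return (
        f"Enrollment is only {percentage:.2f}% complete after "
        f"{len(audio_paths)} audio file(s). "
        "Add more audio files to enroll_audio_paths to reach 100% "
        "before exporting the speaker profile."
    )


if __name__ == "__main__":
    print(enrollment_summary(97.0, ["resources/audio_samples/enroll_1.wav"]))
```

Printing this once after the per-file lines would give the user a next step before the export error appears.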

mrrostam commented 1 year ago

Following @dominikbuenger's suggestion, I have modified the input argument parser of the Python demos. Using the parent feature, the input arguments are now ordered in a more intuitive and user-friendly way.
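
A rough sketch of the idea, assuming the "parent feature" refers to the `parents` argument of Python's `argparse.ArgumentParser`. The sub-command names and every option other than `enroll_audio_paths` are illustrative assumptions, not necessarily what the demos use.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Options shared by every sub-command live on a parent parser.
    # add_help=False avoids a duplicate -h/--help on the child parsers.
    common = argparse.ArgumentParser(add_help=False)
    common.add_argument("--access_key", required=True, help="AccessKey (assumed option name)")
    common.add_argument("--library_path", help="Path to the library file (assumed option name)")

    parser = argparse.ArgumentParser(description="Illustrative demo parser")
    subparsers = parser.add_subparsers(dest="command", required=True)

    # Each sub-command inherits the shared options through parents=[common].
    enroll = subparsers.add_parser("enroll", parents=[common], help="Create a speaker profile")
    enroll.add_argument("--enroll_audio_paths", nargs="+", required=True)
    enroll.add_argument("--output_profile_path", required=True)

    test = subparsers.add_parser("test", parents=[common], help="Score audio against profiles")
    test.add_argument("--input_profile_paths", nargs="+", required=True)
    test.add_argument("--test_audio_path", required=True)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["enroll", "--access_key", "KEY",
         "--enroll_audio_paths", "a.wav", "b.wav",
         "--output_profile_path", "profile.out"]
    )
    print(args.command, args.enroll_audio_paths)
```

With this layout the sub-command comes first and its options follow, and the shared options are declared only once.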