AwesomeTTS / awesometts-anki-addon

AwesomeTTS text-to-speech add-on for Anki
GNU General Public License v3.0

Add service for Amazon Polly #31

Closed by edu-zamora 6 years ago

edu-zamora commented 6 years ago

The best voice I have found by far for Swedish (the language I am learning) is the one from Amazon Polly.

It would be really convenient to be able to generate the audio from Anki itself (using AwesomeTTS) instead of having to copy paste the text into Amazon Polly's UI, generate the mp3, download it and add it to the proper field.

I guess the most complex part of it would be how to handle the authentication.

Let me know if you want me to provide more info :)

insanewhosane commented 6 years ago

I spent quite a few hours trying to do this but couldn't get it to work. I installed the boto3 library using pip and had it working with the test script provided by Amazon, but when I tried similar code in an AwesomeTTS plugin, Anki said it could not find the boto3 module. I would be willing to donate at least $20 for this, and probably more later, as the voice quality is very good!! Unfortunately, I don't have much time to work on implementing it myself.
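A "module not found" error inside Anki, while the same import works from a terminal, usually means the two are running different Python interpreters with different search paths. That is an assumption here, not something confirmed in this thread. A minimal diagnostic sketch, assuming Python 3 (as in Anki 2.1); `describe_import` is a hypothetical helper name, not part of AwesomeTTS:

```python
import importlib.util
import sys


def describe_import(module_name):
    """Report whether the current interpreter can locate a top-level module."""
    spec = importlib.util.find_spec(module_name)
    return {
        "module": module_name,
        "found": spec is not None,
        "location": getattr(spec, "origin", None),
        "interpreter": sys.executable,
        "search_path": list(sys.path),
    }


if __name__ == "__main__":
    # Run this once from a terminal and once from inside Anki, then compare.
    for key, value in describe_import("boto3").items():
        print("{}: {}".format(key, value))
```

If the two runs report different `interpreter` values, a package installed with pip for one of them is not automatically visible to the other.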

insanewhosane commented 6 years ago

$100 bounty here https://www.bountysource.com/issues/54199353-add-service-for-amazon-polly

artem7902 commented 6 years ago

I wrote a GUI and solved the problem with the boto3 library, but you need to do some dirty work to get it started. Just follow these instructions:

  1. Install Anki 2.1
  2. Install Python 3.6
  3. Install pip3
  4. Install the boto3 library using pip3: sudo pip3 install boto3
  5. Download my version of awesometts-anki-addon and install it using the standard install.sh file
  6. Go to ~/.local/share/Anki2/addons21/folder-with-plugin/awesometts/service and open the file amazon.py in a text editor.
  7. There are two strings before import boto3: /usr/lib/python3.6 and /usr/local/lib/python3.5/dist-packages. The first is the path to Python 3.6 in my case, the second is the path to the pip3 packages in my case. Change these strings if your paths are different. This is very important.

Also make sure that your AWS credentials are set as described here
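Step 7 amounts to extending Python's module search path before `import boto3` runs. A minimal sketch of that idea, assuming the two example paths quoted above; `EXTRA_PATHS` and `add_existing_paths` are hypothetical names, and this is not the actual content of amazon.py:

```python
import os
import sys

# Example paths copied from step 7 above; they are machine-specific.
EXTRA_PATHS = [
    "/usr/lib/python3.6",
    "/usr/local/lib/python3.5/dist-packages",
]


def add_existing_paths(paths, search_path=None):
    """Append each existing directory to search_path once; return those added."""
    if search_path is None:
        search_path = sys.path
    added = []
    for path in paths:
        # Skip directories that do not exist or are already listed.
        if os.path.isdir(path) and path not in search_path:
            search_path.append(path)
            added.append(path)
    return added


if __name__ == "__main__":
    print("added to sys.path:", add_existing_paths(EXTRA_PATHS))
    try:
        import boto3  # noqa: F401
        print("boto3 imported from", boto3.__file__)
    except ImportError as error:
        print("boto3 is still not importable:", error)
```

Checking that each directory exists first makes a wrong path show up as an empty "added" list instead of a silent no-op.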

insanewhosane commented 6 years ago

Thanks I will try it this weekend!!

artem7902 commented 6 years ago

@insanewhosane Hi, did you get a chance to try it? Did you have any trouble with the installation?

insanewhosane commented 6 years ago

Tried this last night and had some issues. Will try again later today and let you know! Thanks!! Sorry for the delay!!

insanewhosane commented 6 years ago

@artem7902 Awesome!! It is working. Going to update my audio files and let you know if I have any bugs. I should post again in 24 hours or so. Thanks so much. Do I need to do anything on bounty source, or will the funds automatically be transferred?

artem7902 commented 6 years ago

@insanewhosane Happy to hear this. For Bountysource, you need to close this issue and then, after my request, confirm the reward on Bountysource. So test it completely and, if everything is OK, close this issue.

insanewhosane commented 6 years ago

@artem7902 Generated >1000 audio files. Sounds great. Thanks so much for working on this. I think it might require @edu-zamora to close the issue, since he is the person who opened the issue? I don't have the option. Please let me know if I can do anything to expedite the process. Again, thanks so much for your help!!

artem7902 commented 6 years ago

@insanewhosane I have a solution for this. You can just create another similar issue and send an email to Bountysource asking them to transfer the bounty to the new issue. It would be really great if you did this. Anyway, thanks for the feedback and for the interesting task ;)

insanewhosane commented 6 years ago

@artem7902 I contacted github to ask if there is another way to close the issue. If their response is that we must wait for @edu-zamora ...then I will definitely create a new issue and transfer the bounty.

Is there any chance of this making it into master? I am not sure what I would need to do to keep using the Amazon Polly service when AwesomeTTS is updated. If it will take many hours, but you think it is possible, I can create an issue/bounty. Let me know what you think. Thanks again.

edu-zamora commented 6 years ago

@insanewhosane, @artem7902: I'll take a look at this today and, if there are no problems, close the issue :) Thanks for adding support for this, this is going to be great!

edu-zamora commented 6 years ago

@artem7902: I didn't manage to make it work.

When opening Anki, I get the following error:

[Screenshot: screen shot 2018-04-27 at 13 09 30]

After testing a bit, the problem seems to occur while importing boto3 in awesometts/service/amazon.py (I don't get the error if I remove import boto3). The strange thing is that I have changed the paths to Python 3 and to the pip3 packages, and the following snippet of code (a bare-bones version of awesometts/service/amazon.py) works without issues:

import sys

# Make the system Python 3 and its pip3 packages importable (machine-specific paths).
sys.path.append("/usr/local/Cellar/python/3.6.5")
sys.path.append("/usr/local/lib/python3.6/site-packages")
print(sys.path)
print(sys.version)

import boto3  # imported after the sys.path changes on purpose

print(dir(boto3))  # Find functions of interest.

# Synthesize a short Swedish phrase and save it as an MP3 file.
response = boto3.client('polly').synthesize_speech(
    OutputFormat='mp3',
    Text="Hej, hej!",
    VoiceId="Astrid",
)
if response and response['AudioStream']:
    with open("demo.mp3", 'wb') as response_output:
        response_output.write(response['AudioStream'].read())

Do you have any idea what could be going on?

Thank you a lot!

edu-zamora commented 6 years ago

I don't have more time to look into this, but it seems it should be possible to ship boto3 as part of the addon itself, which would make this a lot more user-friendly.

You can take a look at this for some inspiration: https://stackoverflow.com/questions/44556065/shipping-part-of-python-standard-library

artem7902 commented 6 years ago

I think it's possible to bundle boto3 into this add-on, but it also requires many other dependencies. I'll try to do this soon and let you know.

artem7902 commented 6 years ago

OK, I did it. @edu-zamora @insanewhosane please check the new version and let me know how your tests go. I bundled all the dependencies into the plugin and add their path to the Python interpreter at runtime. I know it's a kind of monkey patching, but as far as I can see it's one of the best ways, and it works :) I also created a pull request to get it into master.
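Bundling the dependencies and adding their path at runtime can be sketched as below. This is an illustration under assumptions, not the code from the pull request: the folder name "dependencies" and the helper names `vendor_dir` and `enable_vendored_packages` are hypothetical.

```python
import os
import sys


def vendor_dir(module_file, folder_name="dependencies"):
    """Return the absolute path of a bundled-packages folder next to module_file."""
    return os.path.join(os.path.dirname(os.path.abspath(module_file)), folder_name)


def enable_vendored_packages(module_file, folder_name="dependencies"):
    """Put the bundled folder at the front of sys.path so its packages are found."""
    path = vendor_dir(module_file, folder_name)
    # Only touch sys.path when the folder exists and is not already listed.
    if os.path.isdir(path) and path not in sys.path:
        sys.path.insert(0, path)
    return path
```

A service module would call `enable_vendored_packages(__file__)` before `import boto3`, so the bundled copy is found regardless of what is installed on the user's system.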

edu-zamora commented 6 years ago

@artem7902: I had to tweak it a bit, but I managed to make it work for Anki 2.0.x 👍 The only thing that still does not work for me is that, when I select a new Language in the UI, the Voice dropdown does not update automatically (I have to change the service and go back to AmazonPolly for the voices to show).

In any case, why don't we do it the way all the other services do, and show all voices in a single dropdown? To understand what I mean, you can take a look at NeoSpeech, Oddcast or OS X Speech Synthesis.
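The single-dropdown idea can be sketched as flattening a table of (language, gender, name) entries into one sorted list of labelled options. This is a rough illustration only: the `VOICES` sample and the `voice_options` helper are hypothetical and do not reflect how AwesomeTTS services actually declare their options.

```python
# Hypothetical voice table; only a few entries are shown for illustration.
VOICES = [
    ("sv-SE", "female", "Astrid"),
    ("en-US", "female", "Joanna"),
    ("en-US", "male", "Matthew"),
]


def voice_options(voices):
    """Return (value, label) pairs sorted by language and then by voice name."""
    ordered = sorted(voices, key=lambda voice: (voice[0].lower(), voice[2].lower()))
    return [
        (name, "{} ({}, {})".format(name, language, gender))
        for language, gender, name in ordered
    ]


if __name__ == "__main__":
    for value, label in voice_options(VOICES):
        print(value, "->", label)
```

Because the language is part of each label, there is no second dropdown that needs to be refreshed when the language changes.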

Whenever I have more time, I'll push my Anki 2.0.x branch and try to fix the voices dropdown issue (if you didn't get to it first).

Again, thanks a lot! We are really close :)

edu-zamora commented 6 years ago

@artem7902: I am almost done converting the Amazon service to look more like the NeoSpeech one, so no need for you to work on it. I'll let you know once I have pushed it.

edu-zamora commented 6 years ago

@artem7902: Works like a charm now! :) I pull requested my master-2.0 branch to your repo, so you can take care of trying to merge both branches (master and master-2.0) upstream.

Also, it would be good if both branches look as similar as possible, so if you like my changes, try to apply them back to master.

Let me know if you need anything!

artem7902 commented 6 years ago

@edu-zamora Thanks a lot for contributing, especially for the combo box fix. I had tried to split that big combo box into two combo boxes, but it really was inconsistent with the other components, so I used your variant in master. As for AWS credentials in the UI, I'll try to do this soon.

artem7902 commented 6 years ago

@edu-zamora I added fields for entering credentials. If you have some free time, please port it to 2.0.x and test how it works.

dholmb commented 6 years ago

Thanks guys, I don't know anything about programming but was able to follow your instructions and get it to work.

Could a lexicon file that's used on the normal Amazon Polly TTS website also be used within this app?

edu-zamora commented 6 years ago

@artem7902: Great work! I have included the changes into the master-2.0 branch, tested them and pull requested them into your repo.

@dholmb: Awesome that you got it to work! 👍 What's a lexicon file, and why would it be useful to support it from the addon itself?

dholmb commented 6 years ago

@edu-zamora: Thanks for the reply.

I'm just starting to use TTS, but lexicon files are how you can edit the pronunciation of words. For example, I mostly use them for abbreviations or acronyms that I want to be said a certain way. There are a lot of other capabilities, including a phonetic alphabet to correct pronunciation.

I did a little more research online and it looks like this isn't something that would be inside the app, but stored under each user's AWS account. I think the service will then automatically apply the corrections when it is used through the app.

https://www.w3.org/TR/pronunciation-lexicon/#S1.1 https://docs.aws.amazon.com/polly/latest/dg/gs-put-lexicon.html

artem7902 commented 6 years ago

@dholmb I see what you mean. I added lexicon support, please check. You can enter a list of lexicons; just separate their names with ", ", for example "lexicon1, lexicon2, lexicon3". Use this link
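As an illustration of how such a comma-separated field could be passed on, Polly's `synthesize_speech` call in boto3 accepts a `LexiconNames` list. The sketch below is an assumption about the general shape, not the add-on's actual code; `parse_lexicon_names` and `synthesize_with_lexicons` are hypothetical helper names, and `client` is expected to be a `boto3.client('polly')` object.

```python
def parse_lexicon_names(raw):
    """Split a comma-separated string into a list of non-empty lexicon names."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def synthesize_with_lexicons(client, text, voice_id, raw_lexicons=""):
    """Call Polly through the given client, applying any lexicons that were named."""
    kwargs = {"OutputFormat": "mp3", "Text": text, "VoiceId": voice_id}
    names = parse_lexicon_names(raw_lexicons)
    if names:
        # The lexicons must already be stored in the caller's AWS account.
        kwargs["LexiconNames"] = names
    return client.synthesize_speech(**kwargs)
```

Stripping whitespace around each name means "lexicon1,lexicon2" and "lexicon1, lexicon2" are treated the same, and an empty field simply sends no lexicons.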

dholmb commented 6 years ago

@artem7902: I wasn't able to get it to work but it's probably just on my end. I'll try to figure it out later. I'm assuming that the lexicons that are used need to be loaded onto my AWS account first?

artem7902 commented 6 years ago

@dholmb yes, of course, you need to configure it using AWS and then just use lexicon names. You can read about it in AWS documentation.

artem7902 commented 6 years ago

@edu-zamora If everything is OK, can you close this?

edu-zamora commented 6 years ago

@artem7902: Don't we want to wait until this is merged into master?

artem7902 commented 6 years ago

@edu-zamora There are PRs from 2017 still open, so I'm not sure that this will happen soon :)

edu-zamora commented 6 years ago

@artem7902: There you go then :)

insanewhosane commented 6 years ago

jmespath* is missing from the dependencies folder. Pretty awesome how you made installing it VERY easy. Thanks!

insanewhosane commented 6 years ago

@artem7902 I installed the latest version and was generating some audio files. It works great, but if some cards have empty fields (for example, cards have a question and answer only, with an empty explanation field) it crashes.
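The cause of the crash is not identified in this thread. One common guard for this kind of batch job, sketched here under the assumption that each note can be treated as a mapping from field name to text, is to skip blank fields before anything is sent to the speech service. `fields_to_speak` is a hypothetical helper, not part of AwesomeTTS:

```python
def fields_to_speak(notes, field_name):
    """Yield (index, text) for notes whose field contains non-blank text."""
    for index, note in enumerate(notes):
        # Treat a missing field, None, and whitespace-only text as empty.
        text = (note.get(field_name) or "").strip()
        if text:
            yield index, text


if __name__ == "__main__":
    sample = [{"Explanation": "Hej, hej!"}, {"Explanation": ""}, {}]
    for index, text in fields_to_speak(sample, "Explanation"):
        print(index, text)
```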

uni419 commented 5 years ago

@artem7902 @edu-zamora I've been messing around with installing the polly service and ran into two small issues.

  1. Every time I try to preview a sample, I get an error message reading "Cannot preview the input phrase with these settings, you must specify a region". I assume this means the AWS region, but I'm not sure how to modify the .py file to include my region.

  2. Much more minor: the new Chinese voice "Zhiyu" isn't included in the current dropdown list in the UI. I think adding a new line to the voices table, such as ('cmn-cn', 'female', 'Zhiyu'), should do the trick.

Thanks so much for all of your work on this, and let me know if you have any proposed solutions.
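On the region error: boto3 raises "You must specify a region" when it cannot find a region in its configuration, and `boto3.client` accepts a `region_name` argument. A minimal sketch, assuming the add-on builds its client with `boto3.client('polly')`; the helper names are hypothetical and "eu-west-1" is only an example region:

```python
def polly_client_kwargs(region=None):
    """Build keyword arguments for boto3.client('polly', ...)."""
    # With no explicit region, boto3 falls back to its own configuration,
    # such as the AWS_DEFAULT_REGION environment variable or ~/.aws/config.
    return {"region_name": region} if region else {}


def make_polly_client(region=None):
    """Create a Polly client, optionally pinned to an explicit AWS region."""
    import boto3  # imported lazily so the helper above works without boto3

    return boto3.client("polly", **polly_client_kwargs(region))


if __name__ == "__main__":
    print(polly_client_kwargs("eu-west-1"))
```

Setting the region outside the code, for example through `AWS_DEFAULT_REGION` or a `region` entry in `~/.aws/config`, avoids editing the .py file at all.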