Osmodium / W40KRogueTraderSpeechMod

MIT License

Feature Request: Integrate Text-to-Speech AI for Better User Experience #8

Open magyargergo opened 1 month ago

magyargergo commented 1 month ago

Summary

First of all, thank you for the outstanding work on this mod. It's impressive, and I appreciate the effort and dedication that have gone into its development.

To get straight to the point: I propose improving the Text-to-Speech (TTS) functionality of this mod by integrating AI-based TTS libraries. This could improve voice quality and provide a more immersive experience for players.

Current Situation

The mod currently uses the NaturalVoiceSAPIAdapter for TTS. While it functions adequately, there are other libraries available that offer higher-quality voices and additional features.

Proposed Libraries

1. Microsoft Azure Speech SDK

2. Amazon Polly

3. Google Cloud Text-to-Speech

Benefits of Integration

Additional Considerations

Conclusion

Integrating one of these advanced TTS libraries could significantly enhance the player experience. I'm interested in contributing to this effort and would be happy to assist with research or development if this aligns with your project's goals.

Thank you for considering this feature request!

gexgd0419 commented 2 weeks ago

NaturalVoiceSAPIAdapter supports the Azure Speech service.

To use the online Azure voices, you need an Azure account. Create a speech resource, then enter its key and region in the installer dialog.

You can create an Azure account for free, and if you choose the free tier while creating the speech resource, there will be no cost. The free-tier quota is 0.5 million characters per month, which may be enough for a single user, but not if one key is shared among all users of the mod.
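To make the quota argument concrete, here is a rough back-of-the-envelope sketch in Python. The 0.5 million figure is the one quoted above. The even split between users, the average line length of 200 characters, and the 30-day month are illustrative assumptions only, not measurements from the game or the mod.

```python
# Free-tier quota quoted above: 0.5 million characters per month.
FREE_TIER_CHARS_PER_MONTH = 500_000


def chars_per_user(total_users: int, quota: int = FREE_TIER_CHARS_PER_MONTH) -> int:
    """Characters each user would get per month if one key were shared evenly."""
    if total_users < 1:
        raise ValueError("total_users must be at least 1")
    return quota // total_users


def lines_per_day(chars: int, avg_line_len: int = 200, days: int = 30) -> float:
    """Approximate spoken lines per day that a monthly character budget allows.

    avg_line_len and days are assumed values chosen for illustration.
    """
    if avg_line_len < 1 or days < 1:
        raise ValueError("avg_line_len and days must be at least 1")
    return chars / avg_line_len / days


if __name__ == "__main__":
    for users in (1, 100, 1000):
        budget = chars_per_user(users)
        print(f"{users} user(s): {budget} chars/month, "
              f"about {lines_per_day(budget):.1f} lines/day")
```

Under these assumptions a single user gets a comfortable daily budget, while a key shared by a thousand users leaves each of them only a few hundred characters per month.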

So I think that adding support for those speech SDKs would be great, but users should use their own accounts and keys, and pay for their own usage.
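A minimal sketch of what "users bring their own key" could look like, written in Python purely for illustration (the mod itself is not written in Python). The environment variable names SPEECHMOD_AZURE_KEY and SPEECHMOD_AZURE_REGION and the fallback behaviour are hypothetical and are not part of the mod or of NaturalVoiceSAPIAdapter.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical variable names, chosen only for this example.
KEY_VAR = "SPEECHMOD_AZURE_KEY"
REGION_VAR = "SPEECHMOD_AZURE_REGION"


@dataclass(frozen=True)
class AzureCredentials:
    key: str
    region: str


def load_credentials(env: Mapping[str, str] = os.environ) -> Optional[AzureCredentials]:
    """Return the user's own credentials, or None if they are not fully configured.

    No shared default key is embedded, so each user's usage is billed to
    (and limited by) their own account.
    """
    key = env.get(KEY_VAR, "").strip()
    region = env.get(REGION_VAR, "").strip()
    if not key or not region:
        return None
    return AzureCredentials(key=key, region=region)


if __name__ == "__main__":
    creds = load_credentials()
    if creds is None:
        print("No Azure credentials configured; falling back to local voices.")
    else:
        print(f"Using the user's own Azure speech resource in region {creds.region}.")
```

The point of the design is that the online voices stay optional: without a key the mod keeps working with local voices, and with a key the cost and quota belong to the individual user.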