karashiiro / TextToTalk

Chat TTS plugin for Dalamud. Has support for triggers/exclusions, several TTS providers, and more!
MIT License

Community lexicons #43

Closed karashiiro closed 2 years ago

karashiiro commented 3 years ago

Migrated to #60 - please continue there!

I don't use lexicons myself, so I don't have one I'm maintaining, but if anyone else has lexicons they're willing to share I'd appreciate it if they could drop a link so I can provide them to anyone who wants them and doesn't know how to make them themselves. Alternatively, feel free to post them in the #preset-sharing channel in the goat place Discord, and I'll relink them somewhere here.

karashiiro commented 3 years ago

https://github.com/karashiiro/TextToTalk/wiki/Community-lexicons

johnysandels commented 3 years ago

Main Character names British Voice Lexicon.zip

Works only with British voices. Uses aliases because phonemes aren't currently working with the standard voice setting. Only corrects pronunciation of supporting characters' names.

johnysandels commented 3 years ago

FFXIVCharacters&LocationsEN.zip (fixed a mistake where two entries' phonemes were swapped while testing)

I've done all the characters I've noticed the most when going through MSQ, as well as mispronounced location names. These use phonemes, so pronunciation should be more consistent across different regions of English. Works with all English voices (tested for US and GB).

dedren commented 2 years ago

Is there an app or something easy to make these lexicons?

dedren commented 2 years ago

I am not sure if this should be a separate issue or not, but I tried using the FFXIVCharacters&LocationsEN.zip lexicon (both through Amazon Polly and directly uploaded to the plugin) and it said "Maximum lexicons size has been exceeded".

karashiiro commented 2 years ago

> Is there an app or something easy to make these lexicons?

https://docs.aws.amazon.com/polly/latest/dg/gs-put-lexicon.html

I don't know of any apps for this, but this article has some lexicons used in its examples that might explain the concept.

> I am not sure if this should be a separate issue or not, but I tried using the FFXIVCharacters&LocationsEN.zip lexicon (both through Amazon Polly and directly uploaded to the plugin) and it said "Maximum lexicons size has been exceeded".

I've never heard of this happening, but I assume that means Amazon Polly has some sort of size limit on lexicons. You can try splitting the lexicon in half, maybe? Pulling out half of the lexemes and putting them into a new lexicon file and uploading the resulting two smaller ones.

dedren commented 2 years ago

> I've never heard of this happening, but I assume that means Amazon Polly has some sort of size limit on lexicons. You can try splitting the lexicon in half, maybe? Pulling out half of the lexemes and putting them into a new lexicon file and uploading the resulting two smaller ones.

Looks like you nailed it, as per Amazon Polly’s site, “Each lexicon can be up to 4,000 characters in size.” I’ll have to wait until I have time to figure out how to separate them on an actual computer.
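A minimal Python sketch of the split suggested above, for anyone who would rather not do it by hand. It assumes the file is a standard PLS lexicon, that the limit applies to the length of the serialized XML text, and that the 4,000 figure quoted above is the one Polly enforces. `split_lexicon`, `MAX_CHARS`, and the helper names are hypothetical and not part of TextToTalk or Polly.

```python
import copy
import xml.etree.ElementTree as ET

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
MAX_CHARS = 4000  # assumption: the limit quoted from Amazon Polly's site

# Serialize PLS elements without an "ns0:" prefix.
ET.register_namespace("", PLS_NS)


def _serialize(root):
    return ET.tostring(root, encoding="unicode")


def _new_root(source):
    # Same root tag and attributes (alphabet, xml:lang, ...), but no lexemes.
    return ET.Element(source.tag, dict(source.attrib))


def split_lexicon(xml_text, max_chars=MAX_CHARS):
    """Split one lexicon into several, each at most max_chars characters long."""
    source = ET.fromstring(xml_text)
    parts = []
    current = _new_root(source)
    for lexeme in source:
        entry = copy.deepcopy(lexeme)
        current.append(entry)
        if len(_serialize(current)) <= max_chars:
            continue
        # Over the limit: close the current part and start a new one.
        current.remove(entry)
        if len(current):
            parts.append(_serialize(current))
        current = _new_root(source)
        current.append(entry)
        if len(_serialize(current)) > max_chars:
            raise ValueError("a single lexeme exceeds the size limit")
    if len(current):
        parts.append(_serialize(current))
    return parts
```

Each returned string can be saved as its own `.xml` file and uploaded separately. How Polly actually counts characters is not stated in this thread, so passing a smaller `max_chars` to leave some headroom may be safer.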

johnysandels commented 2 years ago

> Looks like you nailed it, as per Amazon Polly’s site, “Each lexicon can be up to 4,000 characters in size.” I’ll have to wait until I have time to figure out how to separate them on an actual computer.

Wow 4000 characters is quite a small limit, in the future I'll have to make split files for Amazon Polly. For now though, I removed a lexeme that can't be used right now with the way the plugin currently works, and the character count is now 3999! If there is any issue let me know! FFXIVLexiconPollyEN.zip

dedren commented 2 years ago

Oh damn, thank you so much! Hah, karashiiro saw the future and knew it needed to get under 4000 characters LOL. But seriously, thank you so much, I'll try it in a few, hopefully, after a few morning jobs. Also, do you recommend uploading it to Amazon directly or just uploading it to the addon?

johnysandels commented 2 years ago

You'll need to upload it through the TextToTalk plugin.

johnysandels commented 2 years ago

FFXIVCharacters&LocationsEN.zip

johnysandels commented 2 years ago

FFXIVCharacters&Locations.zip

karashiiro commented 2 years ago

> FFXIVCharacters&Locations.zip
>
> * Fixes pronunciation for Urianger's name for Microsoft David. Hopefully works with other voices (confirmed working for Zira at least). I haven't been able to test others since I reinstalled Windows. Under 4000 characters btw

That zip looks empty 👀

johnysandels commented 2 years ago

FFXIVCharacters&Locations.zip Let's pretend you didn't see that 👀

johnysandels commented 2 years ago

--------Update: Added plurals to the new additions--------

Fixed pronunciation for the word Aetheryte and one of the expansion's location names. There will be more updates coming as I play through the expansion! Glad to have TTT on the first day I've been able to play the story!

Also, I've split the Polly zip version into two lexicon files to respect the 4,000-character limit.

FFXIVCharacters.Locations.zip

FFXIVCharacters.Locations.Polly.zip

dedren commented 2 years ago

First thing, you are awesome for doing this AND dealing with Polly's limit! However, how did you get the plugin to even work? My client won't allow plugins to load yet.

karashiiro commented 2 years ago

The staging version of Dalamud is available for use, but it's unstable at the moment, so it hasn't been recommended for use, yet. Enabling it without having a normal working version requires modifying the Dalamud configuration manually, but I'm not sure exactly which option enables it. Probably one of the ones labeled Testing at the top.

karashiiro commented 2 years ago

Oh right, it's DoDalamudTest, duh. I forgot because I have Dalamud disabled in the launcher for the moment.

dedren commented 2 years ago

Thank you, I was able to use that to google and found the FAQ page saying:

Q: How do I turn Dalamud Staging on or off?

In game (if you can still launch):

1. Type /xldev in game.
2. Click on the Dalamud menu on the top of the screen.
3. Select/Unselect the settings as wanted.
4. Relaunch the game.

Out of game (when you get crashes):

1. Close the game.
2. Go to %appdata%\XIVLauncher\ and open dalamudConfig.json in your text editor of choice.
3. Change the line that says "DoDalamudTest": to true to enable Dalamud Staging and to false to disable Dalamud Staging.
4. Save.
5. Launch the game.

NOTE: You may have to wait for Dalamud to be redownloaded.

When I do that, however, and start the game, I still get the message that plugins are not enabled. I have to wait in queue to see if they actually load in game.
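The out-of-game edit from the FAQ can also be scripted. This is a rough Python sketch that assumes dalamudConfig.json is plain JSON with a top-level "DoDalamudTest" key, as the FAQ text implies. `set_dalamud_staging` and `default_config_path` are hypothetical names, and the game should be closed before running anything like this.

```python
import json
import os
from pathlib import Path


def default_config_path():
    # %appdata%\XIVLauncher\dalamudConfig.json, per the FAQ quoted above.
    return Path(os.environ["APPDATA"]) / "XIVLauncher" / "dalamudConfig.json"


def set_dalamud_staging(config_path, enabled):
    """Set "DoDalamudTest" in the config file and return its previous value."""
    path = Path(config_path)
    # utf-8-sig tolerates a byte-order mark if the file has one.
    config = json.loads(path.read_text(encoding="utf-8-sig"))
    previous = config.get("DoDalamudTest")
    config["DoDalamudTest"] = bool(enabled)
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return previous
```

For example, `set_dalamud_staging(default_config_path(), True)` would turn staging on. Keeping a backup copy of the file first is a sensible precaution, since the sketch rewrites the whole file.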

karashiiro commented 2 years ago

I'm not sure exactly what else it requires, but you should be able to enable plugins from the plugin installer once you're in. At the least, you can know if you've done it correctly because there'll be a big red "Dalamud Staging" label in the upper-left portion of the screen, until you load in.

dedren commented 2 years ago

> there'll be a big red "Dalamud Staging" label in the upper-left portion of the screen, until you load in.

AH thank you! That helped me tremendously. Turns out I turned on 'Plugin Testing', not Dalamud :D I now have the label in the corner!

johnysandels commented 2 years ago

Capitalized Hydaelyn so it will actually work now 😅

FFXIVCharacters.Locations.zip FFXIVCharacters.Locations.Polly.zip

ryankhart commented 2 years ago

@johnysandels What language are you making the Lexicon for?

If it's English, the official pronunciation of Yugiri, according to the voice acting, is You-gear-ee, not You-gid-ee.

But maybe you experience the game in Japanese and they pronounce it differently?

Despite this, I greatly appreciate the work that you've put into this Lexicon. It has made my overall experience that much better than before.

johnysandels commented 2 years ago

> If it's English, the official pronunciation of Yugiri, according to the voice acting, is You-gear-ee, not You-gid-ee.
>
> But maybe you experience the game in Japanese and they pronounce it differently?

Oh, I play in English! I based the pronunciation on how people native to Doma say her name, since it seems like the more genuine way to pronounce it. It seems like the people who aren't from Doma say it in a more Western way. I specifically tried matching Hien's and Gosetsu's pronunciation.

Possible spoilers for Stormblood in videos. Hien: https://youtu.be/KbEsvkbeuo4?t=67

First example I found from Gosetsu: https://youtu.be/DGE_X8GfvHI?t=267

ryankhart commented 2 years ago

> I based the pronunciation on how people native to Doma says her name

Oh, that must be why. I haven't reached Doma yet in Stormblood.

Trixemyar commented 2 years ago

Hello there, let me start out by saying a big thank you for all the work you put into making lexicons. I have no knowledge of how this stuff is done, but I really like how people like you help bring the game alive for people like me. Now to the issue I face: I'm using Polly to help bring TTT to life, but some of the beast tribe names are being read all wrong. I'm pretty early into the game, still in ARR, but the two I noted are "Ixal" and "Amalj'aa". Will this be possible to fix? Thanks again for all the hard work you put into this 👍

ryankhart commented 2 years ago

@Trixemyar If the IPA notation Johnysandels uses intimidates you like it does me, try using aliases instead and use trial and error to trick it into the correct pronunciation. The easiest way that I've found to test this is Amazon's page for it here: https://us-west-2.console.aws.amazon.com/polly/home/SynthesizeSpeech

<lexeme>
   <grapheme>Ixals</grapheme>
   <alias>Icksals</alias>
</lexeme>
<lexeme>
   <grapheme>Amalj'aa</grapheme>
   <alias>Amaldja</alias>
</lexeme>

I'll add this to my personal lexicon that I'm compiling, and post it here in a bit. It's far from comprehensive. I just add things as I play and hear odd mispronunciations.
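The two lexemes above are fragments and need to sit inside a full lexicon file before they can be uploaded. Here is a small Python sketch that wraps alias pairs in the usual PLS root element. The `build_alias_lexicon` name and the `en-US` default are assumptions for illustration, and the `alphabet="ipa"` attribute only matters if phoneme entries are added later.

```python
import xml.etree.ElementTree as ET

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Serialize PLS elements without an "ns0:" prefix.
ET.register_namespace("", PLS_NS)


def build_alias_lexicon(aliases, lang="en-US"):
    """Return a PLS lexicon string with one alias lexeme per dictionary entry."""
    root = ET.Element(
        "{%s}lexicon" % PLS_NS,
        {"version": "1.0", "alphabet": "ipa", "{%s}lang" % XML_NS: lang},
    )
    for grapheme, alias in aliases.items():
        lexeme = ET.SubElement(root, "{%s}lexeme" % PLS_NS)
        ET.SubElement(lexeme, "{%s}grapheme" % PLS_NS).text = grapheme
        ET.SubElement(lexeme, "{%s}alias" % PLS_NS).text = alias
    if hasattr(ET, "indent"):  # pretty-printing is only available on Python 3.9+
        ET.indent(root, space="  ")
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# The same two entries as in the snippet above.
lexicon_xml = build_alias_lexicon({"Ixals": "Icksals", "Amalj'aa": "Amaldja"})
```

Writing `lexicon_xml` to a `.xml` file with UTF-8 encoding should give something the plugin's lexicon upload can read, though that has not been tested here.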

ryankhart commented 2 years ago

ryanslexicon.zip

ryankhart commented 2 years ago

I'd be willing to make pull requests to this repo if community lexicons were put under source control. I know basic git. And the scope of this community lexicon project is small enough for me to wrap my head around it. I might even be willing to help merge future community contributions posted here by people who don't want to bother with git. I can specify the github username of contributors in the commit notes as well as in XML comments.

johnysandels commented 2 years ago

I'll have those added later tonight! Just as soon as I get off work 😎

> @Trixemyar If the IPA notation Johnysandels uses intimidates you like it does for me, try using aliases instead

And true about aliases! I started off using aliases because they were easier to understand, but switched to phonemes because I noticed that voices from different regions pronounce aliases differently. Phonemes are much more consistent across regions, so I learned how they work to make a lexicon that is more universally applicable!

Also, phonemes are much simpler than you would think. I use this to get the phonetics of words that are similar to the word that I'm trying to make, then this to test out the pronunciation. I tend to Frankenstein other words to piece together the word!

karashiiro commented 2 years ago

> I'd be willing to make pull requests to this repo if community lexicons were put under source control. I know basic git. And the scope of this community lexicon project is small enough for me to wrap my head around it. I might even be willing to help merge future community contributions posted here by people who don't want to bother with git. I can specify the github username of contributors in the commit notes as well as in XML comments.

I think this is a good idea, I can set this up if you’re willing to help maintain it 👀 Having at least some lexicons be maintained in the repo itself should ensure that there are always updated lexicons to use, even if their original authors are unavailable.

ryankhart commented 2 years ago

> if you’re willing to help maintain it 👀

👀 is right! But I'm always wishing I had something I could contribute to that I have the skills for that nobody else is already doing. And this seems like a good fit for me. I'll let you decide where you want to store the files initially, and I can update it from there over time.

johnysandels commented 2 years ago

I'll still keep updating here then, since I'm not familiar with Git! I also personally think it would be really cool to either have a lexicon included with TTT that users can choose to enable, or the ability to download the lexicons within the plugin, since it would be easier to access for the average user! Or maybe just a link to the place they could download one and why they might want to use one!

ryankhart commented 2 years ago

Oh, I found the place where the default folder location can be set to a folder that contains various included Lexicon XML files. https://github.com/karashiiro/TextToTalk/blob/bd0b850ddce8956f363a0a49a9061faaec045f0c/TextToTalk/OpenFile.cs#L16

And this is the place where the UI appears to be set. https://github.com/karashiiro/TextToTalk/blob/d5134bb43799c4814e78d79e7f9facd773b98c8c/TextToTalk/Backends/Polly/AmazonPollyBackend.cs#L178

Now, I'm just wondering whether, when the plugin updates, there's a risk that the update will overwrite user-placed lexicons there. I'm not sure how that would work out of the box. I know user preferences are preserved after updates, but ideally updates would replace the provided lexicon XML files and not touch new files added by users.

That's just me thinking aloud, brainstorming. I don't expect a response. I may be able to mess with that code locally to see how that works if I can actually manage to figure out how to compile a relatively large code project (compared to what I'm used to for university assignments).

johnysandels commented 2 years ago

I have an update ready for Ixal and Amalj'aa and the other beast tribes but some plugin bugs need to be ironed out before they will work. It'll need a fix for issue #58 and #48 before they can work!

Trixemyar commented 2 years ago

@ryankhart @johnysandels Thank you once again for all the work you put in. It really does mean a lot. Personally, my experience with this game would suck without this addon: because I'm dyslexic, I take forever to read all the text in this game, so one of the first things I did was look up a workaround. I'll keep my ears open for any errors I can pick up and report. On behalf of all who have issues reading, I thank you all :D

karashiiro commented 2 years ago

> Oh, I found the place where the default folder location can be set to a folder that contains various included Lexicon XML files.
>
> (snip)
>
> Now, I'm just wondering if there's a risk, when updating the plugin, if updates will overwrite user-placed lexicons there or not. I'm not sure how that would work out of the box. I know user preferences are preserved after updates, but it would be more ideal for updates to replace the provided lexicon XML files and not touch new files added by users.
>
> That's just me thinking aloud, brainstorming. I don't expect a response. I may be able to mess with that code locally to see how that works if I can actually manage to figure out how to compile a relatively large code project (compared to what I'm used to for university assignments).

This is not the case. What you're looking at in OpenFile is the initial directory that the file dialog is viewing. Without setting this, the file dialog would start you out at the filesystem root, which would be inconvenient for most people, since they'd then need to navigate all the way down to wherever they've saved the lexicon file. Lexicon files aren't touched on plugin updates.

As for compiling the project, make sure you either clone Dalamud and build that as well, or update TextToTalk.csproj with paths to your installed Dalamud libraries. The size of the project shouldn't change anything compared to what you're used to (at least afaik, we didn't do anything special in 142/143), but the external dependency is something to look out for.

The expected directory structure is something like:

/whatever
|-Dalamud
| |-bin
|   |-Debug
|-TextToTalk

karashiiro commented 2 years ago

Here's the documentation for OpenFileDialog: https://docs.microsoft.com/en-us/dotnet/api/system.windows.forms.openfiledialog?view=windowsdesktop-6.0

karashiiro commented 2 years ago

Dalamud might be a bit complicated to compile, actually 🤔 if you're not familiar with submodules and don't have the C++ build tools in Visual Studio installed. It's best to figure all that out for future reference, but if you can't you might just want to update the TextToTalk project file (remember not to commit it if you PR anything).

karashiiro commented 2 years ago

> I'll still keep updating here then since I'm not familiar with Git! I also personally think it would be really cool to either have a lexicon either included with TTT that users can choose to enable or the ability to download the lexicons within the plugin, since it would be easier to access for the average user! or maybe just a link to the place they could download and why they might want to use one!

I can add you as a collaborator and you'll be able to edit the wiki, if you aren't familiar with Git 👀

johnysandels commented 2 years ago

> I can add you as a collaborator and you'll be able to edit the wiki, if you aren't familiar with Git 👀

Ouh yes pls!

karashiiro commented 2 years ago

Sent requests @ryankhart @johnysandels

karashiiro commented 2 years ago

I have a draft format specified in https://github.com/karashiiro/TextToTalk/tree/main/lexicons. I'm not super sure about it, but I think it'll work?

karashiiro commented 2 years ago

I'd like to migrate this to Discussions, unless anyone's opposed 👀 we can go off-topic more easily that way 🙂

karashiiro commented 2 years ago

Oh, actually, I can convert this, I think.