joshbduncan / word-search-generator

Make awesome Word Search puzzles!
MIT License
65 stars 21 forks

Multilingual Support #60

Open zelequhen opened 5 months ago

zelequhen commented 5 months ago

I've been using this project to generate puzzles for my Kindergarten student, but thought it'd be very useful to generate puzzles in other languages too. After some brief experimentation, it looks like Latin-based languages should be fairly straightforward, but languages with other alphabets (like Russian, Arabic, Japanese [for Katakana/Hiragana], etc.) might require some tweaks to the font. I'd like to handle this for the project if it falls in line with your future plans for it.

I noticed that you're working on a significant new version, should I wait for that to be completed before making the changes?

joshbduncan commented 5 months ago

Yeah, I love this idea! To be honest, I have very little experience working with other languages. I have included some localization in my Ai Command Palette tool, but that is it. If you want to tackle this, please go ahead. I would just base it off of the main branch, since that already has the latest code that will become v4.0.0. Please give me one day and I'll add some base code that you can expand upon.

I'll update you here when I have everything set up. Cheers!

zelequhen commented 5 months ago

Sounds good. I played with it a bit on Sunday and I think I got most of it worked out. I just need to enable a Unicode font for the PDF generation and work through random words in the various languages. I suspect that I can have a PR ready by the end of the weekend, at least for a handful of languages. I'm thinking of targeting Spanish, French, and maybe German for an initial push, and then extending to a few non-Latin-based languages in a follow-on PR.

joshbduncan commented 5 months ago

I thought about this a bit yesterday and I'm not sure if this package needs an entire "language" feature... Really, what I think it needs is just a way for users to submit custom alphabets. Users can already submit words from other languages (with a few caveats), so the only thing the generator needs is a matching alphabet for the filler characters. You are right that they would need a proper font for exporting PDFs, but I could expose a method for setting it in the Formatter class.
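To make the custom alphabet idea concrete, here is a rough sketch of filling the empty cells from a user-supplied alphabet. This is not the package's actual code: `fill_empty_cells`, the list-of-lists grid, and the empty-string convention for blank cells are all assumptions made up for illustration.

```python
import random


def fill_empty_cells(grid, alphabet, rng=None):
    """Return a copy of `grid` with each empty cell ("") replaced by a
    random character drawn from `alphabet`.

    Hypothetical helper: the grid layout and the empty-cell marker are
    assumptions for this sketch, not the package's real data model.
    """
    rng = rng or random.Random()
    chars = sorted(set(alphabet))  # de-duplicate, keep a stable order
    if not chars:
        raise ValueError("alphabet must contain at least one character")
    return [[cell if cell else rng.choice(chars) for cell in row] for row in grid]


# Example: a 3x3 grid with the word "КОТ" already placed in the first row.
RUSSIAN = "АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ"
grid = [
    ["К", "О", "Т"],
    ["", "", ""],
    ["", "", ""],
]
filled = fill_empty_cells(grid, RUSSIAN, random.Random(0))
```

With something like this, placed words are left untouched and the fillers come from the same script as the words, so nothing stands out as obviously foreign to the puzzle.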

Let me know how you are approaching it and let's see if we can work out the best path forward. Thanks!

zelequhen commented 5 months ago

My approach is to create language bundles that define the alphabet for that language and a set of "common words" so that users can take advantage of the random word feature. The language is picked via a CLI argument, and the bundle's contents are then read and passed to the generator. I do like the idea of having an alphabet argument that could be specified instead, and that would help with languages not yet supported. But I also think it would be useful to have a convenience feature so users don't have to look up every character in an alphabet, including the variants with diacritical marks.

joshbduncan commented 5 months ago

Yep, your approach sounds great and well thought out. If this is something you want to see through, please proceed and let me know if I can help out in any way. Thanks!

zelequhen commented 5 months ago

I actually just got it working for two languages (I’m using non-ASCII characters). I want to put in support for a few more languages, which just requires grabbing words and defining the alphabet, and then I can have a PR ready. I should have that ready to go by next Sunday.

I did have to pull an open source font to use, but the license doesn’t look like it conflicts with use here.

joshbduncan commented 5 months ago

All sounds good. I did some Googling around the other day and there seem to be some font options that may work. I need to do a little research, but I think the font needs to have an MIT license to match the repo.