DraqueT / PolyGlot

PolyGlot is a conlang construction toolkit.
MIT License

Integrate with ChatGPT API allowing for instant translation #1265

Closed DraqueT closed 1 year ago

DraqueT commented 1 year ago

Based on dictionary and grammar of existing polyglot files.

Users will have to obtain and set up their own ChatGPT API key.

Folks. This works terrifyingly well.

rcochrane55 commented 1 year ago

I'm curious to know how well this works when translating to your conlang from a language other than English. For my main project I tend to look for originally German-language content to translate, because it is then easier to keep the syllable count and rhythm correct for music and poetry, and I'm wondering if ChatGPT would struggle with this more than it does with translating to/from English.

I also would like to know whether this will be GPT-3.5 or GPT-4 based. If it works using the free GPT-3.5 version, then it might have trouble reading everything from the grammar section. Because the grammar section currently lacks the ability to create tables, I have taken tightly-cropped screenshots of conjugation/declension tables from my grammar Google Doc and inserted them into the grammar section, but only GPT-4 is able to read images as far as I know, and it has limited availability right now and costs money. I'm wondering whether I will need to find another format to list out conjugation and declension tables to allow GPT to read them properly.

DraqueT commented 1 year ago

So far, I have only tested with English. I think a good metric for how well this will work in other languages is asking GPT to translate a phrase to the target language then back again. If it does a good job, then PolyGlot's new feature will likely work well with that as the base language.
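The round-trip check described above can be sketched roughly as follows. This is only an illustration, not PolyGlot's code: the `translate` callable, its `(text, source, target)` signature, and the `fake_translate` stand-in are all hypothetical, and in practice `translate` would wrap whatever GPT call you use. The similarity score is a crude character-level comparison, so a perfectly valid paraphrase will still score below 1.0.

```python
from difflib import SequenceMatcher
from typing import Callable

# Hypothetical signature: translate(text, source_lang, target_lang) -> str.
Translator = Callable[[str, str, str], str]


def round_trip_score(phrase: str, start_lang: str, target_lang: str,
                     translate: Translator) -> float:
    """Translate start -> target -> start and return a 0..1 similarity to the original."""
    forward = translate(phrase, start_lang, target_lang)
    back = translate(forward, target_lang, start_lang)
    # Character-level similarity; crude, but enough to flag badly garbled round trips.
    return SequenceMatcher(None, phrase.lower(), back.lower()).ratio()


def fake_translate(text: str, source: str, target: str) -> str:
    """Stand-in translator for demonstration: a tiny lookup table, no network calls."""
    table = {
        ("Good morning", "en", "de"): "Guten Morgen",
        ("Guten Morgen", "de", "en"): "Good morning",
    }
    # Unknown phrases are returned unchanged.
    return table.get((text, source, target), text)


score = round_trip_score("Good morning", "en", "de", fake_translate)
```

Running this over a handful of representative phrases and looking at the lowest scores by hand is probably more informative than relying on a single average.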

I originally planned for this to work with GPT 3.5, but I believe that will be too restrictive. You get a maximum of 4k tokens with 3.5, and I think that the 8k (then later 32k) offered with GPT4 will work much better. Right now, I have a working prototype which functions with just 4k under GPT 3.5, but it feels very restrictive, and I have had to make significant compromises to the text that's fed in.
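To make the 4k constraint concrete, here is a minimal sketch of trimming dictionary entries to fit a context budget. It is not how PolyGlot does it: the function names, the entry format, and the reserved amount are assumptions, and the 4-characters-per-token estimate is a rough heuristic rather than a real tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate assuming ~4 characters per token (a heuristic, not a real tokenizer)."""
    return max(1, (len(text) + 3) // 4)


def fit_entries(entries: list[str], context_limit: int, reserved: int) -> list[str]:
    """Keep entries in order until the estimated token budget is used up.

    `reserved` is the number of tokens set aside for instructions, grammar notes,
    and the model's reply, since prompt and reply share the same limit.
    """
    budget = context_limit - reserved
    kept: list[str] = []
    used = 0
    for entry in entries:
        cost = estimate_tokens(entry)
        if used + cost > budget:
            break  # the next entry would overflow the budget
        kept.append(entry)
        used += cost
    return kept


# Hypothetical dictionary lines in a "conword = gloss" format.
sample = ["kala = fish", "suno = sun", "telo = water"]
trimmed = fit_entries(sample, context_limit=4096, reserved=1500)
```

With a larger limit the same call simply keeps more entries, which is where most of the compromises on the text fed in would ease.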

The initial release of this tool will be text only, despite the multimodality of GPT4, but I will likely be implementing the inclusion of images into betas shortly after the release of PolyGlot 4.0.

GPT 4.0 is in very limited beta right now, but I believe that it will become available for limited-scope personal use in time. If not, then I will be forced to rethink how I will release this tool. No matter what, users will have to sign up for an API key to be able to use it and drop that into the settings of PolyGlot (although I will be including detailed instructions on how to do this, obviously). Signing up is free, although there is currently a line to get access. I myself only have access to the API for GPT 3.5 at the moment.

I am considering the implementation of tables into the grammar section, but I unfortunately made some bad decisions when I initially implemented it, and I have backed myself into a corner where all of my options involve a lot more work than I would like before the tool can truly move forward.

If you're working with GPT much yourself, I would love to have a deeper discussion about your findings and experiences.

rcochrane55 commented 1 year ago

@DraqueT Another question occurs to me: I've seen a lot of posts in conlang communities I run in about trying to teach ChatGPT a conlang, and having it work fine until you actually ask it to translate something and it starts translating to a (possibly unrelated) natlang. I frankly don't know a lot about the inner workings of GPT or the GPT API, but I am curious if this problem has come up in your testing of this feature. I'm also curious if GPT has trouble with a posteriori conlangs, particularly ones that strongly resemble their related natlang(s) in some ways. I have a suspicion that if I tried to teach GPT my conlang it would start translating to German instead.