Open MiaSelene opened 4 months ago
Sorry, this doesn't help you now, but it might one day, so I just want to share what I've learned from fine-tuning English.
After about 8 fine-tune runs and roughly two weeks of training data, the LLM started giving me the beginnings of multi-word correction suggestions, in the form of hyphenated words.
So if I set a=max, I'd be using only the LLM, which would address your concern.
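For anyone skimming later, here's a minimal sketch of how I understand that setting, assuming `a` is a blend weight between the classic letter-closeness score and the LLM's contextual score (the function name and the linear blend are my assumption, not the app's actual code):

```python
# Hypothetical sketch only: assumes "a" linearly blends the two ranking
# signals. a = 0.0 would mean letter-closeness only; a = 1.0 ("max")
# would mean LLM-only, which is the case described above.
def blended_score(llm_score: float, closeness_score: float, a: float) -> float:
    # Both inputs assumed normalized to [0, 1].
    return a * llm_score + (1.0 - a) * closeness_score
```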
Unfortunately, German doesn't have access to any of this yet, because the LLM settings only mention English. I wonder how models for the other languages get trained? What's the workflow?
Somewhat related to #380
Since the transformer lexicon is so big, exact matches are sometimes still nonsensical predictions in context, particularly for short words; sometimes this even happens when the words aren't exact matches. I'd recommend putting more weight on text prediction than on letter closeness.
Example (German-language keyboard): "Was ISF das für ein Gier?"
Both "ISF" and "Gier" are nonsense compared to the intended words "ist" and "Tier" (the intended sentence is "Was ist das für ein Tier?", "What kind of animal is that?").
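To illustrate the reweighting I'm suggesting, here's a rough sketch (Python; `lm_probability`, the candidate list, and the 0.7 weight are hypothetical stand-ins for the keyboard's real lexicon, model, and tuning, not its actual implementation):

```python
import difflib

def rank(typed: str, candidates: list[str], context: str,
         lm_probability, context_weight: float = 0.7) -> list[str]:
    """Rank candidates by a blend of contextual probability and
    letter closeness, with context weighted more heavily."""
    def score(word: str) -> float:
        closeness = difflib.SequenceMatcher(
            None, typed.lower(), word.lower()).ratio()
        return (context_weight * lm_probability(context, word)
                + (1 - context_weight) * closeness)
    return sorted(candidates, key=score, reverse=True)

# Toy model: a real LLM would assign "ist" a far higher probability
# than "ISF" after "Was".
toy_lm = lambda ctx, w: {"ist": 0.9, "Tier": 0.9}.get(w, 0.01)
print(rank("isf", ["ISF", "ist"], "Was _ das für ein Tier?", toy_lm))
# -> ['ist', 'ISF']: context outweighs the exact letter match on "ISF"
```

With a high enough context weight, the contextually sensible "ist" wins even though "ISF" is letter-for-letter closer to what was typed.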