InAnYan / jabref

Graphical Java application for managing BibTeX and biblatex (.bib) databases
https://devdocs.jabref.org
MIT License

Support internationalization #140

Open koppor opened 1 month ago

koppor commented 1 month ago

TL;DR on internationalization: https://www.baeldung.com/java-8-localization

Raised by @ThiloteE at https://github.com/JabRef/jabref/commit/2d64d01b30c39219df378fb65f4aeb2715c5a78b#r145145663.

Moving that to Week 12, because this is a hard one.
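The Baeldung TL;DR linked above boils down to Java's standard `ResourceBundle` mechanism. A minimal sketch of how localized prompt strings could be resolved per locale (bundle and key names here are hypothetical, not JabRef's actual `Localization` setup, which loads `.properties` files instead of classes):

```java
import java.util.ListResourceBundle;
import java.util.Locale;
import java.util.ResourceBundle;

public class PromptBundles {

    // Base bundle: English defaults.
    public static class Prompts extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {{"summary.prompt", "Summarize the following text:"}};
        }
    }

    // German variant; ResourceBundle finds it by the "_de" suffix.
    public static class Prompts_de extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {{"summary.prompt", "Fasse den folgenden Text zusammen:"}};
        }
    }

    public static void main(String[] args) {
        // Pin the default locale so fallback is deterministic in this demo.
        Locale.setDefault(Locale.ROOT);
        ResourceBundle base = ResourceBundle.getBundle("PromptBundles$Prompts", Locale.ROOT);
        ResourceBundle german = ResourceBundle.getBundle("PromptBundles$Prompts", Locale.GERMAN);
        System.out.println(base.getString("summary.prompt"));
        System.out.println(german.getString("summary.prompt"));
    }
}
```

The point of the mechanism is the fallback chain: if no bundle exists for the requested locale, lookup falls back to the base bundle, so untranslated prompts degrade to English rather than failing.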

ThiloteE commented 1 month ago

Easiest solution to support internationalization: Expose current hardcoded prompts in the UI under preferences and make them editable. Users will be able to translate the text to their own language.

koppor commented 1 month ago

> Easiest solution to support internationalization: Expose current hardcoded prompts in the UI under preferences and make them editable. Users will be able to translate the text to their own language.

Partially implemented.

A good switch would help if I need a German summary today and an English one tomorrow, maybe even five minutes later. This could be done by entering prompts manually, but it would be better if JabRef had sensible defaults and an easy way to switch.
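The "sensible defaults plus an easy switch" idea, combined with the editable prompts suggested above, could look roughly like this: a per-language default table consulted only when the user has not overridden the prompt in preferences. This is a sketch under stated assumptions; `PromptStore`, its method names, and the preference wiring are all hypothetical, not existing JabRef API:

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical prompt store: a user override from preferences wins over the
// per-language default, and unknown languages fall back to English.
public class PromptStore {

    private static final Map<Locale, String> DEFAULTS = Map.of(
            Locale.ENGLISH, "Summarize the following text:",
            Locale.GERMAN, "Fasse den folgenden Text zusammen:");

    // In a real implementation this map would be loaded from JabRef preferences.
    private final Map<Locale, String> userOverrides;

    public PromptStore(Map<Locale, String> userOverrides) {
        this.userOverrides = userOverrides;
    }

    public String summaryPrompt(Locale locale) {
        return Optional.ofNullable(userOverrides.get(locale))
                .orElse(DEFAULTS.getOrDefault(locale, DEFAULTS.get(Locale.ENGLISH)));
    }
}
```

Switching languages then means changing a single `Locale` argument rather than re-entering a prompt by hand, while manual editing remains possible through the override map.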

InAnYan commented 1 month ago

> A good switch would help if I need a German summary today and an English one tomorrow.

I think it would be enough to ask the LLM explicitly to respond in that language. Though, an experiment is needed.
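"Asking the LLM explicitly" would amount to prepending a language instruction to the otherwise hardcoded prompt. A trivial sketch (the helper and its wording are hypothetical, and as the next comment argues, this may not be reliable for every model):

```java
import java.util.Locale;

public class ExplicitLanguagePrompt {

    // Hypothetical helper: prefix an English prompt with an explicit
    // instruction naming the desired response language.
    static String withLanguageInstruction(String prompt, Locale target) {
        return "Respond only in " + target.getDisplayLanguage(Locale.ENGLISH) + ".\n" + prompt;
    }

    public static void main(String[] args) {
        System.out.println(withLanguageInstruction("Summarize the following text:", Locale.GERMAN));
    }
}
```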

ThiloteE commented 1 month ago

Models trained and fine-tuned on a specific language have a HIGH likelihood of responding in that particular language, even when asked to do otherwise. Every word skews the response towards a certain language. That means hardcoding English prompts will make responses in other languages less likely and responses in English more likely. To make minor languages work, you want to use every little advantage you can get. And by minor languages, I am talking about every language apart from English and Chinese. Even if the model responds in the correct language, overlapping activations in the weights will skew model responses in a particular direction.

I have experimented with open-weight models in GPT4All, including German and Chinese models. While the better models out there do a better job nowadays, the problem still exists under the hood.

I am very much in favour of not hardcoding.