JabRef / jabref

Graphical Java application for managing BibTeX and biblatex (.bib) databases
https://devdocs.jabref.org
MIT License

Internationalization of Title Case / Sentence Case formatter #9144

Open calixtus opened 2 years ago

calixtus commented 2 years ago

While looking into #9142, I saw that the title case and sentence case formatters currently support only English small words, such as "and", "or", and "to". JabRef does not look for words in other languages, such as German ('und', 'zu') or French ('et', 'pour'), that should not be capitalized in a title.

This should also work when JabRef is displayed in a different language than the title, e.g. when the JabRef UI is in German but the title being formatted is English.

One would also have to find out which languages have the concept of title case formatting at all.

sreenath-tm commented 2 years ago

As you mentioned, case handling is currently done only for English. To support multiple languages, we might just need to add the extra words to the fields we have labelled as smaller words or conjunctions. If we can get the full word lists for each category (articles, prepositions, and conjunctions) for the supported languages, we can add those as well.

If anyone can help identify the words in each category for all the supported languages, we can add them.
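
A minimal sketch of what per-language small-word lists could look like, assuming the title's language is passed in explicitly and is independent of the UI language. The class name, method name, and word lists here are hypothetical and deliberately incomplete; this is not JabRef's actual formatter code.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not JabRef's formatter: small words are looked up
// by the language of the title, not by the language of the UI.
final class SmallWordTitleCase {
    // Illustrative, incomplete word lists (assumption, not project data).
    private static final Map<String, Set<String>> SMALL_WORDS = Map.of(
            "en", Set.of("a", "an", "the", "and", "or", "to", "of", "in", "for"),
            "de", Set.of("der", "die", "das", "und", "oder", "zu", "von", "mit"),
            "fr", Set.of("le", "la", "les", "et", "ou", "pour", "de", "du"));

    static String toTitleCase(String title, String languageCode) {
        // Unknown languages get an empty set, so every word is capitalized.
        Set<String> small = SMALL_WORDS.getOrDefault(languageCode, Set.of());
        String[] words = title.trim().split("\\s+");
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            String lower = words[i].toLowerCase(Locale.ROOT);
            // The first and last word are always capitalized.
            boolean keepLower = i > 0 && i < words.length - 1 && small.contains(lower);
            if (i > 0) {
                out.append(' ');
            }
            out.append(keepLower ? lower : capitalize(lower));
        }
        return out.toString();
    }

    private static String capitalize(String word) {
        if (word.isEmpty()) {
            return word;
        }
        return word.substring(0, 1).toUpperCase(Locale.ROOT) + word.substring(1);
    }
}
```

With these lists, `toTitleCase("guerre et paix", "fr")` keeps "et" lowercase, while the same call with `"en"` capitalizes it. Note that the sketch lowercases every word first, so acronyms and protected text would need separate handling.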

mlep commented 2 years ago

If we want to support case changes for multiple languages, we first need to identify the language of the string. There seem to be libraries dedicated to this, such as https://github.com/shuyo/language-detection. We could also ask the people working on the localization of JabRef to provide the "small words".
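
To illustrate the "detect the language first" step, here is a naive stand-in that guesses a title's language by counting hits against short marker-word lists. It does not use the API of the library linked above; the class name, method name, and word lists are assumptions made for this sketch, and a real detector would be far more robust on short strings.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical stand-in for a real language detector: picks the language
// whose marker words occur most often in the title.
final class NaiveTitleLanguageGuesser {
    // Illustrative, incomplete marker lists (assumption, not project data).
    private static final Map<String, Set<String>> MARKERS = new LinkedHashMap<>();

    static {
        MARKERS.put("en", Set.of("the", "and", "of", "for", "with"));
        MARKERS.put("de", Set.of("der", "die", "das", "und", "von", "mit"));
        MARKERS.put("fr", Set.of("le", "la", "les", "et", "pour", "avec"));
    }

    static Optional<String> guess(String title) {
        // Split on anything that is not a letter.
        String[] words = title.toLowerCase(Locale.ROOT).split("[^\\p{L}]+");
        String best = null;
        int bestHits = 0;
        for (Map.Entry<String, Set<String>> entry : MARKERS.entrySet()) {
            int hits = 0;
            for (String word : words) {
                if (entry.getValue().contains(word)) {
                    hits++;
                }
            }
            // Strict comparison: on a tie, the earlier language wins.
            if (hits > bestHits) {
                bestHits = hits;
                best = entry.getKey();
            }
        }
        // Empty when no marker word occurs at all.
        return Optional.ofNullable(best);
    }
}
```

A title without any marker word, such as "Quantum chromodynamics", yields an empty result, in which case a formatter could fall back to a default language or leave the casing untouched.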

Siedlerchr commented 2 years ago

I think this is too broad for the moment. BibTeX does not support multiple languages; that is only possible in biblatex.

mlep commented 2 years ago

I agree with the "too broad for the moment". However, if you provide the right casing (and use a BibTeX style that does not alter it), BibTeX is fine with it, whatever the language of the string.

sreenath-tm commented 2 years ago

I feel the current approach would work for all languages, even without an explicit language detector. We might just need to handle them the same way we do for English, using the same fields extended with the extra articles, prepositions, and conjunctions for the respective languages.

delkc commented 1 year ago

This just got shared with me. Not sure if this issue is still in use, but it's best if we always use sentence case for page/section headers anyway.

Lattice content standards call for sentence case throughout, except for proper names/brand names.