Closed (forthrin closed this 2 years ago)
The issue with this is that there is no API to spell-check, but basically everything else (selecting lines, highlighting, changing words) is possible.
The same happens for me when programming or writing translation files where the variables are in English and the strings are in a different language.
As I write in both English and Portuguese, I combined the English dictionary with the Portuguese dictionary. So now I get spell checking in both languages. You may find this dictionary here:
@evandrocoan I just did this with my Norwegian and English. 322044 lines + 470122. Took me 5 minutes to c+p it with kate on linux. Lol
Can you please briefly but precisely describe how you do that? I'm interested. Thanks!
1. Copy EN_US.txt as EN_US_MY_LANG.txt
2. Copy EN_US.aff as EN_US_MY_LANG.aff
3. Copy EN_US.dic as EN_US_MY_LANG.dic
4. Open MY_LANG.txt and append its contents to EN_US_MY_LANG.txt.
5. Open MY_LANG.aff and merge its contents into EN_US_MY_LANG.aff using your intelligence.
6. Open MY_LANG.dic and append its contents to EN_US_MY_LANG.dic, then update the first line of EN_US_MY_LANG.dic with the correct number of words in this new file.
7. Select EN_US_MY_LANG as your default spell-checking language.
Now you get spell checking in two languages. But there are some downsides:
When merging the .aff files, you need to take care how you do it, otherwise it may crash Sublime Text.
@titoBouzout On Linux, I used the cat command with shell redirection (>) into my output file: cat english.dic norwegian.dic > output.dic. If I use my standard text editor, it crashes. Lshell did it in milliseconds. I suggest using Lshell or Cmd.
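The .dic part of this procedure (append the word lists, then fix the count on the first line) could be scripted roughly as follows. This is only a sketch: the function name merge_dic_entries is made up for illustration, it assumes both files were already read into text with the right encoding, and it does nothing about the .aff files, so any flags after "/" in an entry only make sense if the affix files were merged consistently by hand.

```python
def merge_dic_entries(dic_texts):
    """Merge the entries of several Hunspell-style .dic texts.

    Each input is the full text of a .dic file whose first line is the
    word count. The result has one recomputed count line followed by
    all entries, with exact duplicate lines dropped.
    """
    merged = []
    seen = set()
    for text in dic_texts:
        lines = text.splitlines()
        # Skip the old count line so it does not end up as a bogus "word".
        if lines and lines[0].strip().isdigit():
            lines = lines[1:]
        for line in lines:
            entry = line.strip()
            if entry and entry not in seen:
                seen.add(entry)
                merged.append(entry)
    # The first line of the merged file is the new number of entries.
    return "\n".join([str(len(merged))] + merged) + "\n"
```

Unlike a plain cat of the two files, this drops the second file's count line instead of leaving it in the middle of the word list.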
Thanks for the explanation! I'll try :)
Thank you, @evandrocoan!
I'll try to install your EN_PT dictionary here!
I cannot merge the Russian and English dictionaries; I get bad results. See part of the answer from a Hunspell contributor:
Bilingual spellchecker is not supported, at least not in reliable way. Merging dictionaries should be out of question.
Thanks.
@dimztimz is correct on this:
Instead, at API level you can instantiate multiple objects of the spellchecker with different languages. Then you can check the word in each object. This is the most reliable way for now.
Therefore, this has to be performed by the Sublime Text spell checking core, so we need to wait for them to implement this feature for the best functionality.
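The "one spellchecker object per language" idea from the quote can be sketched like this. WordListChecker is a hypothetical stand-in backed by a plain word set, not a real Hunspell binding; a real setup would wrap one Hunspell instance per language behind the same check method.

```python
class WordListChecker:
    """Hypothetical stand-in for one spellchecker object bound to one language."""

    def __init__(self, language, words):
        self.language = language
        self._words = {w.lower() for w in words}

    def check(self, word):
        # True if this language's word list contains the word.
        return word.lower() in self._words


def accepting_languages(word, checkers):
    """Return the languages whose checker accepts the word."""
    return [c.language for c in checkers if c.check(word)]
```

A word would then count as misspelled only when accepting_languages returns an empty list.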
Now I get good results, with some disadvantages, from merging the EN_PT dictionaries. However, these two languages are pretty similar. For English and Russian, it should not be easy to merge them, if it is possible at all.
Other programs use Hunspell too, just like Sublime Text, so you can try to find an extension made for another program. I tried using the Firefox Russian-English Bilingual add-on in Sublime Text, and it worked for me.
But for single-language spell checking, the LanguageTool package is the best solution, with many nice features.
Thanks.
And I thought I had it with
{
    "dictionary":
    [
        "Packages/Language - English/en_US.dic",
        "Packages/Language - Other/Portuguese (European).dic"
    ]
}
Alas, no.
Fixed in build 4123. "dictionary" can now be provided a list.
This doesn't work very reliably. Please see below a quick check of different dictionary combos (taken from here), and note how en_US.dic is a spoiler, though only for the Russian one :) (Changing the order of the dictionaries doesn't seem to matter.)
This is my complete settings file in a new portable Sublime on Windows, scroll horizontally to see the results for different dictionary combos
{
    "ignored_packages": ["Vintage",],
    "spell_check": true,
    "dictionary": [
        "Packages/Language - English/en_US.dic",        // 1
        "Packages/Language - English/en_GB.dic",        // 2
        "Packages/User/Dictionaries/German_de_DE.dic",  // 3
        "Packages/User/Dictionaries/Russian.dic",       // 4
    ]
}
// Ln  Text                                  1US 2GB 3DE 4Ru 1+2 1+3 1+4 2+3 2+4 3+4 1+3+4 2+3+4 1+2+3+4
// US  "Is htis in colors? That's insane!"   +   +   *   *   +   +   +   +   +   *   +     +     +
// GB  "Is htis in colours? That's insane!"  +   +   *   *   +   +   +   +   +   *   +     +     +
// De  "Rechtschreibe/stylistsische Fehler"  *   *   +   *   *   +   *   +   *   +   +     +     +
// Ru  "Превед, медвед, как дела"            -   *   *   +   -   -   -   *   +   +   -     +     -
//
// + highlights spellchecking errors in the Row's language (from the perspective of the Column's dictionary/ies)
// * same as +, but for a mismatching language (~the whole line is highlighted)
// - nothing is highlighted, even spellchecking errors
It's working fine here with the same dictionaries:
And by "working fine" you mean that it's not highlighting any spelling mistakes in the Russian line? All the example lines contain spelling errors, so no line should ever be free from red no matter the dictionary/ies
I think it's due to the wrong first line in the dictionary affix file: SET ISO8859-1, it should be UTF8.
Not sure if the dictionary has to be regenerated; it doesn't seem to be saved in UTF8, so likely yes. A simple text replace seems to be fine, though, and seems to fix the surface issue of no highlights in the Ru line (otherwise I haven't done any tests of how well any of these combos work).
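To see what encoding an affix file declares before touching it, the SET line can be read with a few lines of Python. This is a rough sketch: declared_encoding is a hypothetical helper, and it only reports what the file claims, not how the file's bytes are actually encoded.

```python
def declared_encoding(aff_text):
    """Return the value of the first SET directive in .aff text, or None."""
    for line in aff_text.splitlines():
        parts = line.split()
        # A SET line looks like "SET ISO8859-1": keyword, then encoding name.
        if len(parts) >= 2 and parts[0] == "SET":
            return parts[1]
    return None
```

Because the SET line itself is plain ASCII, the file can be decoded as latin-1 just to inspect it, for example declared_encoding(open("en_US.aff", encoding="latin-1").read()).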
Out of curiosity, are you combining the language files behind the scenes (having to deal with different affix schemes), or are you using a simpler API, sending each text to each dic and then combining the results somehow?
Ah yes, the en_US dictionary seems to be wrongly configured. It's unrelated to this issue, but I'll put in a fix.
There's no way to combine the languages (easily), so we just check if a sub-word is spelt correctly according to any of the listed dictionaries.
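The "correct according to any of the listed dictionaries" rule can be illustrated with a small sketch. This is not Sublime Text's implementation: dictionaries are modelled as plain sets of lowercase words, whole words are checked rather than sub-words, and misspelled_spans and the WORD pattern are names invented for this example.

```python
import re

# Runs of Unicode letters, optionally joined by apostrophes ("That's").
WORD = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")


def misspelled_spans(text, dictionaries):
    """Return (start, end) offsets of words that no dictionary accepts."""
    spans = []
    for match in WORD.finditer(text):
        word = match.group().lower()
        # A word is fine as soon as any one dictionary contains it.
        if not any(word in dictionary for dictionary in dictionaries):
            spans.append((match.start(), match.end()))
    return spans
```

With this rule, adding a dictionary can only reduce the number of highlighted words, which is why a line with spelling errors losing all its highlights points to a problem elsewhere, such as encoding handling.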
Upon further inspection we just seem to be handling the encoding wrong.
As I write in both English and Portuguese, I combined the English dictionary with the Portuguese dictionary. So now I have spell checking in both languages. You may find this dictionary here:
Do you still have this dictionary?
I moved it to this link: https://github.com/evandrocoan/LanguageEnglishAndPortuguese
Is this considered to be solved? Because I still have problems with it.
I have also tried with a UTF8 dic, no luck 😕
@stdedos the encoding problem was fixed in build 4125, so yes this should all be working.
I am using https://github.com/titoBouzout/Dictionaries/blob/master/Greek.dic with
"dictionary": [
    "Packages/Language - English/en_US.dic", // 1
    "Packages/Language - English/en_GB.dic", // 2
    "Packages/Greek.dic", // 3
    "Packages/Greek_UTF8.dic", // 3
    "Packages/User/Greek.dic", // 3
],
and text
What is Lorem Ipsum?
Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.
Γιατί το χρησιμοποιούμε;
Είναι πλέον κοινά παραδεκτό ότι ένας αναγνώστης αποσπάται από το περιεχόμενο που διαβάζει, όταν εξετάζει τη διαμόρφωση μίας σελίδας. Η ουσία της χρήσης του Lorem Ipsum είναι ότι έχει λίγο-πολύ μία ομαλή κατανομή γραμμάτων, αντίθετα με το να βάλει κανείς κείμενο όπως 'Εδώ θα μπει κείμενο, εδώ θα μπει κείμενο', κάνοντάς το να φαίνεται σαν κανονικό κείμενο. Πολλά λογισμικά πακέτα ηλεκτρονικής σελιδοποίησης και επεξεργαστές ιστότοπων πλέον χρησιμοποιούν το Lorem Ipsum σαν προκαθορισμένο δείγμα κειμένου, και η αναζήτησ για τις λέξεις 'lorem ipsum' στο διαδίκτυο θα αποκαλύψει πολλά web site που βρίσκονται στο στάδιο της δημιουργίας. Διάφορες εκδοχές έχουν προκύψει με το πέρασμα των χρόνων, άλλες φορές κατά λάθος, άλλες φορές σκόπιμα (με σκοπό το χιούμορ και άλλα συναφή).
Where does it come from?
Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source. Lorem Ipsum comes from sections 1.10.32 and 1.10.33 of "de Finibus Bonorum et Malorum" (The Extremes of Good and Evil) by Cicero, written in 45 BC. This book is a treatise on the theory of ethics, very popular during the Renaissance. The first line of Lorem Ipsum, "Lorem ipsum dolor sit amet..", comes from a line in section 1.10.32.
The standard chunk of Lorem Ipsum used since the 1500s is reproduced below for those interested. Sections 1.10.32 and 1.10.33 from "de Finibus Bonorum et Malorum" by Cicero are also reproduced in their exact original form, accompanied by English versions from the 1914 translation by H. Rackham.
Που μπορώ να βρω μερικές;
Υπάρχουν πολλές εκδοχές των αποσπασμάτων του Lorem Ipsum διαθέσιμες, αλλά η πλειοψηφία τους έχει δεχθεί κάποιας μορφής αλλοιώσεις, με ενσωματωμένους αστεεισμούς, ή τυχαίες λέξεις που δεν γίνονται καν πιστευτές. Εάν πρόκειται να χρησιμοποιήσετε ένα κομμάτι του Lorem Ipsum, πρέπει να είστε βέβαιοι πως δεν βρίσκεται κάτι προσβλητικό κρυμμένο μέσα στο κείμενο. Όλες οι γεννήτριες Lorem Ipsum στο διαδίκτυο τείνουν να επαναλαμβάνουν προκαθορισμένα κομμάτια του Lorem Ipsum κατά απαίτηση, καθιστώνας την παρούσα γεννήτρια την πρώτη πραγματική γεννήτρια στο διαδίκτυο. Χρησιμοποιεί ένα λεξικό με πάνω από 200 λατινικές λέξεις, συνδυασμένες με ένα εύχρηστο μοντέλο σύνταξης προτάσεων, ώστε να παράγει Lorem Ipsum που δείχνει λογικό. Από εκεί και πέρα, το Lorem Ipsum παραμένει πάντα ανοιχτό σε επαναλήψεις, ενσωμάτωση χιούμορ, μη κατανοητές λέξεις κλπ.
half lights up like a Christmas tree.
It's working fine here with the linked dictionary. I suggest double checking you've correctly installed that dictionary - both the aff and dic files are required and must not have their encodings modified.
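A quick way to double check the "both the aff and dic files are required" part is to look for a sibling .aff next to each .dic. The sketch below assumes real filesystem paths (not Sublime's virtual "Packages/..." paths), and missing_aff_files is a hypothetical helper name.

```python
from pathlib import Path


def missing_aff_files(dic_paths):
    """Return the .aff paths that do not exist next to the given .dic files."""
    missing = []
    for dic in map(Path, dic_paths):
        # The affix file is expected to share the .dic file's name and folder.
        aff = dic.with_suffix(".aff")
        if not aff.is_file():
            missing.append(aff)
    return missing
```

An empty result only shows that the files are present; it says nothing about whether their encodings were modified.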
Hi @BenjaminSchaaf
Can you tell me what is wrong with the Ukrainian dictionary?
When I add it to the "dictionary" list, the English one stops working.
{
    "dictionary": [
        "Packages/Language - English/en_US.dic",
        "Packages/User/Dictionaries/uk_UA.dic",
    ],
}
Thanks!
Sublime Text v4134 Windows 10
@ihor-oleks I suggest making a separate issue, thanks.
I use Sublime Text to write fiction in which the same text file may contain several languages, especially in lines spoken by characters of different nationalities, for example Norwegian, English and Japanese.
When using spell checking, it seems I must settle on a single language, which means text in other languages will be marked as "incorrectly spelled" and marked in red. Annoying and confusing.
What would be the best way to overcome this? Can support be added for multiple dictionaries per file, for example?
I do most of this writing in the Fountainhead package, if that makes it easier to come up with a solution (such as handling spoken lines in a particular way).