tmalsburg / guess-language.el

Emacs minor mode that detects the language you're typing in. Automatically switches spell checker. Supports multiple languages per document.

Cursor jumps when text is analysed #5

Closed: peterwvj closed this issue 7 years ago

peterwvj commented 7 years ago

I've noticed that the cursor jumps around while text is being analysed by guess-language. Although this is barely noticeable for small paragraphs, it becomes easier to see for larger ones. I've managed to produce this issue using a rather small Emacs configuration, which I store in ~/Desktop/tmp/init.el:

;; ~/Desktop/tmp/init.el
(package-initialize)

(autoload 'flyspell-mode "flyspell" t)
(setq ispell-dictionary "british")
(setq ispell-current-dictionary "british")
(add-hook 'text-mode-hook 'flyspell-mode)

(require 'guess-language)
(setq guess-language-languages '(en da))
(setq guess-language-min-paragraph-length 35)
(add-hook 'text-mode-hook (lambda () (guess-language-mode 1)))

Now, open an empty file (e.g. ~/Desktop/tmp/Demo.txt) in Emacs:

emacs -Q -l ~/Desktop/tmp/init.el ~/Desktop/tmp/Demo.txt

Initially, Emacs uses the british dictionary.

First, I start typing a Danish sentence. Eventually the number of characters reaches 35, which triggers the guess-language analysis. While the analysis is being performed, the cursor jumps between the "misspelled" words. Based on this analysis, guess-language correctly switches the dictionary to Danish.

Another way to make the cursor jump is to paste a large chunk of text that contains several "misspelled" words (according to the current dictionary). For example, delete the Danish sentence and paste a large chunk of "Lorem Ipsum": since none of these words are in the Danish dictionary, the cursor jumps a lot.

Below is a GIF that illustrates the scenario explained above:

[GIF: optimised]

I had to reduce the quality of the GIF, otherwise GitHub would not allow me to upload it. Let me know if the quality of the GIF is a problem.

System info:
GNU Emacs 26.0.50.2 (x86_64-pc-linux-gnu, GTK+ Version 3.18.9) of 2017-02-01
guess-language 20170204.311

joostkremers commented 7 years ago

Yes, I noticed the same thing. I guess it's due to the use of how-many in the function guess-language? Would it be possible to put that part of the function in a temp buffer?
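As an illustration of the temp-buffer idea, here is a minimal sketch that counts regexp matches on a copy of the text rather than in the visited buffer. The helper name `my/count-matches-in-temp-buffer` is hypothetical and not part of guess-language; only `how-many` and `with-temp-buffer` are standard Emacs. As the replies further down in the thread establish, this part of the code turned out not to be the source of the cursor movement.

```elisp
;; Sketch only: `my/count-matches-in-temp-buffer' is a hypothetical helper,
;; not part of guess-language.
(defun my/count-matches-in-temp-buffer (regexp beg end)
  "Count matches for REGEXP in the text between BEG and END.
The text is copied into a temporary buffer first, so the search
runs there and never moves point in the current buffer."
  (let ((text (buffer-substring-no-properties beg end)))
    (with-temp-buffer
      (insert text)
      ;; `how-many' returns the number of matches when called from Lisp.
      (how-many regexp (point-min) (point-max)))))
```

It could be tried with, for example, `(my/count-matches-in-temp-buffer "\\<the\\>" (point-min) (point-max))` evaluated via M-: in a text buffer.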

tmalsburg commented 7 years ago

Detecting the language is really really fast and produces no visual side-effects at all. You can try using M-: (guess-language-paragraph) which just detects the language and does nothing else. What you're seeing is flyspell checking the paragraph after the dictionary is changed. You get the same effect when you do M-x flyspell-buffer. So the problem is not caused by guess-language but by flyspell.

There are two possible solutions:

1.) We can switch the language without rechecking the paragraph. This would mean, however, that the first words in the paragraph would remain marked as incorrect even when they are not.
2.) Perhaps there is a way to teach flyspell to do spell-checking without moving the cursor around.

Solution 1 is easy but rather unsatisfying. Solution 2 would be clean but I don't know how difficult it is. Does flyspell have an issue tracker or mailing list?
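To check the claim that detection alone is fast and has no visual side effects, the forms in this sketch can be evaluated one at a time with M-: in a text buffer, with point inside a paragraph. It assumes guess-language is installed and on the load path. `benchmark-run` ships with Emacs; the repetition count of 100 is arbitrary.

```elisp
(require 'benchmark)
(require 'guess-language)

;; Detection only: per the comment above, this just detects the language
;; of the paragraph at point and does nothing else.
(guess-language-paragraph)

;; Time 100 detections. The result is a list of the form
;; (ELAPSED-SECONDS GC-RUNS GC-SECONDS).
(benchmark-run 100 (guess-language-paragraph))

;; For comparison: a full flyspell re-check, which is where the comment
;; above locates the visible cursor movement.
(flyspell-buffer)
```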

joostkremers commented 7 years ago

Yeah, I put the function in a with-temp-buffer and noticed it hadn't anything to do with the visual effect.

Flyspell comes with Emacs, so I guess emacs-devel would be the first place to ask.

I'm not sure solution 1 would be useful, because I get the effect when I open a text file: the first paragraph, or first two paragraphs, display the effect when I move point through them. Strangely enough, it doesn't seem to happen lower down in the text.

tmalsburg commented 7 years ago

Not sure I understand your argument against solution 1. Anyway, the cursor-effect doesn't happen in later paragraphs because rechecking is done only on the current paragraph and when a new language is detected.

joostkremers commented 7 years ago

> Not sure I understand your argument against solution 1.

Probably because of a misunderstanding on my part. Forget I said it.

Anyway, perhaps solution 1 could be user-configurable behaviour?

peterwvj commented 7 years ago

Thanks for your input. Apparently flyspell distinguishes between small and large regions, which are spell-checked using different strategies. By default, regions of fewer than 1000 characters are considered small. So I tried (setq flyspell-large-region 1) to make sure that all regions are considered large. Now I don't really notice the cursor jumping anymore. Let me know what you think.

joostkremers commented 7 years ago

> Anyway, perhaps solution 1 could be user-configurable behaviour?

On a related note, would it make sense to add an option to guess the language on a per-buffer basis, rather than a per-paragraph basis? I mean, I love this package because I write in three different languages and it's annoying having to switch the ispell dictionary all the time. But I rarely if ever use different-language paragraphs in a single buffer. When I do mix languages, it's usually a few words or phrases in language B contained within a paragraph, in a text that is otherwise completely language A.

tmalsburg commented 7 years ago

Great find, @peterwvj. I pushed a version in which flyspell is instructed to use the strategy for large texts (bad037525ac28855614731505e43dec886c43266). So there's no need to set this variable globally. There is a little bit of a flicker left but it's much less intrusive than the moving cursor. Nice! Issue solved?
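The general technique described here, applying the large-region strategy for a single re-check without changing the user's global setting, can be sketched with a dynamic `let` binding. This is an illustration, not the code from the commit; the function name `my/flyspell-paragraph-as-large-region` is hypothetical.

```elisp
(require 'flyspell)

;; Sketch only: a hypothetical command, not taken from guess-language.
(defun my/flyspell-paragraph-as-large-region ()
  "Re-check the paragraph at point using flyspell's large-region strategy."
  (interactive)
  (let ((flyspell-large-region 1) ; dynamic binding, in effect only inside this `let'
        (beg (save-excursion (backward-paragraph) (point)))
        (end (save-excursion (forward-paragraph) (point))))
    ;; Any region longer than 1 character now counts as "large".
    (flyspell-region beg end)))
```

Because `flyspell-large-region` is a user option, and therefore a special variable, the binding is visible to `flyspell-region` during the call and the global value is restored afterwards.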

tmalsburg commented 7 years ago

> But I rarely if ever use different-language paragraphs in a single buffer.

I'm a linguist and I do this all the time: main text in one language with example sentences from various other languages. But I'm also not sure what benefit we get from rechecking the full buffer instead of just the current paragraph. Could you elaborate?

peterwvj commented 7 years ago

Great @tmalsburg, thanks! I'll try it out once it's available on MELPA. And yes, I'm happy to consider this issue solved.

joostkremers commented 7 years ago

> I'm a linguist and I do this all the time: main text in one language with example sentences from various other languages. But I'm also not sure what benefit we get from rechecking the full buffer instead of just the current paragraph. Could you elaborate?

The idea was that guess-language only guesses once what the language of the buffer is, either when a file is visited or (with new files) as soon as it contains enough characters to make a successful guess. Then once the language has been established, it remains set as the buffer-local language.

But, as you point out, detection is quick, and now that you found a workaround to deal with the cursor movement, it doesn't really seem necessary to create such an option.

tmalsburg commented 7 years ago

Ah, I see. The purpose of this option would be primarily to save cycles, not to change behavior. Well, you can have that already:

(add-hook 'text-mode-hook 'guess-language-autoset)

This sets the language once when you open a file. This only works as desired when the file already has some content. But I'm guessing something similar could be hacked to deal with new files as well. For example, you could activate guess-language-mode through text-mode-hook and then defadvice guess-language-autoset such that guess-language-mode is deactivated when it's run for the first time. This way it only runs once in a buffer.
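A rough sketch of the run-once idea follows, using `advice-add` (the newer advice mechanism) in place of `defadvice`. The helper name is hypothetical, and the sketch assumes the installed version of guess-language defines `guess-language-autoset` and calls it when the mode detects a language; it has not been tested against any particular release.

```elisp
(require 'guess-language)

;; Hypothetical helper, meant to run as :after advice.
(defun my/guess-language-disable-after-first-run (&rest _args)
  "Turn off `guess-language-mode' in the current buffer."
  (when (bound-and-true-p guess-language-mode)
    (guess-language-mode -1)))

;; Enable the mode in text buffers, as in the original configuration.
(add-hook 'text-mode-hook (lambda () (guess-language-mode 1)))

;; After the first detection in a buffer, switch the mode off there.
(advice-add 'guess-language-autoset :after
            #'my/guess-language-disable-after-first-run)
```

Since `guess-language-mode` is a buffer-local minor mode, the advice only disables it in the buffer where detection just ran; other buffers keep detecting until their own first run.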

hugoroy commented 2 years ago

Hi all, commenting here because I am experiencing this issue of the cursor jumping around and causing strange glitches. I have tried various flyspell configurations to solve it, without success.

The *Messages* buffer is full of messages like those below, even when not much is going on:

Local Ispell dictionary set to en_GB
Detected language: English
Starting new Ispell process aspell with en_GB dictionary...done
Detected language: English [16 times]
Ispell process killed
Local Ispell dictionary set to fr_FR
Detected language: French
Starting new Ispell process aspell with fr_FR dictionary...done
Detected language: French [33 times]
Ispell process killed
Local Ispell dictionary set to en_GB
Detected language: English
Starting new Ispell process aspell with en_GB dictionary...done
Detected language: English [6 times]

A screen recording showing the issue from the user's perspective is available here: https://storage.5apps.com/hugo/public/shares/210823-0834-Capture%20d%E2%80%99%C3%A9cran%20vid%C3%A9o%20de%2023-08-2021%2010%3A30%3A40.webm

It may seem minor here, but in real use it can be quite annoying.

tmalsburg commented 2 years ago

Hi, I agree that this is annoying, but it doesn't happen on my system. Could you try to make a reproducible example for emacs -Q? Thank you.
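A starting point for such a reproducible example, modelled on the init file from the original report, might look like the sketch below. The file name `repro-init.el` and the en/fr language pair are assumptions chosen to match the log messages above. It also assumes guess-language was installed through package.el, which is typically not the case for a Doom Emacs setup, so the load path may need adjusting.

```elisp
;; repro-init.el (hypothetical file name)
;; Start Emacs with: emacs -Q -l repro-init.el some-file.txt
(package-initialize)

;; Enable flyspell in text buffers.
(add-hook 'text-mode-hook #'flyspell-mode)

;; Minimal guess-language setup for English and French.
(require 'guess-language)
(setq guess-language-languages '(en fr))
(setq guess-language-min-paragraph-length 35)
(add-hook 'text-mode-hook (lambda () (guess-language-mode 1)))
```

If the cursor still jumps with this configuration, the file together with a short sample text would make the report reproducible for others.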

tmalsburg commented 2 years ago

@hugoroy could you try installing guess-language from GitHub? There is a chance that the ELPA version is outdated and doesn't have the fix yet.

hugoroy commented 2 years ago

Hello @tmalsburg, the issue persists with the version from GitHub. I am trying to make a reproducible example, but I'm not experienced enough with Emacs to get there quickly... I have a fairly new setup using Doom Emacs.