Closed petRUShka closed 4 years ago
Thanks for reporting this. I'll have a look at it.
It seems that you can simply add to vimrc
let g:vimtex_grammar_vlty.shell_options = '--encoding cp866'
With your minimal.tex, I then only get a message from \selectlanguage{russian}
, since this macro is still missing in yalafi. So, the filter just includes 'russian', and LanguageTool complains about the sentence starting lower-case.
Furthermore, if I add two full stops at the end of the Russian sentence, then this is also correctly reported by LanguageTool.
Unfortunately, I cannot fully test your example, as my terminal is UTF-8 encoded, so that Vim does not correctly display the Russian sentence in the converted file test_cp866.tex.
Remark: If possible, add a country code to spelllang, e.g., ru_RU.
EDIT: If the first solution from above does work, we can decide if we read Vim's variable fileencoding and pass it to yalafi by default.
Thanks for reporting this. I'll have a look at it.
And thanks from me for responding!
My gut feeling is that it might not be so easy to directly translate between Vim settings for the encoding and the desired input for the --encoding
option of Yalafi. But you're the expert here, so I think I'll let you be the judge of that. In any case, it does not seem so bad to add the encoding
key as a possibility for the vlty
config dictionary.
@matze-dd with shell_option
workaround it is working fine but with some warnings:
Expected text language: Russian
Working on STDIN...
=== test.tex ===
1.) Line 6, column 17, Rule ID: UPPERCASE_SENTENCE_START
Message: Это предложение не начинается с заглавной буквы
Suggestion: Russian
russian Случайный текст на русском
^^^^^^^
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.inputenc'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.babel'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.inputenc'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.babel'
But having this option for whole vim installation is good but still it is a huge inconvenience in general because some files are in UTF-8 encoding and some files are in different ones.
Remark: If possible, add a country code to spelllang, e.g., ru_RU.
You mean set spellang = ru_RU
?
@petRUShka
@matze-dd with shell_option workaround it is working fine but with some warnings ...
Great. I will add the commonly used packages babel and inputenc to the TODO list of yalafi. After fixing, this will avoid the warnings and the problem with \selectlanguage{russian}.
Assume you have a mistake in the Russian sentence, for instance Случайный текст на руусском
, and the cursor is on the corresponding message line in the quickfix window. If you hit Enter, does the cursor now jump to the beginning of руусском
in the text window? I'm asking that because - at least in my system - there seems to be a problem with cursor location on lines containing non-ASCII characters (for instance, UTF-8 encoded German "umlauts" or left and right double quotes). In the quickfix window message, the correct (virtual) column number of the problem is given, but jumping there sometimes does not work properly.
But having this option for whole vim installation is good but still it is a huge inconvenience in general because some files are in UTF-8 encoding and some files are in different ones.
I understand. We will have to check whether reading the fileencoding variable works. Looking at @lervag's comments above, we might encounter problems. I will try to sort that out.
You mean set spellang = ru_RU?
Yes, at least for English texts, spell check is only done by LanguageTool, if a country code is given. But perhaps, this is not necessary for Russian.
In your minimal.tex, you write ... set spelllang=ru,en ...
. Please be aware that vlty currently only recognises the first language specification from variable spelllang.
A last question. As mentioned above, my Vim does not correctly display the Russian sentence in file test_cp866.tex. Do you know what to do in order to fix that? My default encoding (terminal window and so on) is UTF-8. On :set fileencoding
, I get fileencoding=latin1
. This probably means that Vim cannot figure out the encoding, and I cannot find a command-line option to specify it. Modifying fileencoding in the opened file does not help, either.
@lervag
And thanks from me for responding!
As this is your project, I hope it wasn't impolite.
My gut feeling is that it might not be so easy to directly translate between Vim settings for the encoding and the desired input for the --encoding option of Yalafi.
I'll try to have a look. To get it work, at least fileencoding
should be local to the buffer, and the code names should coincide between Vim and Python.
... it does not seem so bad to add the encoding key as a possibility for the vlty config dictionary.
Agreed. But I also understand the wish of @petRUShka.
@matze-dd, @lervag thanks for your work!
I'm using following vim-code to autodetect encoding of file:
if has("multi_byte")
if &termencoding == ""
let &termencoding = &encoding
endif
set encoding=utf-8
setglobal fileencoding=utf-8 bomb
set fileencodings=utf-8,cp1251,cp866,koi8-r,latin1
set fileformats=unix,mac,dos
endif
In case of cp866 files the fileencoding is automatically set to fileencoding=cp866
.
Positioning is bad in Cyrillic case. So in case of English sentence it is at the right place and in case of Russian sentence it seems to be on half way to right place (26 instead of 51 by vim count).
@petRUShka Thank you! With the following version, I now can correctly see the Russian sentence, coded in cp866:
...
set fileencodings=utf-8,cp866,koi8-r,latin1
...
The positioning problem seems to be a general one:
@lervag (this might well be a separate Issue) Did you already see this positioning problem somewhere? I have a similar problem with German 'umlauts', for instance with the (English UTF-8) text
This is ä testx.
Here, the ä
is ignored by LanguageTool, and the mistake testx
is correctly reported at (virtual) column 11 in the quickfix window. Hitting Enter, however, moves the cursor to column 10 (between ä
and testx
) in the text window. In errorformat
the column number has been passed as virtual column number with %v
, but somehow this is not respected when jumping from the quickfix to the text window. The number now is apparently taken as (byte-oriented) column.
In the past, I have already browsed a bit the Web, but could not find any related report. For texts where almost all characters are non-ASCII, as in the ones from @petRUShka, this becomes quite visible. Column numbers and virtual column numbers almost never are the same.
If we cannot find a solution within Vim, I could rather simply add an output format version to yalafi.shell that reports the (byte oriented) column number. We could read it with %c
into the errorformat
, and then jumping to the error should be correct.
And thanks from me for responding!
As this is your project, I hope it wasn't impolite.
No, I'm just glad to have more people pitch in and help! :)
... it does not seem so bad to add the encoding key as a possibility for the vlty config dictionary.
Agreed. But I also understand the wish of @petRUShka.
Yes, I agree, this is a good point. I don't see immediately how to do this, though, except perhaps to somehow parse the \selectlanguage{...}
command or options to the \usepackage{babel}
to set the language for a given project.
Here, the ä is ignored by LanguageTool, and the mistake testx is correctly reported at (virtual) column 11 in the quickfix window. Hitting Enter, however, moves the cursor to column 10 (between ä and testx) in the text window. In errorformat the column number has been passed as virtual column number with %v, but somehow this is not respected when jumping from the quickfix to the text window. The number now is apparently taken as (byte-oriented) column.
Ah, I think %v
is not what you think. Or, well, at least no fully. If you add a tab, the %v
will work as intended. So, %v
is not for handling multibyte characters, but for handling characters that are printed in multiple columns. I.e., sort of the opposite of what you want here.
Yes, I agree, this is a good point. I don't see immediately how to do this, though, except perhaps to somehow parse the \selectlanguage{...} command or options to the \usepackage{babel} to set the language for a given project.
If I understand @petRUShka right, only the file encoding should be detected automatically. It seems that variable fileencoding
does give the right information. For instance, assume you open two files in a split window, one is UTF-8 encoded, and the other one Latin-1. Then saying
:echo &l:fileencoding
gives the correct result for both files, depending on the cursor position.
My proposal would be to add a key encoding
to the vlty config dictionary, with default value 'utf-8'
. The user can set it to another specification, and to '*'
(or 'auto'
?) for using the encoding indicated by variable fileencoding
. I could propose a PR.
Ah, I think %v is not what you think. Or, well, at least no fully. If you add a tab, the %v will work as intended. So, %v is not for handling multibyte characters, but for handling characters that are printed in multiple columns. I.e., sort of the opposite of what you want here.
Looking at the Vim repository, and for instance here, it appears to me that this has been extended to multi-byte characters. In fact, it does work, if the message scanned by errorformat
is a single-line message (this is also what they check in their tests). I'll open a new Issue, as this is a side-issue to the encoding problem.
@matze-dd, I see it exactly as you described. But may be it would be better to have auto
by default. Why anyone should have utf-8
if he has opened file in different encoding and vim knows about it?
If I understand @petRUShka right, only the file encoding should be detected automatically. It seems that variable
fileencoding
does give the right information.
Yes, sorry, I was not thinking straight.
My proposal would be to add a key
encoding
to the vlty config dictionary, with default value'utf-8'
. The user can set it to another specification, and to'*'
(or'auto'
?) for using the encoding indicated by variablefileencoding
. I could propose a PR.
I think this sounds reasonable. I think the "auto" setting will be useful here, and as @petRUShka I think it might be the sensible default.
I'll be very happy to comment on a PR for this!
Yes, sorry, I was not thinking straight.
No problem. This is often my habit, too :) But sometimes, one unexpectedly has a good idea when thinking weirdly.
I think this sounds reasonable. I think the "auto" setting will be useful here, and as @petRUShka I think it might be the sensible default. I'll be very happy to comment on a PR for this!
OK, I'll try it next week, ребята!
Haha, yes, I agree some ideas come from "curved" thinking :)
Sounds good, looking forward to seeing a PR!
@matze-dd has pushed a PR that I just merged. Does this solve your issue, @petRUShka?
On my config works great:
:!python3 -m yalafi.shell --lt-command languagetool --encoding cp866 --language ru --disable "WHITESPACE_RULE" --enable "" --disablecategories "" --enablecateg
ories "" --documentclass "" --packages "amsbsy,keyval,fontenc,enumerate,eufrak,calc,inputenc,url,amsopn,babel,bm,amssymb,epstopdf-base,trig,amsmath,revsymb,ams
thm,array,longtable,caption2,natbib,amstext,mathtext,graphics,amsgen,graphicx,ifthen,caption3,extsizes,amsfonts" 'file_cp866.tex' 2>&1
| tee /tmp/vz3DWXB/15
P.S. Only vlty config:
let g:vimtex_grammar_vlty = {'lt_command': 'languagetool'}
Great, I'm happy to hear it! Thanks to @matze-dd for implementing the feature!
Issue Suppose we have tex-file in non utf-8 encoding and want to check grammar. vimtex together with vlty don't respect fileencoding so I get an error:
minimal.vim
minimal.tex
don't forget to convert file to cp866 encoding
Commands/Input Setup vim and
text_cp866.tex
like above. Run vim on test_cp866.tex and execute:Observed Behaviour
Expected Behaviour
But yalafi.shell has an option
--encoding
which works well. So it is sufficient to check onfileencoding
vim setting and send its value to yalafi via--encoding
option. And\or acceptOutput from VimtexInfo