dense-analysis / ale

Check syntax in Vim/Neovim asynchronously and fix files, with Language Server Protocol (LSP) support
BSD 2-Clause "Simplified" License

Not displaying lint errors from languagetool for Mandarin texts #2852

Open Aryailia opened 4 years ago

Aryailia commented 4 years ago

Information

VIM version: NVIM v0.4.2 (Build type: Release)

Operating System: Void Linux x64

What went wrong

I'm editing a document that contains Mandarin text and checking it for grammar errors with 'languagetool'. My guess is that ALE does not parse the output properly when 'languagetool' reports an error whose context contains Mandarin text.

When I specify 'zh' as the language, nothing happens, but when I specify 'en-GB', it works as intended. Both 1) running 'languagetool' from the command line and 2) running :ALELint and then checking :ALEInfo show that errors should be reported whether the language is 'zh' or 'en-GB'.

Reproducing the bug

  1. Download [LanguageTool 4.7 desktop version](https://languagetool.org/download/LanguageTool-4.7.zip) and openjdk 1.8.0_202

  2. relevant lines in vimrc

    let b:ale_languagetool_executable = 'java'
    let b:ale_languagetool_options = '-jar ${XDG_DATA_HOME}/LanguageTool-4.7
    \/languagetool-commandline.jar -l zh'
    let g:ale_linters = {'markdown': ['languagetool']}
    let g:ale_linters_explicit = 1

    (Change ale_languagetool_options to -l en-GB to see Ale reporting lint errors properly)

  3. Sample text called 'c.md'

    
    一墙之隔的两个。较差的阿三翻阅

    I love to write paeprs I would write one every day if I had the time.

  4. Run :ALELint

:ALEInfo

    Current Filetype: markdown
    Available Linters: ['alex', 'languagetool', 'markdownlint', 'mdl', 'proselint', 'redpen', 'remark_lint', 'textlint', 'vale', 'writegood']
    Linter Aliases:
    'remark_lint' -> ['remark-lint']
    'writegood' -> ['write-good']
    Enabled Linters: ['languagetool']
    Suggested Fixers:
      'prettier' - Apply prettier to a file.
      'remove_trailing_lines' - Remove all blank lines at the end of a file.
      'textlint' - Fix text files with textlint --fix
      'trim_whitespace' - Remove all trailing whitespace characters at the end of every line.
    Linter Variables:

    let g:ale_markdown_remark_lint_executable = 'remark'
    let g:ale_markdown_remark_lint_options = ''
    let g:ale_markdown_remark_lint_use_global = 0
    Global Variables:

    let g:ale_cache_executable_check_failures = v:null
    let g:ale_change_sign_column_color = 0
    let g:ale_command_wrapper = ''
    let g:ale_completion_delay = 100
    let g:ale_completion_enabled = 1
    let g:ale_completion_max_suggestions = 50
    let g:ale_echo_cursor = 1
    let g:ale_echo_msg_error_str = 'Error'
    let g:ale_echo_msg_format = '%code: %%s'
    let g:ale_echo_msg_info_str = 'Info'
    let g:ale_echo_msg_warning_str = 'Warning'
    let g:ale_enabled = 1
    let g:ale_fix_on_save = 0
    let g:ale_fixers = {}
    let g:ale_history_enabled = 1
    let g:ale_history_log_output = 1
    let g:ale_keep_list_window_open = 0
    let g:ale_lint_delay = 200
    let g:ale_lint_on_enter = 0
    let g:ale_lint_on_filetype_changed = 1
    let g:ale_lint_on_insert_leave = 1
    let g:ale_lint_on_save = 1
    let g:ale_lint_on_text_changed = 'never'
    let g:ale_linter_aliases = {}
    let g:ale_linters = {'markdown': ['languagetool']}
    let g:ale_linters_explicit = 1
    let g:ale_list_vertical = 0
    let g:ale_list_window_size = 10
    let g:ale_loclist_msg_format = '%code: %%s'
    let g:ale_lsp_root = {}
    let g:ale_max_buffer_history_size = 20
    let g:ale_max_signs = -1
    let g:ale_maximum_file_size = v:null
    let g:ale_open_list = 0
    let g:ale_pattern_options = v:null
    let g:ale_pattern_options_enabled = v:null
    let g:ale_set_balloons = 0
    let g:ale_set_highlights = 1
    let g:ale_set_loclist = 1
    let g:ale_set_quickfix = 0
    let g:ale_set_signs = 1
    let g:ale_sign_column_always = 0
    let g:ale_sign_error = '✖'
    let g:ale_sign_info = 'ℹ'
    let g:ale_sign_offset = 1000000
    let g:ale_sign_style_error = '✖'
    let g:ale_sign_style_warning = '⚠'
    let g:ale_sign_warning = '⚠'
    let g:ale_sign_highlight_linenrs = 0
    let g:ale_statusline_format = v:null
    let g:ale_type_map = {}
    let g:ale_use_global_executables = v:null
    let g:ale_virtualtext_cursor = 1
    let g:ale_warn_about_trailing_blank_lines = 1
    let g:ale_warn_about_trailing_whitespace = 1
    Command History:

    (executable check - success) java
    (finished - exit code 0) ['/bin/bash', '-c', '''java'' -jar ${XDG_DATA_HOME}/LanguageTool-4.7/languagetool-commandline.jar -l zh ''/home/rai/Documents/test/c.md''']

    <<<OUTPUT STARTS>>>
    1.) Line 1, column 6, Rule ID: wa5[4]
    Message: 数词与名词之间一般应存在量词,可能缺少量词。
    一墙之隔的两个。较差的阿三翻阅 I love to write paeprs I would writ...
    ^^

    2.) Line 1, column 11, Rule ID: wb4[2]
    Message: 动词的修饰一般为‘形容词(副词)+地+动词’。您的意思是否是:差 '地' 翻阅
    Suggestion: 地
    一墙之隔的两个。较差的阿三翻阅 I love to write paeprs I would write on...
    ^
    Time: 2034ms for 36 sentences (17.7 sentences/sec)
    <<<OUTPUT ENDS>>>

w0rp commented 4 years ago

Maybe the regular expression just needs to be updated.

bratekarate commented 4 years ago

This is not limited to Mandarin. As far as I can tell from the source code, suggestions are not included at all. I tried adding them because I needed the suggestions, but the code then broke unexpectedly in some cases. Apparently some errors have no suggestions, so the current design of the script, which merges separately matched lists by index, cannot handle them. Hence the comment in the code :)

    " We just check that the arrays are same sized and merge everything
    " together

Probably it has to be reworked with JSON responses, because the folks from LanguageTool are apparently not that much into consistent single line responses...
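A rough sketch of what the JSON-based rework mentioned above could look like. It is written in Python rather than ALE's Vim script, purely to illustrate the logic. It assumes the command-line jar can emit JSON (for example through a `--json` flag) shaped like the LanguageTool HTTP API response: a top-level `matches` list whose entries carry `message`, `offset`, `length`, `replacements` and `rule.id`. The function names and the sample data are hypothetical, and the sample is hand-written, not captured output.

```python
import json


def offset_to_line_col(text, offset):
    """Convert a 0-based character offset into a 1-based (line, column) pair."""
    before = text[:offset]
    line = before.count("\n") + 1
    # rfind returns -1 when there is no earlier newline, so this also
    # handles offsets on the first line.
    col = offset - (before.rfind("\n") + 1) + 1
    return line, col


def matches_to_items(json_output, text):
    """Turn LanguageTool-style JSON (assumed shape) into ALE-like items."""
    items = []
    for match in json.loads(json_output).get("matches", []):
        line, col = offset_to_line_col(text, match["offset"])
        message = match["message"]
        # "replacements" may be empty; the error is still reported.
        suggestions = [r["value"] for r in match.get("replacements", [])]
        if suggestions:
            message += " Suggestion: " + "; ".join(suggestions)
        items.append({
            "lnum": line,
            "col": col,  # counted in characters here, not bytes
            "end_col": col + match["length"] - 1,
            "type": "W",
            "code": match["rule"]["id"],
            "text": message,
        })
    return items


# Hand-written sample mirroring the two errors from the report above.
SAMPLE_TEXT = "一墙之隔的两个。较差的阿三翻阅\n"
SAMPLE_JSON = json.dumps({"matches": [
    {"message": "数词与名词之间一般应存在量词,可能缺少量词。",
     "offset": 5, "length": 2, "replacements": [],
     "rule": {"id": "wa5[4]"}},
    {"message": "动词的修饰一般为‘形容词(副词)+地+动词’。",
     "offset": 10, "length": 1, "replacements": [{"value": "地"}],
     "rule": {"id": "wb4[2]"}},
]})

for item in matches_to_items(SAMPLE_JSON, SAMPLE_TEXT):
    print(item)
```

Because each match carries its own replacement list, an error without suggestions no longer affects the others.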

I could share my temporary solution, but it is arguably even more confusing when suggestions are shown sometimes and then disappear as soon as one error does not include a suggestion.

EDIT: @Aryailia you can use this patch to include suggestions. Will work in most cases with English, I don't know how good suggestions are for Mandarin. As mentioned before, it will print no suggestions at all if any of the errors does not contain suggestions. In that case, you can use :ALEInfo to see the suggestions and fix the error without suggestions. After you fix that error, you will see suggestions again. Hope this helps.

Here is the patch:

diff --git a/autoload/ale/handlers/languagetool.vim b/autoload/ale/handlers/languagetool.vim
index 73974ce..d473d04 100644
--- a/autoload/ale/handlers/languagetool.vim
+++ b/autoload/ale/handlers/languagetool.vim
@@ -27,6 +27,11 @@ function! ale#handlers#languagetool#HandleOutput(buffer, lines) abort
     let l:message_pattern = '^\vMessage. (.+)$'
     let l:message_matches = ale#util#GetMatches(a:lines, l:message_pattern)

+    " Match lines like:
+    " Suggestion: Suggestion 1; Suggestion 2; Suggestion 3; ...
+    let l:suggest_pattern = '^\v(Suggestion. .+)$'
+    let l:suggest_matches = ale#util#GetMatches(a:lines, l:suggest_pattern)
+
     " Match lines like:
     "   ^^^^^ "
     let l:markers_pattern = '^\v *(\^+) *$'
@@ -45,13 +50,19 @@ function! ale#handlers#languagetool#HandleOutput(buffer, lines) abort
     \       (len(l:head_matches) == len(l:markers_matches))
     \       && (len(l:head_matches) == len(l:message_matches))
     \   )
+        let l:text = l:message_matches[l:i][1]
+
+        if (len(l:head_matches) == len(l:suggest_matches))
+            let l:text .= ' ' . l:suggest_matches[l:i][1]
+        endif
+
         let l:item = {
         \   'lnum'    : str2nr(l:head_matches[l:i][1]),
         \   'col'     : str2nr(l:head_matches[l:i][2]),
         \   'end_col' : str2nr(l:head_matches[l:i][2]) + len(l:markers_matches[l:i][1])-1,
         \   'type'    : 'W',
         \   'code'    : l:head_matches[l:i][3],
-        \   'text'    : l:message_matches[l:i][1]
+        \   'text'    : l:text
         \}
         call add(l:output, l:item)
         let l:i+=1
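
The limitation of the patch above (suggestions are dropped whenever the number of `Suggestion:` lines differs from the number of errors) comes from matching each line type into a separate list and merging by index. One possible alternative is to walk the output once and attach every line to the most recent error header. Below is a minimal sketch of that idea, in Python rather than Vim script, purely as an illustration. The regular expressions are loosely modelled on the handler's patterns, the function name is hypothetical, and the sample lines are taken from the report with the caret indentation assumed.

```python
import re

HEAD = re.compile(r"^\d+\.\) Line (\d+), column (\d+), Rule ID: (.+)$")
MESSAGE = re.compile(r"^Message: (.+)$")
SUGGESTION = re.compile(r"^Suggestion: (.+)$")
MARKERS = re.compile(r"^ *(\^+) *$")


def parse_output(lines):
    """Group LanguageTool text output into one dictionary per error."""
    items = []
    current = None
    for line in lines:
        head = HEAD.match(line)
        if head:
            # A header line starts a new error block.
            current = {
                "lnum": int(head.group(1)),
                "col": int(head.group(2)),
                "code": head.group(3),
                "text": "",
                "suggestion": None,  # stays None when no suggestion follows
                "length": 1,
            }
            items.append(current)
            continue
        if current is None:
            continue  # skip anything before the first error header
        message = MESSAGE.match(line)
        suggestion = SUGGESTION.match(line)
        markers = MARKERS.match(line)
        if message:
            current["text"] = message.group(1)
        elif suggestion:
            current["suggestion"] = suggestion.group(1)
        elif markers:
            current["length"] = len(markers.group(1))
    return items


# Lines from the report; the indentation of the caret lines is assumed.
SAMPLE_LINES = [
    "1.) Line 1, column 6, Rule ID: wa5[4]",
    "Message: 数词与名词之间一般应存在量词,可能缺少量词。",
    "一墙之隔的两个。较差的阿三翻阅 I love to write paeprs I would writ...",
    "     ^^",
    "",
    "2.) Line 1, column 11, Rule ID: wb4[2]",
    "Message: 动词的修饰一般为‘形容词(副词)+地+动词’。您的意思是否是:差 '地' 翻阅",
    "Suggestion: 地",
    "一墙之隔的两个。较差的阿三翻阅 I love to write paeprs I would write on...",
    "          ^",
    "Time: 2034ms for 36 sentences (17.7 sentences/sec)",
]

for item in parse_output(SAMPLE_LINES):
    print(item)
```

With this grouping, the first error simply has no suggestion while the second keeps its own, so one error without a suggestion does not hide the suggestions of the others.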

@w0rp Are the missing suggestions a known issue? Should I open a separate issue covering all languages in general?