vim / vim

The official Vim repository
https://www.vim.org
Vim License
36.03k stars 5.38k forks source link

Fix HTML indent lasttick update #3673

Open muhmuhten opened 5 years ago

muhmuhten commented 5 years ago

Since 7.4.345 (2014-06-25), when the check was introduced, the html indent checks and sets b:hi_lasttick inconsistently:

   993   if prevnonblank(v:lnum - 1) == b:hi_indent.lnum && b:hi_lasttick == b:changedtick - 1
  1062   let b:hi_lasttick = b:changedtick

This results in always recomputing a fresh state when the buffer has not changed, and very rarely may attempt to reuse a state after a buffer change. The only case where this is likely to be correct is when indenting the line immediately following a line whose indentation was just changed by the indenter, and can mistakenly reuse state when exactly one unrelated change has occurred in the meantime.

Always using a fresh state almost always gets the right indentation, but has noticeably awful O(n^2) performance when reindenting the entirety of HTML files with many lines.

The fix is trivial. However, I have not checked that the state update in HtmlIndent() is correct and matches the logic in FreshState(); it seems fine on casual inspection, but there may be bugs lurking there that haven't had a chance to manifest in the past four years because state reuse occurs so rarely.

diff --git a/indent/html.vim b/indent/html.vim
index 6c86659..ed1373e 100644
--- a/indent/html.vim
+++ b/indent/html.vim
@@ -232,15 +232,15 @@ call s:AddITags(s:indent_tags, [

 " New HTML5 elements:
 call s:AddITags(s:indent_tags, [
-    \ 'area', 'article', 'aside', 'audio', 'bdi', 'canvas',
-    \ 'command', 'data', 'datalist', 'details', 'embed', 'figcaption',
-    \ 'figure', 'footer', 'header', 'keygen', 'main', 'mark', 'meter',
-    \ 'nav', 'output', 'picture', 'progress', 'rp', 'rt', 'ruby', 'section',
-    \ 'summary', 'svg', 'time', 'track', 'video', 'wbr'])
+    \ 'article', 'aside', 'audio', 'bdi', 'canvas', 'command', 'data',
+    \ 'datalist', 'details', 'dialog', 'embed', 'figcaption', 'figure',
+    \ 'footer', 'header', 'hgroup', 'main', 'mark', 'meter', 'nav', 'output',
+    \ 'picture', 'progress', 'ruby', 'section', 'summary', 'svg', 'time',
+    \ 'video'])

 " Tags added for web components:
 call s:AddITags(s:indent_tags, [
-    \ 'content', 'shadow', 'template'])
+    \ 'slot', 'template'])
 "}}}

 " Add Block Tags: these contain alien content
@@ -1059,7 +1059,7 @@ func! HtmlIndent()
     endif
   endif

-  let b:hi_lasttick = b:changedtick
+  let b:hi_lasttick = b:changedtick - 1
   call extend(b:hi_indent, b:hi_newstate, "force")
   return indent
 endfunc "}}}
muhmuhten commented 5 years ago

An alternative would be to change the check instead of making b:hi_lasttick misleadingly named, which is more in line with the example in the documentation for b:changedtick.

My rationale for making the change this way is motivated by backward compatibility:

setl inde=FixedHtmlIndent()
func! FixedHtmlIndent()
        let indent = HtmlIndent()
        let b:hi_lasttick = b:changedtick - 1
        return indent
endfunc

Placed in ~/.vim/after, this snippet monkey-patches the current version of the code to reuse state appropriately. Changing the assignment to b:hi_lasttick merely makes this code redundant, without breaking anything. It is comparatively difficult to write a script which will similarly smooth over the transition to b:hi_lasttick == b:changedtick.

muhmuhten commented 5 years ago

There appears to be an inconsistency such that when the previous line

then the freshly computed state's baseindent is based on the closing tag, while HtmlIndent's incremental update computes baseindent based on the tags being opened. This leads to a behaviour difference between the two modes of operation. A small example:


<p>
https://thesaurus.weblio.jp/content/%E6%B0%97<wbr>%E3%81%8C<wbr>%E5%A4%A7<wbr>%E3%81%8D<wbr>%E3%81%8F<wbr>%E3%81%AA<wbr>%E3%82%8B</p>
                                                This line gets indented inconsistently.```

Notice the element in the example, `<wbr>`, is one of the HTML 5 tags which is mistakenly listed as a paired tag, but is in fact a void element. This edge case occurs only when tags expected to close are not closed, which can only correctly occur for self-closing tags.

I have amended the patch above to remove the following self-closing HTML 5 tags from the list of indented tags: area, keygen, rp, rt, track, wbr.

In addition, the following tags *may* be self-closing under certain conditions: body, dd, dt, head, html, li, optgroup, option, td, tfoot, th, thead, tr. However, all of these are likely to contain contents which it may be desirable to indent.
brammool commented 5 years ago

Since 7.4.345 (2014-06-25), when the check was introduced, the html indent checks and sets b:hi_lasttick inconsistently:

   993   if prevnonblank(v:lnum - 1) == b:hi_indent.lnum && b:hi_lasttick == b:changedtick - 1
  1062   let b:hi_lasttick = b:changedtick

This results in always recomputing a fresh state when the buffer has not changed, and very rarely may attempt to reuse a state after a buffer change. The only case where this is likely to be correct is when indenting the line immediately following a line whose indentation was just changed by the indenter, and can mistakenly reuse state when exactly one unrelated change has occurred in the meantime.

Always using a fresh state almost always gets the right indentation, but has noticeably awful O(n^2) performance when reindenting the entirety of HTML files with many lines.

The fix is trivial. However, I have not checked that the state update in HtmlIndent() is correct and matches the logic in FreshState(); it seems fine on casual inspection, but there may be bugs lurking there that haven't had a chance to manifest in the past four years because state reuse occurs so rarely.

diff --git a/runtime/indent/html.vim b/runtime/indent/html.vim
index bece6614b..03bf207b9 100644
--- a/runtime/indent/html.vim
+++ b/runtime/indent/html.vim
@@ -1059,7 +1059,7 @@ func! HtmlIndent()
     endif
   endif

-  let b:hi_lasttick = b:changedtick
+  let b:hi_lasttick = b:changedtick - 1
   call extend(b:hi_indent, b:hi_newstate, "force")
   return indent
 endfunc "}}}

I started looking into this and improving a few things, but quickly got lost in the undocumented parts. I'll have to get back to this when I have more time.

I'll include the parts that should be OK.

-- ARTHUR: What are you going to do. bleed on me? "Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD

/// Bram Moolenaar -- Bram@Moolenaar.net -- http://www.Moolenaar.net \\ /// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\ \\ an exciting new programming language -- http://www.Zimbu.org /// \\ help me help AIDS victims -- http://ICCF-Holland.org ///