textmate / html.tmbundle

TextMate support for HTML

html grammar should allow space characters (including line feed) before the > in end tags #97

Open alexr00 opened 5 years ago

alexr00 commented 5 years ago

From @joshunger on December 9, 2018 22:13

Issue Type: Bug

Syntax highlighting for HTML is incorrect: the text is rendered in white.

[Image: screenshot of the incorrect highlighting]

Minimal example:

<!DOCTYPE html>
<html lang="en">
  <head>
    <title>foo</title>
  </head>
  <body>
    <div></div>
    <script src="http://dummy"></script
    ><script src="http://dummy"></script
    ><!--[if lte IE 9]><script src="http://dummy"></script><![endif]-->
  </body>
</html>

An extension formatted the file this way, but the broken syntax highlighting appears to be a bug in Code?

VS Code version: Code - Insiders 1.30.0-insider (85f805acf9436381b878cab5ab6c5146beec7893, 2018-12-05T12:13:44.894Z)
OS version: Darwin x64 18.2.0

System Info

|Item|Value|
|---|---|
|CPUs|Intel(R) Core(TM) i5-7267U CPU @ 3.10GHz (4 x 3100)|
|GPU Status|2d_canvas: enabled<br>checker_imaging: disabled_off<br>flash_3d: enabled<br>flash_stage3d: enabled<br>flash_stage3d_baseline: enabled<br>gpu_compositing: enabled<br>multiple_raster_threads: enabled_on<br>native_gpu_memory_buffers: enabled<br>rasterization: enabled<br>video_decode: enabled<br>video_encode: enabled<br>webgl: enabled<br>webgl2: enabled|
|Load (avg)|3, 3, 3|
|Memory (System)|16.00GB (0.14GB free)|
|Process Argv|.|
|Screen Reader|no|
|VM|0%|

Extensions (18)

Extension|Author (truncated)|Version
---|---|---
vscode-eslint|dba|1.7.0
xml|Dot|2.3.2
EditorConfig|Edi|0.12.5
prettier-vscode|esb|1.7.2
flow-for-vscode|flo|0.8.5
python|ms-|2018.11.0
vscode-jest|Ort|2.9.2
vscode-docker|Pet|0.4.0
java|red|0.35.0
sass-indented|rob|1.4.9
vscode-fileutils|sle|2.13.3
code-spell-checker|str|1.6.10
sort-lines|Tyr|1.7.0
vscode-java-debug|vsc|0.15.0
vscode-java-dependency|vsc|0.2.0
vscode-java-pack|vsc|0.5.0
vscode-java-test|vsc|0.11.1
vscode-maven|vsc|0.11.3

Copied from original issue: Microsoft/vscode#64699

alexr00 commented 5 years ago

The HTML grammar we use doesn't handle the closing greater-than sign (>) of an end tag being on the next line, like this:

</script
><script src="http://dummy"></script
><!--[if lte IE 9]><script src="http://dummy"></script><![endif]-->

Is this something that you expect to be ok? It isn't a standard way of formatting html as far as I know.
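
A rough illustration of why a line break before the > can trip up a highlighter: TextMate grammars apply their regular expressions one line at a time, so a pattern that expects the whole end tag on a single line never sees both halves together. The sketch below is in Python, and the `END_TAG` pattern is a hypothetical stand-in, not the pattern html.tmbundle actually uses.

```python
import re

# Hypothetical single-pattern end-tag matcher; NOT taken from html.tmbundle.
END_TAG = re.compile(r"</script\s*>")

source = '<script src="http://dummy"></script\n><script src="http://dummy"></script\n>'

# Line-by-line matching, as a TextMate-style tokenizer would do it:
# no individual line contains a complete end tag.
per_line_hits = [bool(END_TAG.search(line)) for line in source.split("\n")]

# Matching against the whole text: \s also matches the line feed,
# so both end tags are found.
whole_text_hits = len(END_TAG.findall(source))
```

Under these assumptions, `per_line_hits` is false for every line while `whole_text_hits` counts both end tags, which suggests a fix would need a rule that can stay open across lines rather than a single-line match.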

alexr00 commented 5 years ago

From @joshunger on December 10, 2018 12:27

I was wondering that too.

It seems to validate at https://validator.w3.org/.

Is this the correct spec? See https://www.w3.org/TR/html5/syntax.html#end-tags

End tags must have the following format:

  1. The first character of an end tag must be a U+003C LESS-THAN SIGN character (<).
  2. The second character of an end tag must be a U+002F SOLIDUS character (/).
  3. The next few characters of an end tag must be the element’s tag name.
  4. After the tag name, there may be one or more space characters.
  5. Finally, end tags must be closed by a U+003E GREATER-THAN SIGN character (>).

And https://www.w3.org/TR/html5/infrastructure.html#space-characters says -

The space characters, for the purposes of this specification, are U+0020 SPACE, U+0009 CHARACTER TABULATION (tab), U+000A LINE FEED (LF), U+000C FORM FEED (FF), and U+000D CARRIAGE RETURN (CR).
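
Read together, the two quoted passages can be sketched as a small checker. This is only an illustration in Python: `is_end_tag` is a made-up helper name, and the tag-name pattern is a simplification assumed for the sketch, not the spec's full definition.

```python
import re

# "<", "/", tag name, optional space characters, ">".
# The character class lists the five space characters from the quoted
# definition: SPACE, tab, LF, FF, and CR.
# Assumption: tag names are ASCII letters and digits starting with a letter.
END_TAG = re.compile(r"</([A-Za-z][A-Za-z0-9]*)[ \t\n\f\r]*>")


def is_end_tag(text: str) -> bool:
    """Return True if the whole string is an end tag in the quoted format."""
    return END_TAG.fullmatch(text) is not None
```

With this reading, `is_end_tag("</script\n>")` is true because LINE FEED is one of the listed space characters, while `is_end_tag("</ script>")` is false because the tag name must directly follow the solidus.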

:thinking:

What does Code use for html syntax?

alexr00 commented 5 years ago

The html formatter is here, but I don't know all the details. The html grammar we use for syntax highlighting is here: https://github.com/textmate/html.tmbundle.

You mentioned that you're using an extension to provide formatting: the built-in VS Code HTML formatting fixed those strange end tags for me. Based on the spec you linked, they're allowed; they just aren't what I expected to see.

Since the issue is with the grammar, I can forward this issue to https://github.com/textmate/html.tmbundle.