taoqf / node-html-parser

A very fast HTML parser, generating a simplified DOM, with basic element query support.
MIT License

Keep original last slash and space in tag #133

Closed: sanak closed this issue 3 years ago

sanak commented 3 years ago

Currently, the trailing slash in a tag seems to be dropped, as shown below. If possible, keeping it would be quite helpful.

[Before parse]

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
:
<br/>

[After parse/toString]

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" >
:
<br>

Also, keeping the original whitespace inside a tag would be helpful. The current result is as follows.

[Before parse]

<body >

[After parse/toString]

<body>

taoqf commented 3 years ago

It will not be easy to do so, especially keeping the original space. I am curious why this means something to you.

mackignacio commented 3 years ago

@taoqf I think we don't need to implement this, because meta tags are void elements (singletons). They don't need a closing tag, as explained here.

nonara commented 3 years ago

@sanak Thanks for the suggestion!

This library produces what is called an Abstract Syntax Tree, which essentially means that it gives us a general representation of the DOM structure. In such models, certain nuances, such as the exact whitespace within tags and/or the self-closing slash, are not usually preserved.

We may add a property to identify a tag which is self-closed in the future, but for now, if you need the exact preserved syntax for the node, I recommend using the node.range property, which will allow you to slice the exact node from the original source text.

Using that, you should be able to determine if there is a slash before the closing tag. Hope that helps!
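
As a rough illustration of that idea, here is a minimal sketch, not code from the library. The helper names (openingTagText, hasSelfClosingSlash, hasSpaceBeforeClose) are hypothetical, and it assumes node.range holds [start, end] offsets into the original source string. It scans only the opening tag, skipping over quoted attribute values so that a ">" inside a value is not mistaken for the end of the tag.

```typescript
// Hypothetical helpers, not part of node-html-parser.
// Assumption: `range` is the [start, end] pair reported by node.range,
// expressed as offsets into the original source text.
type SourceRange = readonly [number, number];

/** Return the text of the opening tag of the element that starts at range[0]. */
function openingTagText(source: string, range: SourceRange): string {
  const [start, end] = range;
  let quote: string | null = null;
  for (let i = start; i < end; i++) {
    const ch = source[i];
    if (quote !== null) {
      // Inside a quoted attribute value: only the matching quote ends it.
      if (ch === quote) quote = null;
    } else if (ch === '"' || ch === "'") {
      quote = ch;
    } else if (ch === ">") {
      return source.slice(start, i + 1);
    }
  }
  // No unquoted ">" found; fall back to the whole slice.
  return source.slice(start, end);
}

/** True if the opening tag was written with a trailing slash, e.g. <br/>. */
function hasSelfClosingSlash(source: string, range: SourceRange): boolean {
  return /\/\s*>$/.test(openingTagText(source, range));
}

/** True if whitespace precedes the closing ">" or "/>", e.g. <body >. */
function hasSpaceBeforeClose(source: string, range: SourceRange): boolean {
  return /\s\/?>$/.test(openingTagText(source, range));
}

// Possible usage with the library (assumes the node exposes `range`):
//   const root = parse(source);
//   const br = root.querySelector("br");
//   if (br) hasSelfClosingSlash(source, br.range);
```

One known limitation of this sketch: an unquoted attribute value that ends in a slash (for example href=/foo/) would be reported as self-closed, so it may need refinement for such input.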

sanak commented 3 years ago

@nonara Okay, thanks for the information.

@taoqf

I am curious why this means something to you.

Sorry for the very late reply. I wanted to minimize the diff from the following PR changes: https://github.com/Siedlerchr/types-ol-ext/pull/33/files#diff-eeadd2a33c0319ff7602a0211481ac37eedee26a9038204e5dcb1ee9c839a6d4

Here is an example of the current diff. It includes some changes to the trailing slash, but these can be removed using @nonara's node.range suggestion.

--- a/examples/bar/map.control.bar.html
+++ b/examples/bar/map.control.bar.html
@@ -8,12 +8,12 @@

 ------------------------------------------------------------>
    <title>ol-ext: Control bar</title>
-   <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+   <meta http-equiv="Content-Type" content="text/html; charset=utf-8" >

-   <meta name="description" content="ol.control.Bar is a control bar that contains controls." />
-   <meta name="keywords" content="ol3, control, bar, panel, ol3, openlayers, interaction" />
+   <meta name="description" content="ol.control.Bar is a control bar that contains controls." >
+   <meta name="keywords" content="ol3, control, bar, panel, ol3, openlayers, interaction" >

-   <link rel="stylesheet" href="../style.css" />
+   <link rel="stylesheet" href="../style.css" >

    <!-- jQuery -->
    <script type="text/javascript" src="https://code.jquery.com/jquery-1.11.0.min.js"></script>
@@ -21,13 +21,13 @@
    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css">

    <!-- Openlayers -->
-    <link rel="stylesheet" href="https://openlayers.org/en/latest/css/ol.css" />
-   <script type="text/javascript" src="https://openlayers.org/en/latest/build/ol.js"></script>
+    <link rel="stylesheet" href="https://openlayers.org/en/latest/css/ol.css" >
+   
    <script src="https://cdn.polyfill.io/v2/polyfill.min.js?features=requestAnimationFrame,Element.prototype.classList,URL,Object.assign"></script>

    <!-- ol-ext -->
-    <link rel="stylesheet" href="../../dist/ol-ext.css" />
-  <script type="text/javascript" src="../../dist/ol-ext.js"></script>
+    <link rel="stylesheet" href="../../dist/ol-ext.css" >
+  
   <!-- Pointer events polyfill for old browsers, see https://caniuse.com/#feat=pointer -->
   <script src="https://unpkg.com/elm-pep"></script>

@@ -38,7 +38,7 @@
    </style>

 </head>
-<body >
+<body>
    <a href="https://github.com/Viglino/ol-ext" class="icss-github-corner"><i></i></a>

    <a href="../../index.html">
@@ -46,11 +46,11 @@
    </a>
    <div class="info">
        <i>ol.control.Bar</i> is a panel that contains other controls. Control bar can be nested.
-       <br/>
+       <br>
        You can choose the position for the control bar.
-       <br/>
+       <br>
        It can group <i>ol.control.Toggle</i> with a toggleOne propertie to have only one activated at a time. You can compose toolbars with it.  
-       <br/>
+       <br>
        A sub-bar can be nested to add options/controls visible when the parent <i>ol.control.Toggle</i> is active (see the <a href="map.control.subbar.html">sub-bar example</a>).
    </div>

@@ -73,73 +73,11 @@
                </select>
            </li>
        </ul>
-       Information:<br />
+       Information:<br>
        <textarea id="info" style="width:25em; height:10em"></textarea>
    </div>
    :