taoqf / node-html-parser

A very fast HTML parser, generating a simplified DOM, with basic element query support.
MIT License

feat: void tag closing slash and space #207

Closed: pyxide closed this 2 years ago

pyxide commented 2 years ago

Hi,

I have added two options to format the serialization of void tags (e.g. br, link). The use case was comparing two states of an HTML document, before and after DOM transformations: the way void tags were serialized was generating noise in the diff.

The configuration can be set with environment variables:

export HTML_VOID_TAG_CLOSING_SLASH=1
export HTML_VOID_TAG_CLOSING_SPACE='always'

Options, if present, then override them each time parse is called:

{
  voidTag: {
    closingSlash: true,     // void tag serialisation: add a final slash, e.g. <br/>
    closingSpace: 'always'  // space before the final slash: 'never', 'always' or 'attrPresent'
                            // with 'attrPresent': <br/>, <meta charset="UTF-8" />
  }
}
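To make the intended behaviour of these two options concrete, here is a minimal sketch of how they could combine. It is not the code from this PR: the names `resolveVoidTagOptions` and `serializeVoidTag` are hypothetical, and so are the assumptions that `closingSpace` falls back to 'never' when unset, that only the value '1' enables the slash, and that `closingSpace` is ignored when `closingSlash` is off. The environment is passed in as a plain map (for example `process.env` in Node).

```typescript
// Hypothetical sketch of the option semantics described above; not the library's code.
type ClosingSpace = 'never' | 'always' | 'attrPresent';

interface VoidTagOptions {
  closingSlash: boolean;
  closingSpace: ClosingSpace;
}

type Env = Record<string, string | undefined>;

// Assumption: any unknown or missing value falls back to 'never'.
function parseClosingSpace(value: string | undefined): ClosingSpace {
  switch (value) {
    case 'always':
      return 'always';
    case 'attrPresent':
      return 'attrPresent';
    default:
      return 'never';
  }
}

// Environment variables supply the defaults; explicit options override them.
function resolveVoidTagOptions(env: Env, options: Partial<VoidTagOptions> = {}): VoidTagOptions {
  const defaults: VoidTagOptions = {
    closingSlash: env.HTML_VOID_TAG_CLOSING_SLASH === '1', // assumption: only '1' enables it
    closingSpace: parseClosingSpace(env.HTML_VOID_TAG_CLOSING_SPACE),
  };
  return { ...defaults, ...options };
}

// Serialize one void tag; `attrs` is the already-rendered attribute string ('' if none).
function serializeVoidTag(name: string, attrs: string, opts: VoidTagOptions): string {
  const head = attrs === '' ? `<${name}` : `<${name} ${attrs}`;
  if (!opts.closingSlash) {
    return `${head}>`; // assumption: closingSpace has no effect without the slash
  }
  const needsSpace =
    opts.closingSpace === 'always' ||
    (opts.closingSpace === 'attrPresent' && attrs !== '');
  return `${head}${needsSpace ? ' ' : ''}/>`;
}
```

With the two environment variables shown earlier and no explicit options, `serializeVoidTag('br', '', resolveVoidTagOptions(env))` would give `<br />` under this sketch; passing `{ closingSpace: 'attrPresent' }` as options would give `<br/>` instead.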

I don’t like using global variables to set contextual data, but I could not find an easy way to pass the options at serialisation time. They should either be propagated from the root element, or passed in and dispatched when calling toString.
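As an illustration of the second alternative, passing the options to toString and dispatching them to children, here is a small self-contained sketch. The `SketchElement` class, the `SerializeOptions` interface and the shortened `VOID_TAGS` list are all hypothetical and deliberately simplified (no attributes, no escaping); they do not reflect node-html-parser's actual classes.

```typescript
// Hypothetical sketch: serialisation options handed down through toString,
// so that no global state is needed. Not node-html-parser's actual API.
interface SerializeOptions {
  closingSlash: boolean;
}

// Shortened list of void tags, for illustration only.
const VOID_TAGS = new Set(['br', 'hr', 'img', 'input', 'link', 'meta']);

class SketchElement {
  readonly tag: string;
  readonly children: Array<SketchElement | string>;

  constructor(tag: string, children: Array<SketchElement | string> = []) {
    this.tag = tag;
    this.children = children;
  }

  // The caller passes options once at the root; each node forwards them to its children.
  toString(options: SerializeOptions = { closingSlash: false }): string {
    if (VOID_TAGS.has(this.tag)) {
      return options.closingSlash ? `<${this.tag}/>` : `<${this.tag}>`;
    }
    const inner = this.children
      .map((child) => (typeof child === 'string' ? child : child.toString(options)))
      .join('');
    return `<${this.tag}>${inner}</${this.tag}>`;
  }
}

const paragraph = new SketchElement('p', ['a', new SketchElement('br'), 'b']);
console.log(paragraph.toString({ closingSlash: true }));
```

The trade-off in this design is that every toString override has to accept and forward the options argument, whereas propagating from the root element would store the options once on the tree.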

katherine11 commented 2 years ago

Hello @pyxide,

This fix is essential for a project we are working on. Thank you!

I hope the pull request will be published soon!

taoqf commented 2 years ago

Defining a global variable may not be a good idea. I added this new feature based on this PR, and a new version, v5.4.0, has been released to npm. I don't check issues very often because our government is blocking us, but I will try to keep this library going. Thank you all very much, and @nonara as always.

pyxide commented 2 years ago

Hi @taoqf,

Thank you for merging this PR. I am not a fan of global variables either. I am on vacation right now, but I will take time to look at the code and revisit this issue with a better solution backed by unit tests. Have a nice day.