taoqf / node-html-parser

A very fast HTML parser, generating a simplified DOM, with basic element query support.
MIT License

Set options on HTMLParser for global use #93

Closed DasdWaller closed 3 years ago

DasdWaller commented 3 years ago

Thanks for the package, it is very helpful to me. I am trying to parse some HTML, but it comes in several separate pieces:

const options = {
  lowerCaseTagName: false,
  comment: false,
  blockTextElements: {
    script: true,
    noscript: true,
    style: true,
    pre: true
  }
};
var parse = HTMLParse();
var html1 = parse.parse(html1, options);
var html2 = parse.parse(html2, options);
var html3 = parse.parse(html3, options);
var html4 = parse.parse(html4, options);
var html5 = parse.parse(html5, options);
var html6 = parse.parse(html6, options);
...

I expect something like:


var parse = HTMLParse(options);
var html1 = parse.parse(html1);
var html2 = parse.parse(html2);
var html3 = parse.parse(html3);
var html4 = parse.parse(html4);
var html5 = parse.parse(html5);
var html6 = parse.parse(html6);
...

taoqf commented 3 years ago

That is very easy to do:

import parse, { Options } from 'node-html-parser';

function createYourParseFun(options: Options) {
  // The returned closure remembers `options`, so callers only pass the HTML.
  return (html: string) => {
    return parse(html, options);
  };
}

var yourparsefun = createYourParseFun(options);

var html1 = yourparsefun(html1);
var html2 = yourparsefun(html2);
var html3 = yourparsefun(html3);
var html4 = yourparsefun(html4);
var html5 = yourparsefun(html5);
var html6 = yourparsefun(html6);

Or you can wrap it in a class:

import parse, { Options } from 'node-html-parser';

class YourParser {
  constructor(private options: Options) { }
  public parse(html: string) {
    return parse(html, this.options);
  }
}

var parser = new YourParser(options);

var html1 = parser.parse(html1);
var html2 = parser.parse(html2);
var html3 = parser.parse(html3);
var html4 = parser.parse(html4);
var html5 = parser.parse(html5);
var html6 = parser.parse(html6);

Both ways will work.

Note that I only export a parse function that has no state. It is very simple: you pass the parameters in and the function gives you the result, no matter how many times you call it.
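Because the function keeps no state, fixing the options once and reusing the wrapper is safe. The same idea can be written as a generic helper. This is only a sketch: `withOptions`, `FakeOptions` and `fakeParse` are hypothetical names made up for illustration, and `fakeParse` is a stand-in so the snippet runs without the package installed. It is not node-html-parser's `parse`.

```typescript
// Generic helper (hypothetical): fix the second argument of a two-argument function.
function withOptions<I, O, R>(
  fn: (input: I, options: O) => R,
  options: O
): (input: I) => R {
  // The closure captures `options`; nothing is stored on `fn` itself.
  return (input: I) => fn(input, options);
}

// Stand-in for a stateless parser (assumption, not the real library API).
interface FakeOptions {
  lowerCaseTagName: boolean;
}

function fakeParse(html: string, options: FakeOptions): string {
  // Lowercases the whole input when the flag is set, otherwise returns it unchanged.
  return options.lowerCaseTagName ? html.toLowerCase() : html;
}

// Two independent wrappers with different options can coexist.
const parseLower = withOptions(fakeParse, { lowerCaseTagName: true });
const parseAsIs = withOptions(fakeParse, { lowerCaseTagName: false });
```

With the real package, the same helper would presumably be called as `withOptions(parse, options)`, which amounts to the `createYourParseFun` approach above.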

Good luck, and if you have any ideas about this, please let me know.

taoqf commented 3 years ago

Closing as there has been no response.