-
I am currently rewriting the existing HTML parser, which is more than 5 years old. The existing parser is a fork of [node-fast-html-parser](https://github.com/ashi009/node-fast-html-parser)…
-
#faleiToLeve
-
The HTML spec specifies that some elements cannot be nested inside certain other elements (e.g. `` cannot go under ``).
The task is to find the complete list, use some judgment to decide if it makes sense for…
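One way such a list could be represented once it has been compiled is a simple lookup table. The sketch below is only an illustration: the names `FORBIDDEN_CHILDREN` and `may_nest` are hypothetical, and the entries are a small hand-picked subset of well-known restrictions, not the complete list the task asks for.

```python
# Illustrative subset only: parent tag -> tags that are not allowed directly
# inside it. The complete list has to be derived from the HTML spec.
FORBIDDEN_CHILDREN = {
    "p": {"div", "p", "ul", "ol", "table", "h1", "h2", "h3"},  # <p> takes phrasing content only
    "a": {"a"},            # links may not contain links
    "button": {"button", "a"},  # no interactive content inside a button
    "form": {"form"},      # forms may not contain forms
}


def may_nest(parent: str, child: str) -> bool:
    """Return False if `child` is listed as forbidden directly under `parent`."""
    return child.lower() not in FORBIDDEN_CHILDREN.get(parent.lower(), set())
```

A parser could consult `may_nest` when it opens a tag and decide whether to reject the child, auto-close the parent, or accept it anyway; which of these makes sense is the judgment call mentioned above.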
-
Started long ago. Different repo?
Recursive crawling, with status JSON files that track the recursive processing of a folder and the HTML download.
Needs a proper text grab that doesn't create duplicates, filters nice…
-
It would be nice if the HTML parser could capture data using XPath.
thx
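To make the request concrete, here is a rough sketch of what XPath-style capture could look like. It uses Python's standard `xml.etree.ElementTree`, which supports only a limited XPath subset and needs well-formed markup; it is not this project's parser, and the `capture` helper and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup.
DOC = """<html><body>
<div class="item"><a href="/a">First</a></div>
<div class="item"><a href="/b">Second</a></div>
<div class="other"><a href="/c">Third</a></div>
</body></html>"""


def capture(markup: str, path: str) -> list:
    """Return (href, text) pairs for every element matched by `path`."""
    root = ET.fromstring(markup)
    return [(el.get("href"), el.text) for el in root.findall(path)]


# Select only the links inside <div class="item"> elements.
links = capture(DOC, ".//div[@class='item']/a")
```

The point is the shape of the API: one path expression in, a list of captured values out.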
-
Building on what has already been done with [compiler.py](src/fdom/compiler.py):
* Tag names and attribute keys/values can be arbitrary lists of chunks/thunks, much like `*args`. This allows for wr…
-
Since some users will be able to edit certain pages by indirectly writing HTML in our editor, there is a potential risk that someone could manage to inject malicious or annoying code. At present, …
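One common mitigation for this kind of risk is allowlist-based sanitizing of the submitted HTML. The sketch below shows the idea with Python's standard `html.parser`; the allowlist, the class name `AllowlistSanitizer`, and the `sanitize` helper are assumptions made for illustration, and a maintained sanitizer library would be a safer choice in production than this minimal version.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist; everything else is dropped.
ALLOWED_TAGS = {"p", "b", "i", "em", "strong", "ul", "ol", "li"}


class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
        elif tag in ALLOWED_TAGS and not self.skip_depth:
            self.out.append(f"<{tag}>")  # all attributes are discarded

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in ALLOWED_TAGS and not self.skip_depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data))  # re-escape text content


def sanitize(markup: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(markup)
    parser.close()  # flush any buffered text
    return "".join(parser.out)
```

Allowed tags are kept without attributes, disallowed tags are removed while their text is kept, and the contents of `<script>` and `<style>` are dropped entirely.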
-
Hi,
just found out about this lib, and it seems very interesting. What I really like is the render-agnostic approach!
I have a couple of questions that I hope you can answer:
1. am I right in thinkin…
-
https://www.compart.com/en/unicode/U+2588 says this character is `&block;`
Alas,
```
$ echo █|perl -C -MHTML::Entities -nwle 'print encode_entities $_;'
█
```
the module is only aware of…
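As a point of comparison only, Python's standard library shows a similar split between an older entity table and the HTML5 named-character list. This says nothing about how `HTML::Entities` is implemented; it merely illustrates how a character can have a named entity in one table and not in another.

```python
import html
from html.entities import codepoint2name, html5

FULL_BLOCK = "\u2588"  # U+2588 FULL BLOCK

# codepoint2name covers the HTML 4.01 entities only.
in_html4_table = ord(FULL_BLOCK) in codepoint2name

# html5 maps HTML5 named character references to their characters.
in_html5_table = html5.get("block;") == FULL_BLOCK

# html.unescape resolves names using the HTML5 table.
decoded = html.unescape("&block;")
```

An encoder built on the older table would have to fall back to a numeric reference for this character, while one built on the HTML5 list could emit the name.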
-
HtmlAgilityPack is used for all HTML parsing. It would be nice to provide different implementations. For example, using Gumbo when performance is necessary or using CsQuery when we want to deal with c…