kovidgoyal/html5-parser
Fast C-based HTML 5 parsing for Python
Apache License 2.0
678 stars, 33 forks
issues
Support for streaming/accessing only the `<head>` and skipping `<body>` parsing
#31
jdb8
closed
8 months ago
1
add compatibility with libxml2 2.12
#30
eli-schwartz
closed
10 months ago
0
handle_in_foreign_content whatwg spec causes fragment parsing loop infinitely when a mathml integration point exists at key points in the fragment
#29
kevinhendricks
closed
1 year ago
2
setup.py assumes MSVC on Windows
#28
us88
opened
2 years ago
1
Break tag presentation
#27
tonal
closed
3 years ago
1
Empty page for incorrect page
#26
tonal
closed
3 years ago
1
Support for fragment parsing
#25
whalebot-helmsman
closed
3 years ago
0
lxml.html support
#24
whalebot-helmsman
closed
3 years ago
0
Parsed result as an HTML tree
#23
whalebot-helmsman
closed
3 years ago
3
docs: fix simple typo, resuing -> reusing
#22
timgates42
closed
3 years ago
0
test
#21
kovidgoyal
closed
4 years ago
0
bs4 4.8.0 causes html5-parser to break
#20
eli-schwartz
closed
5 years ago
4
Clean up doctype handling in Beautiful Soup
#19
Mr0grog
closed
5 years ago
7
Doctype gets mangled with treebuilder='soup' and return_root=False
#18
Mr0grog
closed
5 years ago
1
double free or corruption when parsing "<html><html />" in xhtml mode
#17
ivan
closed
5 years ago
0
request for clarification or setup.py rework
#16
romain-dartigues
closed
6 years ago
1
Fragment parsing
#15
xmo-odoo
closed
6 years ago
3
UnicodeDecodeError when parsing a (supposedly) UTF-7 encoded page
#14
VinDuv
closed
6 years ago
2
etree.tostring(method='html') unnecessarily escapes all non-ASCII characters
#13
zackw
closed
6 years ago
1
as-libxml.c: Set encoding of xmlDoc to UTF-8
#12
Balletie
closed
6 years ago
2
as_libxml.c: Set xmlDoc to be of "HTML" type to fix HTML-specific code in libxml2
#11
Balletie
closed
6 years ago
6
broken html if empty title
#10
thestick613
closed
6 years ago
2
Any options to use on AWS Lambda?
#9
franchb
closed
6 years ago
2
BeautifulSoup treebuilder not bringing in element classes properly?
#8
bradbeattie
closed
6 years ago
3
Non-ASCII contents are escaped when serializing with et.tostring(method='html')
#7
ciscorn
closed
6 years ago
5
'soup' treebuilder adds 'xmlns' prefix to 'xmlns' attribute on inline svg element
#6
jpark3000
closed
7 years ago
1
Sort input file list
#5
bmwiedemann
closed
7 years ago
0
Compile error
#4
stoecker
closed
7 years ago
3
PyPI: Fail to run tests due to missing file
#3
eli-schwartz
closed
7 years ago
0
Unresolved Externals when installing on windows
#2
WhoWouldaThunk
closed
4 years ago
2
Benchmarks comparing to other parsers
#1
alanhamlett
closed
7 years ago
9