scrapinghub/extruct
Extract embedded metadata from HTML markup
BSD 3-Clause "New" or "Revised" License
846 stars
113 forks
issues
#237 Add a minimal action I/O implementation to improve SearchAction support; Gallaecio opened 3 months ago; 0 comments
#236 Fix the Build Status badge; wRAR opened 5 months ago; 0 comments
#235 Release notes for 0.17.0.; wRAR closed 5 months ago; 0 comments
#234 Replace lxml[html-clean] with lxml + lxml-html-clean, fix the deps in setup.py; wRAR closed 5 months ago; 0 comments
#233 Update tool versions; wRAR closed 5 months ago; 0 comments
#232 Installing with lxml-5.2.1 ImportError: cannot import name '_ElementStringResult' from 'lxml.etree'; svoss closed 5 months ago; 3 comments
#231 Latest release on PyPi (0.16.0) breaks with lxml>5.1.0: import extruct throws ImportError: cannot import name '_ElementStringResult'; mfhepp closed 6 months ago; 1 comment
#230 cannot import name '_ElementStringResult' from 'lxml.etree; williambarberjr closed 6 months ago; 2 comments
#229 Add dependabot configuration for GitHub Actions; FriedrichFroebel closed 5 months ago; 1 comment
#228 feat: Add dependabot for github actions; Rotzbua opened 6 months ago; 3 comments
#227 chore: remove python 2 import; Rotzbua closed 6 months ago; 0 comments
#226 feat: add `pyupgrade` to `pre-commit`; Rotzbua closed 5 months ago; 0 comments
#225 feat: use `yield from` syntax; Rotzbua closed 6 months ago; 0 comments
#224 fix: remove compatibility imports; Rotzbua closed 6 months ago; 0 comments
#223 fix: remove py2 utf8 encoding; Rotzbua closed 6 months ago; 0 comments
#222 feat: remove dependency `six`; Rotzbua closed 6 months ago; 0 comments
#221 chore: Remove Python 2 specific code; Rotzbua closed 5 months ago; 0 comments
#220 fix: typo; Rotzbua closed 6 months ago; 1 comment
#219 fix: remove python 3.7; Rotzbua closed 6 months ago; 1 comment
#218 feat: add python 3.12; Rotzbua closed 6 months ago; 0 comments
#217 fix: Add Support for lxml >= 5.2.0; michael-genson closed 6 months ago; 2 comments
#216 ImportError: cannot import name '_ElementStringResult' from 'lxml.etree'; tonal closed 6 months ago; 1 comment
#215 Package breaking due to change in lxml; marcosfelt closed 6 months ago; 2 comments
#214 Fix SyntaxWarning in #213; marillat closed 6 months ago; 1 comment
#213 SyntaxWarning invalid escape sequence '\s'; marillat closed 6 months ago; 0 comments
#212 # Unicode exception bodge/fix; dconnx opened 9 months ago; 0 comments
#211 Unable to get meta tag value from inside body; samibelal0 opened 12 months ago; 0 comments
#210 Selectolax benchmarks; westurner opened 1 year ago; 0 comments
#209 Consider switching from lxml's clean_html for enhanced security (and possibly performance); frenzymadness opened 1 year ago; 7 comments
#208 " in application/ld+json gives exception; bodanius opened 1 year ago; 0 comments
#207 lxml.etree.ParserError: Document is empty; lironesamoun opened 1 year ago; 5 comments
#206 Allow receiving an external tree, drop python 3.7; croqaz closed 1 year ago; 4 comments
#205 Bump isort to v5.12.0; serhii73 closed 1 year ago; 0 comments
#204 Should not Depends on python3 (<< 3.7); marillat opened 1 year ago; 6 comments
#203 add mypy typing; sbdchd closed 1 year ago; 2 comments
#202 create .git-blame-ignore-revs to contain commits due to black and isort; BurnzZ closed 1 year ago; 0 comments
#201 add `pre-commit` setup with black and sort; sbdchd closed 1 year ago; 2 comments
#200 cut old python versions from ci; sbdchd closed 2 years ago; 0 comments
#199 [suggestion] adding type hints?; sbdchd opened 2 years ago; 7 comments
#198 error extracting json-ld for validated json; rmizrahigit opened 2 years ago; 0 comments
#197 changed the opengraph meta data extraction to incorporate the html body.; frostrot opened 2 years ago; 1 comment
#196 Added twitter card functionality; blackhat-7 opened 2 years ago; 1 comment
#195 Solves issue #171; AbhinavSE opened 2 years ago; 1 comment
#194 LD+JSON outside HTML element; bar24 opened 2 years ago; 1 comment
#193 Very slow extraction for specific string; Schwankenson opened 2 years ago; 6 comments
#192 Some websites put meta tags outside the head.; paul-rchds opened 2 years ago; 2 comments
#191 Extruct not matching up with Schema.org structured data testing tool (Incorrect image Urls); dconnx opened 2 years ago; 3 comments
#190 Corrected typo in dublincore.py; susca closed 2 years ago; 1 comment
#189 Ignore invalid jsonld elements on the page source.; naveen17797 opened 2 years ago; 3 comments
#188 Removed rdflib-jsonld as a dependency (from GH-182); lopuhin closed 2 years ago; 1 comment