adbar/htmldate
Fast and robust date extraction from web pages, with Python or on the command-line
https://htmldate.readthedocs.io
Apache License 2.0
117 stars
26 forks
issues
extractors: more precise regex components | #106 | adbar | closed | 8 months ago | 1
extraction: better precision & shorter code | #105 | adbar | closed | 8 months ago | 1
replace lxml.html.Cleaner | #104 | adbar | closed | 8 months ago | 1
extractors: stricter patterns | #103 | adbar | closed | 8 months ago | 1
prepare v1.5.2 and update setup | #102 | adbar | closed | 9 months ago | 1
fix: None in try_date_expr() | #101 | adbar | closed | 9 months ago | 1
fix: missing month keys | #100 | adbar | closed | 9 months ago | 1
Feature: Add Portuguese month names | #99 | danielbichuetti | closed | 9 months ago | 4
try_date_expr validation error | #98 | arcombe012 | closed | 9 months ago | 1
prepare v1.5.1 | #97 | adbar | closed | 10 months ago | 1
fix regression for fast extraction in e8b3538 | #96 | adbar | closed | 10 months ago | 1
setup fix: make backports-datetime-fromisoformat optional | #95 | adbar | closed | 10 months ago | 1
build(deps): bump lxml from 4.9.2 to 4.9.3 | #93 | dependabot[bot] | closed | 10 months ago | 2
eval: extend and update | #92 | adbar | closed | 10 months ago | 1
Error installing trafilatura on playwright focal image | #94 | jaekunchoi | closed | 10 months ago | 2
Consider switching from lxml's clean_html for enhanced security (and possibly performance) | #91 | frenzymadness | closed | 8 months ago | 1
docs: add RTD configuration file | #90 | adbar | closed | 10 months ago | 0
prepare v1.5.0 | #89 | adbar | closed | 10 months ago | 1
use fromisoformat backport for Python versions < 3.11 | #88 | adbar | closed | 10 months ago | 1
extractors: better discard regexes | #87 | adbar | closed | 10 months ago | 1
maintenance: simplify code structure | #86 | adbar | closed | 10 months ago | 1
Sourcery refactored master branch | #85 | sourcery-ai[bot] | closed | 10 months ago | 0
update setup & pin LXML for macOS | #84 | adbar | closed | 10 months ago | 1
HTML parsing: adjust fallback heuristic | #83 | adbar | closed | 10 months ago | 1
build(deps): bump goose3 from 3.1.12 to 3.1.17 | #82 | dependabot[bot] | closed | 11 months ago | 2
build(deps): bump goose3 from 3.1.12 to 3.1.16 | #81 | dependabot[bot] | closed | 1 year ago | 2
build(deps): bump goose3 from 3.1.12 to 3.1.15 | #80 | dependabot[bot] | closed | 1 year ago | 1
prepare v1.4.3 | #79 | adbar | closed | 1 year ago | 0
update urllib3 and setup | #78 | adbar | closed | 1 year ago | 1
build(deps): bump goose3 from 3.1.12 to 3.1.14 | #77 | dependabot[bot] | closed | 1 year ago | 1
build(deps): bump news-please from 1.5.22 to 1.5.33 | #76 | dependabot[bot] | closed | 1 year ago | 1
build(deps): update urllib3 requirement from <2,>=1.26 to >=1.26,<3 | #75 | dependabot[bot] | closed | 1 year ago | 1
Support min_date/max_date as datetimes or datetime strings | #74 | kernc | closed | 1 year ago | 8
Add date attributes to HTML extraction | #73 | kernc | closed | 1 year ago | 3
build(deps): bump goose3 from 3.1.12 to 3.1.13 | #72 | dependabot[bot] | closed | 1 year ago | 1
Sourcery refactored master branch | #71 | sourcery-ai[bot] | closed | 1 year ago | 1
LATEST_POSSIBLE max date can become outdated | #70 | rolisz | closed | 1 year ago | 2
Sourcery refactored master branch | #69 | sourcery-ai[bot] | closed | 1 year ago | 1
Add CodeQL workflow for GitHub code scanning | #68 | lgtm-com[bot] | closed | 1 year ago | 0
`find_date` doesn't extract `%D %b %Y` formatted dates in free text | #67 | k-sareen | closed | 1 year ago | 7
feature: supports delaying url date extraction | #66 | getorca | closed | 1 year ago | 9
Compatibility with Python 3.11 | #65 | adbar | closed | 1 year ago | 0
build(deps): bump goose3 from 3.1.11 to 3.1.12 | #64 | dependabot[bot] | closed | 1 year ago | 2
Sourcery refactored master branch | #63 | sourcery-ai[bot] | closed | 1 year ago | 2
Parsing fails for older dates | #62 | adbar | closed | 1 year ago | 0
build(deps): bump htmldate from 1.2.1 to 1.3.0 | #61 | dependabot[bot] | closed | 1 year ago | 1
Sourcery refactored master branch | #60 | sourcery-ai[bot] | closed | 1 year ago | 1
build(deps): bump htmldate from 1.2.1 to 1.2.3 | #59 | dependabot[bot] | closed | 2 years ago | 1
build(deps): bump tabulate from 0.8.9 to 0.8.10 | #58 | dependabot[bot] | closed | 2 years ago | 1
memory: handling of `lru_cache` | #57 | adbar | closed | 1 year ago | 2