alan-turing-institute/ReadabiliPy
A simple HTML content extractor in Python. Can be run as a wrapper for Mozilla's Readability.js package or in pure-python mode.
MIT License
216 stars
35 forks
issues
fix issue-109 (use unique temp files for input/output of ExtractArticle.js)
#110
erpic
opened
1 week ago
1
simple_json_from_html_string(...) with use_readability=True is not multi-thread/process safe
#109
erpic
opened
1 week ago
4
node.js javascript runtime
#108
cmdcam
opened
3 months ago
1
PyPI release v0.2.0 is not the latest and has a thread-safety bug in its tempfile usage. Could a newer version be published?
#107
SunLnx
opened
5 months ago
0
Pass the url option to the JSDOM constructor to get images and relative links fixed
#106
facundoolano
opened
1 year ago
1
Issue#100 Fixed UnicodeEncodeError: 'charmap' codec can't encode char
#105
hanzalajamash
closed
1 year ago
0
Set up linting on GHA and fix existing linter issues
#104
nelson-liu
closed
1 year ago
0
Move away from deprecated `setup.py install`, setup GHA
#103
nelson-liu
closed
1 year ago
9
Purpose of Node.js
#102
swetepete
opened
2 years ago
0
How to update to the newest version of Mozilla's Readability.js?
#101
ducnguyenphanhoai
closed
1 year ago
1
Error: UnicodeEncodeError: 'charmap' codec can't encode character '\u2010' in position 164211: character maps to <undefined>
#100
ducnguyenphanhoai
opened
2 years ago
8
Quiet execution of ExtractArticle.js
#99
lodrantl
closed
1 week ago
1
thread & process safe text extraction from html string
#98
InzamamAnwar
closed
1 year ago
0
ReadabiliPy from multiple threads
#97
econaxis
closed
1 year ago
3
Extra entries with full text in plain_text list
#96
malicialab
opened
3 years ago
1
Feature: Import readability.js from npm
#95
GjjvdBurg
closed
1 year ago
1
Solves bug regarding change of working directory
#94
giovannigarifo
closed
1 year ago
1
How to allow extracting YouTube videos or <iframe> tags?
#93
cayolblake
opened
3 years ago
5
Bug in extracted images sources returning a base64
#92
cayolblake
opened
3 years ago
13
Improve the check for Node
#91
GjjvdBurg
closed
3 years ago
8
Fix 0% output from coveralls
#90
jemrobinson
closed
3 years ago
1
Fix coverage test
#89
jemrobinson
closed
3 years ago
1
Improve/generate documentation
#88
jemrobinson
opened
3 years ago
0
Bump lodash from 4.17.15 to 4.17.19
#87
dependabot[bot]
closed
3 years ago
0
Packaging
#86
GjjvdBurg
closed
3 years ago
2
Bump minimist from 1.2.0 to 1.2.3
#85
dependabot[bot]
closed
3 years ago
0
Bump acorn from 6.0.4 to 6.4.1
#84
dependabot[bot]
closed
3 years ago
0
Nested blocks break parser
#83
jemrobinson
closed
5 years ago
0
Deal with nested blocks
#82
jemrobinson
closed
5 years ago
2
ReadabiliPy vs Readability.js
#81
kjoshi
opened
5 years ago
2
Fix element replacement issue.
#80
jemrobinson
closed
5 years ago
1
Cannot replace an element with its contents
#79
jemrobinson
closed
5 years ago
0
Fix breitbart issue
#78
jemrobinson
closed
5 years ago
2
Crash when interpreting article from breitbart
#77
jemrobinson
closed
5 years ago
0
Go for the next highest scoring date when the first is not isoformat
#76
edwardchalstrey1
closed
5 years ago
1
add 2 extra date xpaths
#75
edwardchalstrey1
closed
5 years ago
1
add extra supported iso date format
#74
edwardchalstrey1
closed
5 years ago
1
Simplify benchmarking with containers folder
#73
edwardchalstrey1
closed
5 years ago
1
Fix empty pages
#72
jemrobinson
closed
5 years ago
1
Fix return value for empty pages
#71
jemrobinson
closed
5 years ago
0
Add support for isoformat dates with microseconds
#70
edwardchalstrey1
closed
5 years ago
0
Trigger coveralls upload
#69
jemrobinson
closed
5 years ago
0
Updated date extraction logic
#68
jemrobinson
closed
5 years ago
0
Fix potential issue in date extraction
#67
jemrobinson
closed
5 years ago
0
Added coveralls support
#66
jemrobinson
closed
5 years ago
0
Add test coverage badge
#65
jemrobinson
closed
5 years ago
0
Date extraction fix(es)
#64
edwardchalstrey1
closed
5 years ago
1
Make date extraction more robust
#63
jemrobinson
closed
5 years ago
1
Add benchmarking
#62
edwardchalstrey1
closed
5 years ago
0
Add benchmarking
#61
edwardchalstrey1
closed
5 years ago
0