bencabrera
/
grawitas
Grawitas is a lightweight, fast parser for Wikipedia talk pages that takes raw Wikipedia syntax and outputs the structured content in various formats.
MIT License
7
stars
5
forks
issues
Update README.md
#21
sen-choi
opened
2 months ago
0
Split by date
#20
lewishamulton
closed
2 years ago
0
Fixes infinite loop error in cli_crawler
#19
lewishamulton
opened
2 years ago
0
Underscore ('_') in page titles for cli_crawler
#18
lewishamulton
opened
2 years ago
0
Unable to install on Ubuntu 20.04
#17
santoshbs
opened
3 years ago
0
Double quotes in user names
#16
bjoernross
opened
4 years ago
0
Improve documentation for Ubuntu 18.04
#15
bonartm
opened
5 years ago
0
Crawler failing with <unspecified file>(1): expected value
#14
jrymart
closed
6 years ago
3
Parsing user talk pages
#13
TevenLeScao
opened
6 years ago
7
How to pass the extracted talk file to grawitas_cli_core
#12
AiliAili
closed
6 years ago
2
db-to-csv comment list output
#11
TevenLeScao
closed
6 years ago
2
Underscores and letter case in user names
#10
bjoernross
closed
6 years ago
0
Multilingual support
#9
matanox
opened
6 years ago
7
Windows 10 64x error after Crawler Component (GML)
#8
davidkhano
closed
6 years ago
3
error: Failure "Es konnte keine SSL-Kontextstruktur erzeugt werden<>"
#7
julbre
closed
6 years ago
1
Add a kind of sanity check for the input of the crawler component
#6
bencabrera
closed
6 years ago
2
CLI Crawler does currently ignore the input regarding output format and only outputs COMMENT_LIST_JSON format
#5
bencabrera
closed
7 years ago
0
Large parts of Talk:Capitalism seem not to be parsed at all
#4
bencabrera
closed
7 years ago
0
Whitespaces in usernames are lost
#3
bencabrera
closed
7 years ago
0
Necessary changes to make code compile
#2
bjoernross
closed
7 years ago
1
Make the xml parser stop if all talk pages in the article list were parsed already
#1
bencabrera
closed
7 years ago
0