issues
tatuylonen/wikitextprocessor
Python package for Wikimedia dump processing (Wiktionary, Wikipedia, etc.): Wikitext parsing, template expansion, and Lua module execution, for data extraction, bulk syntax checking, error detection, and offline formatting.
Other
93 stars, 23 forks
issues
#323 [nl] Change 'ref' HTML 'category' to '*' (kristian-clausal, opened 9 hours ago, 0 comments)
#322 Bump crate-ci/typos from 1.24.1 to 1.25.0 (dependabot[bot], opened 11 hours ago, 0 comments)
#321 Allow "div" tag inside "ref" tag (xxyzz, closed 12 hours ago, 2 comments)
#320 Implement "int" parser function (xxyzz, closed 14 hours ago, 2 comments)
#319 Only parse section node if the token is at the beginning of a line (xxyzz, closed 4 days ago, 1 comment)
#318 Allow `Wtp.analyze_templates()` to accept a function parameter (xxyzz, closed 4 days ago, 0 comments)
#317 Change to Python 3.10 (kristian-clausal, closed 4 days ago, 1 comment)
#316 Proposal: move analyze template code to wiktextract extractor code (xxyzz, closed 4 days ago, 1 comment)
#315 Handle includeonly elements (kristian-clausal, closed 5 days ago, 2 comments)
#314 Newlines inside `includeonly` are expanded on our side, but not in wikitext. (kristian-clausal, closed 5 days ago, 8 comments)
#313 Expansion goes into an infinite loop with a certain template (kristian-clausal, closed 6 days ago, 2 comments)
#312 Update Korean and Dutch Wiktionary namespace JSON files (xxyzz, closed 1 week ago, 0 comments)
#311 Don't let section start token break unclosed parent node (xxyzz, closed 1 week ago, 0 comments)
#310 Section node in template parameter shouldn't break parsing template node (xxyzz, closed 1 week ago, 6 comments)
#309 Return empty string for `{{ns:0}}` and `{{ns:}}` (xxyzz, closed 2 weeks ago, 1 comment)
#308 Remove unused Chinese Wiktionary analyze template code (xxyzz, closed 2 weeks ago, 0 comments)
#307 Update simple English Wiktionary namespace file (xxyzz, closed 2 weeks ago, 0 comments)
#306 Node to wikitext (kristian-clausal, closed 3 weeks ago, 0 comments)
#305 Support upper case `#default` branch in `#switch` parser function (xxyzz, closed 3 weeks ago, 0 comments)
#304 Only parse external link as text if `<nowiki/>` directly after `[` (xxyzz, closed 3 weeks ago, 0 comments)
#303 Implement `mw.loadJsonData()` (xxyzz, closed 1 month ago, 0 comments)
#302 Don't add empty argument to parser functions in `to_wikitext()` (xxyzz, closed 1 month ago, 1 comment)
#301 Error: bad argument #1 for 'gsub' (string is not UTF-8) (LeMoussel, closed 1 month ago, 52 comments)
#300 Bump crate-ci/typos from 1.23.1 to 1.24.1 (dependabot[bot], closed 1 month ago, 1 comment)
#299 HTML tags in template arguments are parsed as plain text (xxyzz, closed 1 month ago, 9 comments)
#298 `{{43e}}` not expanded (LeMoussel, closed 1 month ago, 2 comments)
#297 Call `Wtp.preprocess_text()` again after calling `Wtp.expand()` in `Wtp.parse()` (xxyzz, closed 2 months ago, 7 comments)
#296 Update Polish Wiktionary namespace file (xxyzz, closed 2 months ago, 0 comments)
#295 Bump crate-ci/typos from 1.22.3 to 1.23.1 (dependabot[bot], closed 3 months ago, 1 comment)
#294 Mypystuff (kristian-clausal, closed 3 months ago, 0 comments)
#293 Set empty string as Lua title object `fragment` field default value (xxyzz, closed 3 months ago, 0 comments)
#292 Use connection context manager in `get_entity_data()` (xxyzz, closed 3 months ago, 0 comments)
#291 Only parse `----` as horizontal rule if it's at the start of line (xxyzz, closed 3 months ago, 1 comment)
#290 Implement `#rel2abs` parser function (xxyzz, closed 3 months ago, 1 comment)
#289 Bump crate-ci/typos from 1.21.0 to 1.22.3 (dependabot[bot], closed 3 months ago, 1 comment)
#288 Update XML dump file namespace version (xxyzz, closed 4 months ago, 9 comments)
#287 Remove upper case substitution modifiers (xxyzz, closed 4 months ago, 1 comment)
#286 Update el edition namespace data (xxyzz, closed 4 months ago, 0 comments)
#285 Change external links `[...]` regex (kristian-clausal, closed 4 months ago, 2 comments)
#284 Update some namespace file and Scribunto git submodule (xxyzz, closed 4 months ago, 0 comments)
#283 Update example usage code (xxyzz, closed 4 months ago, 0 comments)
#282 assert error at src/parse.py ln 2287 (kylefoley76, closed 4 months ago, 2 comments)
#281 Don't add too many template args debug message for empty args (xxyzz, closed 5 months ago, 0 comments)
#280 Bump crate-ci/typos from 1.20.4 to 1.21.0 (dependabot[bot], closed 5 months ago, 1 comment)
#279 Use bz2 Python library if `lbzcat` and `bzcat` are not installed (xxyzz, closed 5 months ago, 0 comments)
#278 Fix `dl` HTML tags can't have other HTML children bug (xxyzz, closed 5 months ago, 0 comments)
#277 Remove the limit of unnamed template argument number (xxyzz, closed 5 months ago, 2 comments)
#276 Unescape "*" to "*" in `mw.uri.anchorEncode()` (xxyzz, closed 5 months ago, 0 comments)
#275 Add GH issue and Wiktionary links to `test_italics_in_table_header` (xxyzz, closed 5 months ago, 0 comments)
#274 Fix an issue with TOKEN_RE (kristian-clausal, closed 5 months ago, 0 comments)