appledora/mwparserfromhtml
An unofficial mirror of our `mwparserfromhtml` repository. `mwparserfromhtml` is a Python library for working with the HTML dumps. Since this is only a mirror, do not open pull requests here.
https://pypi.org/project/mwparserfromhtml/
MIT License
4 stars, 0 forks
issues
Some change
#63
istiakshihab-office
closed
2 years ago
0
Allow raw article html strings to be passed without all the additional metadata in the dump
#46
appledora
opened
2 years ago
0
Make sure change to external links is not breaking
#45
appledora
opened
2 years ago
0
Add logging to indicate mismatch between HTML spec version and html dumps version
#44
appledora
opened
2 years ago
0
Ensure clear connection between HTML nodes and plaintext
#43
appledora
opened
2 years ago
0
Contribution Guideline and Tutorial Notebook
#42
appledora
opened
2 years ago
3
Handle inline transclusion differently in plaintext extraction
#41
appledora
opened
2 years ago
0
Split plaintext by sections and paragraphs
#40
appledora
opened
2 years ago
0
feature: metadata extraction - [merged]
#62
appledora
closed
2 years ago
19
additional metadata from json
#39
appledora
closed
2 years ago
1
Initial template for packaging - [merged]
#61
appledora
closed
2 years ago
6
Reduce down requirements.txt
#38
appledora
closed
2 years ago
1
Resolve "Create Documentation" - [merged]
#60
appledora
closed
2 years ago
57
feature: extract image audio and video media - [merged]
#59
appledora
closed
2 years ago
22
Create Documentation
#37
appledora
closed
2 years ago
1
feature: added namespace attribute to Wikilink instances, language attribute... - [merged]
#58
appledora
closed
2 years ago
11
Resolve "add functions to extract plaintexts to library" - [merged]
#57
appledora
closed
2 years ago
17
Resolve "add functions to extract plaintexts to library" - [closed]
#56
appledora
closed
2 years ago
3
Choose a license
#36
appledora
closed
2 years ago
1
Resolve "add function to extract references to library" - [merged]
#55
appledora
closed
2 years ago
11
determine how to identify hidden categories
#35
appledora
opened
2 years ago
0
reduce redundancy in testing module
#34
appledora
opened
2 years ago
0
discuss python packaging
#33
appledora
closed
2 years ago
6
utils function to identify the element type of html string
#32
appledora
closed
2 years ago
0
Write test for dump module
#31
appledora
opened
2 years ago
0
add functions to extract plaintexts to library
#30
appledora
closed
2 years ago
4
add functions to extract parents to library
#29
appledora
opened
2 years ago
0
add functions to extract ancestors to library
#28
appledora
opened
2 years ago
0
add functions to extract tables to library
#27
appledora
opened
2 years ago
0
add function to extract references to library
#26
appledora
closed
2 years ago
1
add function to extract media to library
#25
appledora
closed
2 years ago
1
write test for existing extraction method - [merged]
#54
appledora
closed
2 years ago
21
pretty print article information
#24
appledora
opened
2 years ago
0
write test for template extraction method
#23
appledora
closed
2 years ago
0
write test for header extraction method
#22
appledora
closed
2 years ago
0
write test for comment extraction method
#21
appledora
closed
2 years ago
0
Add namespace attribute to Wikilink objects
#20
appledora
closed
2 years ago
1
write test for section extraction method
#19
appledora
closed
2 years ago
1
write test for wikilinks extraction method
#18
appledora
closed
2 years ago
0
feature: template extraction method - [merged]
#53
appledora
closed
2 years ago
47
Consider removing specific versions from requirements.txt file
#17
appledora
closed
2 years ago
0
add static namespace list and utility for generating it to help with namespace... - [merged]
#52
appledora
closed
2 years ago
3
Write test for external links extraction method
#16
appledora
closed
2 years ago
0
Write test for category extraction method
#15
appledora
closed
2 years ago
0
Add tests to CI pipeline
#14
appledora
closed
2 years ago
3
feature: extract external links - [merged]
#51
appledora
closed
2 years ago
59
write function to create a hierarchy tree of the HTML tags
#13
appledora
opened
2 years ago
0
add function to extract templates to library
#12
appledora
closed
2 years ago
2
add function to extract external links to library
#11
appledora
closed
2 years ago
1
feature: extract categories and normalize category links - [merged]
#50
appledora
closed
2 years ago
44