antoine-tran/boilerpipe
Automatically exported from code.google.com/p/boilerpipe
issues
#56
GoogleCodeExporter
opened
9 years ago
0
Can not parse NYtimes pages
#55
GoogleCodeExporter
opened
9 years ago
2
Web api codes?
#54
GoogleCodeExporter
opened
9 years ago
0
Incorrect characters in Extractor output
#53
GoogleCodeExporter
opened
9 years ago
4
Please push 1.2 to maven central
#52
GoogleCodeExporter
opened
9 years ago
0
No tag in svn for 1.2?
#51
GoogleCodeExporter
opened
9 years ago
0
StackOverflowError when page includes another <body> part in <noframes>
#50
GoogleCodeExporter
opened
9 years ago
2
Article Image
#49
GoogleCodeExporter
opened
9 years ago
0
hybrid extractor?
#48
GoogleCodeExporter
opened
9 years ago
0
Errors deploying to Android
#47
GoogleCodeExporter
opened
9 years ago
0
Library does not produce same results as http://boilerpipe-web.appspot.com/
#46
GoogleCodeExporter
opened
9 years ago
5
Ignore FORM tags in HTMLHighlighter
#45
GoogleCodeExporter
closed
9 years ago
1
Ignore FORM tags in HTMLHighlighter
#44
GoogleCodeExporter
opened
9 years ago
3
DocumentTitleMatchClassifier should include the « and • characters
#43
GoogleCodeExporter
opened
9 years ago
0
Patch for /trunk/boilerpipe-core/src/main/de/l3s/boilerpipe/filters/heuristics/DocumentTitleMatchClassifier.java
#42
GoogleCodeExporter
closed
9 years ago
1
Title detection: Treat non-breaking space as whitespace
#41
GoogleCodeExporter
closed
9 years ago
6
Patch for /trunk/boilerpipe-core/src/main/de/l3s/boilerpipe/sax/DefaultTagActionMap.java
#40
GoogleCodeExporter
closed
9 years ago
1
Patch for /trunk/boilerpipe-core/src/main/de/l3s/boilerpipe/sax/CommonTagActions.java
#39
GoogleCodeExporter
closed
9 years ago
1
Patch for /trunk/boilerpipe-core/src/main/de/l3s/boilerpipe/sax/BoilerpipeHTMLContentHandler.java
#38
GoogleCodeExporter
closed
9 years ago
2
timeout and fallback strategy for boilerpipe
#37
GoogleCodeExporter
closed
9 years ago
6
ImageExtractor doesn't detect alternative images for Object plugins
#36
GoogleCodeExporter
closed
9 years ago
1
word counting code does not account for & being special html symbol.
#35
GoogleCodeExporter
closed
9 years ago
2
Add 'getInstance' accessor for ImageExtractor
#34
GoogleCodeExporter
closed
9 years ago
2
Bad xml format in html output from Web API
#33
GoogleCodeExporter
opened
9 years ago
1
Documentation - How to output html extract fragment instead of text?
#32
GoogleCodeExporter
closed
9 years ago
4
Support HTML5 elements
#31
GoogleCodeExporter
opened
9 years ago
2
Outputs html instead of plain text for certain urls
#30
GoogleCodeExporter
closed
9 years ago
2
boilerpipe crash
#29
GoogleCodeExporter
closed
9 years ago
1
UTF characters are not handled correctly
#28
GoogleCodeExporter
closed
9 years ago
3
Add 1.2.0 release to maven repository
#27
GoogleCodeExporter
closed
9 years ago
1
Tags missing in output html
#26
GoogleCodeExporter
closed
9 years ago
4
Feature Request - api to return character offsets of non-boilerplate text
#25
GoogleCodeExporter
closed
9 years ago
3
Boilerpipe fails (but not web api edition)
#24
GoogleCodeExporter
closed
9 years ago
4
Encoding problem (input is interpreted as Latin-1)
#23
GoogleCodeExporter
closed
9 years ago
2
Page not being parsed correctly <li> the issue.
#22
GoogleCodeExporter
closed
9 years ago
9
Included nekohtml 1.9.9 missing LostText class
#21
GoogleCodeExporter
closed
9 years ago
2
Feature request: Run boilerpipe as a command line tool
#20
GoogleCodeExporter
opened
9 years ago
3
Code for Google app-engine?
#19
GoogleCodeExporter
opened
9 years ago
8
Description of different extractors?
#18
GoogleCodeExporter
closed
9 years ago
3
Precursory header tags missing
#17
GoogleCodeExporter
closed
9 years ago
3
Better support for non-english pages
#16
GoogleCodeExporter
opened
9 years ago
3
Title empty when parsing with TagSoup
#15
GoogleCodeExporter
opened
9 years ago
0
boilerpipe-web: Charset encoding problem
#14
GoogleCodeExporter
closed
9 years ago
3
Missing Maven dependency
#13
GoogleCodeExporter
opened
9 years ago
11
Possible improvement to TerminatingBlocksFinder
#12
GoogleCodeExporter
closed
9 years ago
1
Unconventional operator used for boolean logic
#11
GoogleCodeExporter
closed
9 years ago
3
Links on boilerpipe homepage are broken
#10
GoogleCodeExporter
closed
9 years ago
1
Add clone method to TextBlock
#9
GoogleCodeExporter
closed
9 years ago
2
Can you fix or promote the bug fix of NekoHTML (#2909310) ?
#8
GoogleCodeExporter
closed
9 years ago
2
Exclude Script tags
#7
GoogleCodeExporter
closed
9 years ago
3